Researchers at Stanford University have developed FramePack, a new AI framework that enables high-quality video generation on gaming GPUs with as little as 6GB of VRAM. The breakthrough makes AI video generation far more accessible, removing the need for expensive subscriptions or high-powered servers. Lvmin Zhang and Maneesh Agrawala introduced FramePack as a practical implementation of video diffusion that uses a fixed-length temporal context for more efficient processing.
This allows for the generation of longer and higher-quality videos. A 13-billion parameter model built using the FramePack architecture can produce a 60-second clip at 30 FPS using only a mid-range GPU. Traditional video diffusion models rely on previously generated frames to predict the next one.
As the video length increases, so does the “temporal context”—the number of past frames the model must consider—resulting in higher memory demands. Most models require 12GB of VRAM or more to run efficiently. FramePack compresses input frames based on importance into a fixed-length context, keeping the memory footprint compact and consistent regardless of video duration.
This innovation allows the model to process thousands of frames even with large architectures on laptop-grade GPUs.
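The fixed-length idea can be sketched in a few lines. The snippet below is illustrative only: the base token count and the halve-per-age rule are assumptions for the sake of the sketch, not FramePack's actual compression schedule. The point it demonstrates is the article's core claim: recent frames get a full token budget, older frames get geometrically less, so the total context stays essentially flat no matter how long the video runs.

```python
# Illustrative sketch (not FramePack's real code): keep a near-constant
# context size by compressing older frames more aggressively than recent ones.
def context_token_counts(num_past_frames, base_tokens=1536):
    """Assign each past frame a token count that halves with age
    (age 0 = most recent frame), with a floor of one token."""
    counts = []
    for age in range(num_past_frames):
        counts.append(max(base_tokens >> age, 1))  # halve per step of age
    return counts

# The halved series sums to just under 2x the base budget, plus one token
# per very old frame, so memory use barely grows with video length.
for n in (4, 16, 64):
    print(n, sum(context_token_counts(n)))
```

With this kind of schedule, going from 16 past frames to 64 adds only a few dozen tokens of context, which is why memory demands no longer scale with clip duration.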
It also enables training with batch sizes comparable to those used in image diffusion models.
FramePack also addresses drifting—a common issue where video quality degrades over time. By using intelligent compression patterns and scheduling techniques, FramePack helps maintain visual consistency from beginning to end. The model includes a user-friendly GUI.
Users can upload images, enter text prompts, and watch a live preview as frames are generated. On an RTX 4090, optimized generation speeds reach up to 0.6 frames per second. Performance is lower on less powerful GPUs, but even an RTX 3060 can run the model.
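Those numbers imply a concrete wall-clock cost for the 60-second example clip mentioned earlier. A quick back-of-envelope check, assuming the quoted 0.6 fps figure holds for the whole run:

```python
# Back-of-envelope render time for the article's example clip.
clip_seconds = 60      # 60-second clip (from the article)
fps = 30               # playback frame rate
gen_rate = 0.6         # optimized generation speed on an RTX 4090, frames/sec

total_frames = clip_seconds * fps          # frames to generate
minutes = total_frames / gen_rate / 60     # wall-clock render time
print(total_frames, minutes)               # 1800 frames, 50.0 minutes
```

So a minute of 30 FPS video takes roughly 50 minutes to render on a 4090 at the optimized speed, and proportionally longer on weaker cards.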
Currently, FramePack supports Nvidia’s RTX 30, 40, and the new 50 series GPUs, provided they support FP16 or BF16 data formats. There’s no confirmed support yet for AMD or Intel GPUs, but the model works across multiple operating systems, including Linux. This breakthrough in AI video generation has the potential to revolutionize the way content creators and casual users produce videos.
With FramePack, creating high-quality AI-generated videos is now more accessible than ever before.
Image Credits: Photo by Jason Leung on Unsplash
Rashan is a seasoned technology journalist and visionary leader serving as the Editor-in-Chief of DevX.com, a leading online publication focused on software development, programming languages, and emerging technologies. With his deep expertise in the tech industry and his passion for empowering developers, Rashan has transformed DevX.com into a vibrant hub of knowledge and innovation. Reach out to Rashan at [email protected]