FastVideo: a unified post-training and real-time inference framework for accelerated video generation

What it solves

FastVideo addresses the high computational cost and slow generation speeds associated with state-of-the-art video generation models. It provides a unified framework to accelerate both the post-training (fine-tuning and distillation) and the real-time inference of video Diffusion Transformers (DiTs).

How it works

FastVideo employs several optimization techniques to reduce latency and increase throughput:

Post-Training Optimizations: It supports full and LoRA fine-tuning, as well as Distribution Matching Distillation (DMD2) and sparse distillation, which speed up denoising by more than 50x.
Attention Mechanisms: It implements specialized attention backends, including Video Sparse Attention (VSA) and Sliding Tile Attention, to reduce the cost of attention over video tokens.
Inference Scaling: The framework uses sequence parallelism for distributed inference across multiple GPUs and supports a range of GPUs (H100, A100, 4090) and operating systems.
Real-time Streaming: Through its Dreamverse platform, it enables "vibe directing," allowing users to stream and edit video in real time.
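
To give a feel for why tile-local attention is cheaper than full attention, here is a minimal sketch, not FastVideo's kernels or API. It assumes video tokens are grouped into tiles on a (time, height, width) grid and that each query tile attends only to key tiles inside a fixed window, truncated at the borders. The function names, grid size, and window size are hypothetical.

```python
from itertools import product


def tile_window_pairs(grid, window):
    """Return the (query_tile, key_tile) pairs kept by a local tile window.

    grid   -- number of tiles along (time, height, width)
    window -- odd window size, in tiles, along each axis (assumed values)
    """
    tiles = list(product(*(range(n) for n in grid)))
    half = tuple(w // 2 for w in window)
    kept = set()
    for q in tiles:
        for k in tiles:
            # Keep the pair only if the key tile lies inside the window
            # centered on the query tile along every axis.
            if all(abs(qi - ki) <= h for qi, ki, h in zip(q, k, half)):
                kept.add((q, k))
    return kept


def kept_fraction(grid, window):
    """Fraction of all tile pairs that the window keeps (1.0 = full attention)."""
    n_tiles = 1
    for n in grid:
        n_tiles *= n
    return len(tile_window_pairs(grid, window)) / (n_tiles * n_tiles)


if __name__ == "__main__":
    # A 4x4x4 tile grid with a 3x3x3 window keeps 1000 of 4096 tile pairs.
    print(kept_fraction((4, 4, 4), (3, 3, 3)))  # 0.244140625
```

In this toy setting the window keeps about a quarter of the tile pairs. The saving grows as the grid gets larger relative to the window, which is the regime long videos fall into.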

Who it’s for

This framework is designed for AI researchers and developers building high-performance video generation applications who need to reduce inference latency or train/distill specialized video models.

Highlights

Massive Speedups: Generates 5 seconds of video in 1.8 seconds end-to-end using FastWan-QAD.
Comprehensive Tooling: Includes a full data preprocessing pipeline for video, image, and text.
Scalable Training: Supports FSDP2, sequence parallelism, and selective activation checkpointing.
Real-time Interface: Includes Dreamverse, a web UI for real-time video generation and editing.
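
Sequence parallelism, listed among the scaling features above, splits the token sequence of a single video across GPUs so that each device holds only a slice. The following is a minimal sketch of the partitioning step alone, assuming contiguous, near-equal shards. The function name is hypothetical and this is not FastVideo's API. The cross-GPU communication that attention needs between shards is omitted.

```python
def shard_sequence(num_tokens, world_size):
    """Split token indices [0, num_tokens) into one contiguous shard per rank.

    Returns a list of (start, end) half-open ranges, one per rank. When the
    sequence does not divide evenly, the first ranks get one extra token.
    """
    base, extra = divmod(num_tokens, world_size)
    shards = []
    start = 0
    for rank in range(world_size):
        length = base + (1 if rank < extra else 0)
        shards.append((start, start + length))
        start += length
    return shards


if __name__ == "__main__":
    # 10 tokens over 4 ranks: shard lengths 3, 3, 2, 2.
    print(shard_sequence(10, 4))  # [(0, 3), (3, 6), (6, 8), (8, 10)]
```

Each rank then stores activations only for its own range, which is what lets longer sequences fit in memory across several GPUs.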
