HunyuanVideo-1.5: a lightweight 8.3B parameter video generation model for high-quality synthesis on consumer GPUs
What it solves
HunyuanVideo-1.5 is a lightweight video generation model that aims to deliver high-quality video synthesis while lowering the computational barrier for developers and creators. It targets consumer-grade GPUs, so high-resolution video can be generated without data-center-class hardware.
How it works
The model combines an 8.3-billion-parameter Diffusion Transformer (DiT) with a 3D causal VAE that compresses video along both the spatial and temporal axes. A Selective and Sliding Tile Attention (SSTA) mechanism prunes redundant spatiotemporal key-value blocks to accelerate inference. A separate video super-resolution (VSR) network then upscales outputs to 1080p. The model supports both text-to-video (T2V) and image-to-video (I2V) generation, and step-distilled variants are available for faster generation.
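To make the compression and attention-pruning ideas concrete, here is a minimal sketch in plain Python. It is not the project's code. The compression factors (4x temporal, 16x spatial), the "first frame encoded alone" causal convention, the example clip size, and the helper names `latent_shape`, `sliding_tile_mask`, and `kept_fraction` are all assumptions made for illustration. The mask is a 1D toy that covers only the sliding-window part of SSTA, not the selective part.

```python
import math


def latent_shape(frames, height, width, t_factor=4, s_factor=16):
    """Latent (T, H, W) grid produced by a causal 3D VAE.

    Assumption: the first frame is encoded on its own and the remaining
    frames are grouped into chunks of `t_factor`, a common convention for
    causal video VAEs. The default factors are illustrative, not confirmed.
    """
    if frames < 1:
        raise ValueError("frames must be >= 1")
    t = 1 + math.ceil((frames - 1) / t_factor)
    h = math.ceil(height / s_factor)
    w = math.ceil(width / s_factor)
    return t, h, w


def sliding_tile_mask(n_tiles, window):
    """Block mask: query tile i attends to key tile j iff |i - j| <= window."""
    return [[abs(i - j) <= window for j in range(n_tiles)]
            for i in range(n_tiles)]


def kept_fraction(mask):
    """Fraction of tile pairs that still need an attention computation."""
    total = sum(len(row) for row in mask)
    kept = sum(sum(row) for row in mask)
    return kept / total


# Example clip size chosen for illustration: 121 frames at 1280x720.
shape = latent_shape(121, 720, 1280)
mask = sliding_tile_mask(n_tiles=4, window=1)
print(shape, kept_fraction(mask))
```

Under these assumptions the DiT operates on a far smaller grid than the pixel video, and a local tile window skips a share of the attention work that grows as the number of tiles increases relative to the window.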
Who it’s for
It is intended for developers, AI researchers, and digital creators who want to generate high-quality video on accessible hardware (minimum 14GB of GPU memory), as well as anyone integrating video generation into their own applications through tools such as ComfyUI or Diffusers.
Highlights
- Consumer-Grade Accessibility: Runs on NVIDIA GPUs with as little as 14GB of VRAM using model offloading.
- High-Performance Architecture: Uses SSTA to achieve significant speedups in 720p video synthesis.
- Flexible Generation: Supports both Text-to-Video and Image-to-Video workflows across various resolutions.
- Advanced Optimizations: Includes support for FP8 GEMM, cache inference (DeepCache, TeaCache, TaylorCache), and step-distilled models for rapid generation.
- Super-Resolution: Integrates a few-step super-resolution network that upscales videos to 1080p for improved sharpness and texture.
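A quick back-of-the-envelope calculation shows why model offloading matters for the 14GB figure above. This is a rough sketch that counts DiT weights only. The storage precisions are assumptions (the source does not state how weights are stored), and activations, the VAE, and the text encoder are ignored. The helper name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(n_params, bytes_per_param):
    """Approximate weight footprint in GB (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


PARAMS = 8.3e9   # parameter count stated for the DiT
VRAM_GB = 14     # minimum GPU memory stated above

# Assumed storage precisions: 2 bytes (BF16/FP16) and 1 byte (FP8).
bf16_gb = weight_memory_gb(PARAMS, 2)
fp8_gb = weight_memory_gb(PARAMS, 1)

print(f"16-bit weights: {bf16_gb:.1f} GB, fits in {VRAM_GB} GB: {bf16_gb <= VRAM_GB}")
print(f"8-bit weights:  {fp8_gb:.1f} GB, fits in {VRAM_GB} GB: {fp8_gb <= VRAM_GB}")
```

At 16 bits per parameter the weights alone come to about 16.6 GB, which is more than 14 GB, so keeping part of the model off the GPU until it is needed is one way to fit. Note that FP8 GEMM support concerns how matrix multiplications are computed and does not by itself imply that all weights are stored at one byte each.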
Sources
- Tencent-Hunyuan/HunyuanVideo-1.5