ComfyUI-LTXVideo: custom ComfyUI nodes for advanced LTX-2 video generation and audio synthesis

What it solves

This project provides custom nodes and workflows for ComfyUI that extend the core ComfyUI LTX-2 integration. It adds specialized tools for high-dynamic-range (HDR) video, precise control over motion and structure, and audio generation.

How it works

It installs as a ComfyUI plugin and provides nodes that interface with LTX-2 and a set of specialized LoRAs (Low-Rank Adaptation weights). These LoRAs adapt the model to specific tasks such as lip-syncing, spatial upscaling, or following depth and edge maps. The project also includes a dedicated audio-only mode that uses LTX-2's joint audio/video transformer architecture to generate sound from text.
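As background on what a LoRA contributes, here is a minimal sketch of the generic low-rank update, W' = W + (alpha / r) * B * A, in plain Python. It illustrates the general LoRA technique only; it is not taken from this project, and the function names (`matmul`, `lora_merge`) and the toy matrices are hypothetical. How ComfyUI or these nodes actually load and apply LoRA weights is not shown here.

```python
# Generic LoRA merge sketch (illustrative only, not the project's code).
# W is (d_out x d_in), B is (d_out x r), A is (r x d_in), with r much smaller
# than d_out and d_in in practice.

def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    inner = len(y)
    cols = len(y[0])
    return [
        [sum(row[k] * y[k][j] for k in range(inner)) for j in range(cols)]
        for row in x
    ]


def lora_merge(w, b_up, a_down, alpha):
    """Return W + (alpha / r) * (B @ A), where r is the LoRA rank."""
    rank = len(a_down)          # number of rows in A equals the rank r
    scale = alpha / rank
    delta = matmul(b_up, a_down)
    return [
        [w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]


# Toy example: a 2x2 base weight with a rank-1 adapter.
base = [[1.0, 0.0], [0.0, 1.0]]
b_up = [[1.0], [0.0]]           # d_out x r
a_down = [[0.0, 2.0]]           # r x d_in
merged = lora_merge(base, b_up, a_down, alpha=1.0)
print(merged)
```

Because only the small matrices B and A are stored, a task-specific adapter is far smaller than the base model, which is why one base model can be paired with several LoRAs for different tasks.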

Who it’s for

Content creators, AI video artists, and developers using ComfyUI who want advanced control over LTX-2 video generation, including professional-grade HDR output and generative upscaling.

Highlights

  • Union IC-LoRA: A single unified LoRA that handles both depth and edge (canny) control conditions simultaneously.
  • HDR Video: Support for linear HDR output encoded in ARRI LogC3 with EXR export capabilities.
  • Lipdub: A specialized LoRA for multilingual dubbing and rephrasing speech while preserving speaker identity.
  • Generative Upscaling: Pixel Spatial Upscaler LoRAs that synthesize new fine details at 2x or 4x resolution rather than simply interpolating existing pixels.
  • Text-to-Audio: An audio-only mode that generates audio from text prompts.
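To make the HDR item above more concrete, the following is a rough sketch of the ARRI LogC3 transfer curve that maps scene-linear values to a log-encoded signal and back. The constants are ARRI's published LogC3 parameters for EI 800; whether this project's nodes use this exposure index or this exact formulation is an assumption, and the function names are hypothetical.

```python
import math

# ARRI LogC3 (EI 800) curve parameters, as published by ARRI.
# Assumption: EI 800 is used here for illustration; the project may differ.
CUT = 0.010591
A = 5.555556
B = 0.052272
C = 0.247190
D = 0.385537
E = 5.367655
F = 0.092809


def logc3_encode(linear):
    """Map a scene-linear value to a LogC3-encoded signal."""
    if linear > CUT:
        return C * math.log10(A * linear + B) + D
    return E * linear + F       # linear segment near black


def logc3_decode(signal):
    """Invert logc3_encode: map a LogC3 signal back to scene-linear."""
    if signal > E * CUT + F:
        return (10.0 ** ((signal - D) / C) - B) / A
    return (signal - F) / E


# 18% gray lands at roughly 0.391 in LogC3 (EI 800).
print(round(logc3_encode(0.18), 3))
```

A log encoding like this lets highlights well above 1.0 in linear light fit into a bounded signal range, and decoding it back to linear is what makes a floating-point format such as EXR a natural export target.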
