kimodo: a kinematic motion diffusion model for controllable 3D human and robot motion generation

What it solves

Kimodo addresses the difficulty of generating high-quality, controllable 3D motion for humans and robots. Users create complex animations by combining natural-language descriptions with precise kinematic constraints, which bridges the gap between high-level text prompts and low-level control.

How it works

Kimodo is a kinematic motion diffusion model trained on 700 hours of optical motion capture data. Given a text prompt and a set of constraints, it generates 3D motion for several skeletons (SOMA, Unitree G1, SMPL-X). Constraints can include full-body pose keyframes, end-effector positions (hands and feet), and 2D paths or waypoints on the ground plane.
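To make the three constraint types concrete, here is a minimal sketch of how a request combining a text prompt with sparse constraints could be represented. It is not Kimodo's API: the class name MotionRequest, its fields, the joint name "right_hand", and the choice of two plain floats for a ground-plane point are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]   # a 3D position
Vec2 = Tuple[float, float]          # a point on the ground plane


@dataclass
class MotionRequest:
    """Hypothetical container for a prompt plus constraints (not Kimodo's API)."""
    prompt: str
    skeleton: str            # e.g. "SOMA", "G1", or "SMPL-X"
    num_frames: int
    # frame index -> {joint name -> position}: full-body pose keyframes
    pose_keyframes: Dict[int, Dict[str, Vec3]] = field(default_factory=dict)
    # frame index -> {effector name -> position}: hand/foot targets
    end_effectors: Dict[int, Dict[str, Vec3]] = field(default_factory=dict)
    # frame index -> ground-plane point: 2D path or waypoints for the root
    root_waypoints: Dict[int, Vec2] = field(default_factory=dict)

    def constrained_frames(self) -> List[int]:
        """Return every frame index that carries at least one constraint."""
        frames = (set(self.pose_keyframes)
                  | set(self.end_effectors)
                  | set(self.root_waypoints))
        return sorted(frames)

    def validate(self) -> None:
        """Raise ValueError if a constraint targets a frame outside the clip."""
        for frame in self.constrained_frames():
            if not 0 <= frame < self.num_frames:
                raise ValueError(
                    f"frame {frame} is outside [0, {self.num_frames})")


# Example: a 60-frame clip with one hand target and two root waypoints.
request = MotionRequest(
    prompt="walk forward, then wave with the right hand",
    skeleton="G1",
    num_frames=60,
    end_effectors={30: {"right_hand": (0.3, 1.4, 0.2)}},
    root_waypoints={0: (0.0, 0.0), 59: (0.0, 2.0)},
)
request.validate()
print(request.constrained_frames())  # [0, 30, 59]
```

Keying each constraint by frame index keeps the specification sparse: frames without an entry are left entirely to the model and the text prompt.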

Who it’s for

This tool is designed for researchers and developers working in robotics, computer animation, and physical AI, particularly those who need to generate synthetic motion data for training physics-based policies or to create high-fidelity 3D animations.

Highlights

  • Multi-Skeleton Support: Works with SOMA, Unitree G1, and SMPL-X skeletons.
  • Interactive Authoring: Includes a web-based demo with a timeline editor for mixing text prompts and kinematic controls.
  • Robotics Integration: Compatible with MuJoCo for visualization and ProtoMotions for training physics-based policies.
  • Comprehensive Benchmark: Provides a standardized evaluation pipeline and test suite to measure motion quality and constraint following.
  • Flexible Control: Supports a variety of constraints including end-effector control and 2D root trajectories.
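As an illustration of what "constraint following" can mean in practice, the sketch below computes the mean Euclidean distance between generated joint positions and their constrained targets. This is an assumed, simplified metric and not the benchmark's actual definition; the function name mean_constraint_error and the per-frame dictionary layout are hypothetical.

```python
import math
from typing import Dict, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def mean_constraint_error(
    generated: Sequence[Dict[str, Vec3]],
    targets: Dict[int, Dict[str, Vec3]],
) -> float:
    """Mean distance between constrained joints and the generated motion.

    generated: one {joint name -> position} dict per frame.
    targets:   frame index -> {joint name -> target position}.
    Both layouts are assumptions for this sketch.
    """
    errors = []
    for frame, joints in targets.items():
        for name, target in joints.items():
            # Euclidean distance between the generated and target positions.
            errors.append(math.dist(generated[frame][name], target))
    if not errors:
        raise ValueError("no constraints to evaluate")
    return sum(errors) / len(errors)


# Two-frame toy motion: the hand hits its first target and misses the second.
motion = [
    {"right_hand": (0.0, 0.0, 0.0)},
    {"right_hand": (0.0, 0.0, 0.0)},
]
hand_targets = {
    0: {"right_hand": (0.0, 0.0, 0.0)},
    1: {"right_hand": (3.0, 4.0, 0.0)},
}
error = mean_constraint_error(motion, hand_targets)
print(error)  # 2.5, in the same length unit as the positions
```

A lower value means the generated motion tracks its constraints more closely. A full evaluation would also need separate measures of motion quality, which this sketch does not attempt.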
