leptonai: a Python library and CLI for managing and operating AI workloads on NVIDIA DGX Cloud Lepton

What it solves

leptonai provides a single interface to the NVIDIA DGX Cloud Lepton platform. It lets you deploy, manage, and call AI workloads (endpoints, batch jobs, and clusters) directly from Python or from a command-line interface (CLI).

How it works

The project consists of a Python library and the lep CLI tool. The CLI creates and manages resources such as endpoints, dev pods, and Ray/Slurm clusters. The Python client reads a deployed endpoint's OpenAPI schema and exposes each endpoint path as a method, so calling an endpoint looks like calling a native Python function. The project also includes "skills" that let AI agents (such as Claude Code or Codex) operate the platform through natural-language commands.
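The schema-driven client idea can be illustrated with a minimal, self-contained sketch. This is not leptonai's implementation or API: SchemaClient, fake_send, and the example schema are hypothetical names invented for illustration, and the transport is a stub rather than a real HTTP call. It only shows how paths listed in an OpenAPI schema could be turned into callable attributes.

```python
from typing import Any, Callable, Dict


class SchemaClient:
    """Toy client (hypothetical) exposing each POST path in an OpenAPI schema as a method."""

    def __init__(self, schema: Dict[str, Any], send: Callable[[str, Dict[str, Any]], Any]):
        self._send = send
        self._paths: Dict[str, str] = {}
        for path, operations in schema.get("paths", {}).items():
            if "post" not in operations:
                continue  # this sketch only maps POST operations
            # "/run" -> "run", "/v1/generate-text" -> "v1_generate_text"
            name = path.strip("/").replace("/", "_").replace("-", "_")
            if name.isidentifier():
                self._paths[name] = path

    def __getattr__(self, name: str) -> Callable[..., Any]:
        # Called only when normal attribute lookup fails.
        paths = self.__dict__.get("_paths", {})
        if name not in paths:
            raise AttributeError(name)
        path = paths[name]

        def call(**kwargs: Any) -> Any:
            # Keyword arguments become the request payload for the mapped path.
            return self._send(path, kwargs)

        return call


def fake_send(path: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Stub transport (assumption): echoes the request instead of sending it."""
    return {"path": path, "payload": payload}


schema = {
    "paths": {
        "/run": {"post": {}},
        "/v1/generate-text": {"post": {}},
        "/healthz": {"get": {}},
    }
}

client = SchemaClient(schema, fake_send)
result = client.run(prompt="hello")
```

Resolving names in `__getattr__` keeps the class definition independent of any particular endpoint: the available methods are decided entirely by the schema passed in at construction time.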

Who it’s for

Developers and AI engineers who use the NVIDIA DGX Cloud Lepton platform to deploy and scale AI models and workloads.

Highlights

Unified CLI: A single lep command for managing endpoints, batch jobs, dev pods, and fine-tuning jobs.
Dynamic Client: A Python client that maps endpoint paths to methods automatically based on OpenAPI schemas.
Agentic Integration: Built-in skills for AI agents to manage workloads via natural language.
Cloud-Native Configuration: Pythonic configuration specs that can be shipped directly to the cloud.
