Kiln: a local-first AI development workbench for prompt optimization, evaluations, and agent orchestration
What it solves
Kiln provides a single workbench for the AI development lifecycle, so teams do not have to switch between separate tools for prompt engineering, evaluation, RAG, and fine-tuning. It targets regressions in AI products, where improving one part of a prompt or upgrading a model accidentally breaks other behaviors, by tracking quality across multiple dimensions against a single dataset.
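To make the regression idea concrete, here is a minimal sketch in plain Python. It does not use Kiln's library; the function name `find_regressions`, the `tolerance` parameter, and the score data are all hypothetical and chosen only for illustration. It compares per-dimension average scores for a baseline and a candidate configuration evaluated on the same dataset, and reports any dimension that got worse.

```python
from statistics import mean


def find_regressions(baseline, candidate, tolerance=0.02):
    """Return dimensions whose mean score dropped by more than `tolerance`.

    `baseline` and `candidate` map a quality dimension to a list of
    per-example scores, both measured on the same dataset.
    (Hypothetical helper, not part of any Kiln API.)
    """
    regressions = {}
    for dimension, base_scores in baseline.items():
        base_avg = mean(base_scores)
        cand_avg = mean(candidate[dimension])
        if base_avg - cand_avg > tolerance:
            regressions[dimension] = (base_avg, cand_avg)
    return regressions


# Made-up scores (0 to 1) for the same four dataset examples.
baseline = {"accuracy": [1.0, 1.0, 0.0, 1.0], "tone": [1.0, 1.0, 1.0, 1.0]}
candidate = {"accuracy": [1.0, 1.0, 1.0, 1.0], "tone": [1.0, 0.0, 1.0, 1.0]}

print(find_regressions(baseline, candidate))  # → {'tone': (1.0, 0.75)}
```

In this made-up data the candidate improves accuracy but degrades tone, which is the kind of trade-off that a single-metric check would miss.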
How it works
Kiln combines a desktop application for non-technical collaborators (PMs, QA, and subject matter experts) with an MIT-licensed Python library for engineers. It operates on a local-first basis, allowing users to bring their own API keys or run models entirely offline via Ollama. The system syncs to Git for team collaboration. Users define a task once and then apply different optimization techniques (such as automatic prompt mutation, RAG integration, or fine-tuning) against the same dataset.
Who it’s for
It is designed for AI product teams: engineers who need a production-ready library, data scientists working in notebooks, and non-technical stakeholders who need to rate outputs and add training data without writing code.
Highlights
- Auto-Optimize: Automatically searches through prompt mutations and model selections to find the best configuration for a specific task.
- Eval Builder: Quickly generates synthetic evaluation datasets and judges to align AI outputs with user preferences.
- Multi-Agent Orchestration: Supports the composition of multi-agent hierarchies where each agent runs in its own focused context window.
- Zero-Code Fine-Tuning: Enables fine-tuning across 60+ models on providers like Fireworks, Together, and Vertex without writing code.
- Local-First Privacy: Runs on the user's machine, ensuring data control and Git-native synchronization.
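As a rough illustration of the Auto-Optimize highlight above, the sketch below runs an exhaustive search over prompt and model combinations and keeps the highest-scoring one. The source does not describe Kiln's actual search strategy, so this plain grid search is only a stand-in. The names `search_configs` and `toy_score`, the model names, and the scoring rule are all hypothetical; in practice the score would come from running an evaluation against the task's dataset.

```python
from itertools import product


def search_configs(prompts, models, score_fn):
    """Score every (prompt, model) pair and return the best pair and its score.

    Hypothetical helper: a plain exhaustive search, not Kiln's algorithm.
    """
    best_config, best_score = None, float("-inf")
    for prompt, model in product(prompts, models):
        score = score_fn(prompt, model)
        if score > best_score:
            best_config, best_score = (prompt, model), score
    return best_config, best_score


def toy_score(prompt, model):
    """Made-up scoring rule standing in for a real evaluation run."""
    prompt_bonus = 2 if "step by step" in prompt else 0
    model_bonus = {"small-model": 1, "large-model": 3}[model]
    return prompt_bonus + model_bonus


prompts = ["Answer the question.", "Answer the question step by step."]
models = ["small-model", "large-model"]

best_config, best_score = search_configs(prompts, models, toy_score)
print(best_config, best_score)
```

An exhaustive search is only practical for a handful of candidates; a larger search space would need sampling or an early-stopping rule, which this sketch leaves out.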
Sources
- Kiln-AI/Kiln