SkillOpt: an executive strategy for self-evolving agent skills using a deep-learning-style optimization loop
What it solves
SkillOpt addresses the lack of reproducible, disciplined optimization for AI agent skills. Most agent skills are hand-crafted or generated in a single pass, and they often fail to improve reliably under feedback. SkillOpt treats the skill document itself as the "trainable state" of a frozen model, so skills can evolve and improve without any change to the model's weights.
How it works
SkillOpt runs a training loop over textual skills, borrowing concepts from deep learning. A separate optimizer model analyzes scored rollouts and proposes bounded edits (adding, deleting, or replacing text) to a skill document. A candidate edit is accepted only if it strictly improves a held-out validation score. For stability, the loop uses a textual learning-rate budget, a rejected-edit buffer, and epoch-wise updates. The final result is a compact best_skill.md file that can be used with any target model at inference time with zero additional overhead.
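The accept-only-if-better loop described above can be illustrated with a minimal Python sketch. This is not SkillOpt's actual code or API: the names `Edit`, `apply_edit`, `optimize`, `toy_propose`, `toy_validate`, and `edit_budget` are hypothetical. The optimizer model is replaced by a fixed pool of candidate edits and the held-out validation score by a toy keyword count. The textual learning-rate budget is assumed here to mean a cap on edits tried per epoch.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Edit:
    """One bounded edit to a skill document (hypothetical structure)."""
    op: str         # "add", "delete", or "replace"
    index: int      # line position the edit targets
    text: str = ""  # new text for "add" and "replace"


def apply_edit(skill: List[str], edit: Edit) -> List[str]:
    """Return a new list of lines with the edit applied; the input is not mutated."""
    lines = list(skill)
    if edit.op == "add":
        lines.insert(edit.index, edit.text)
    elif edit.op == "delete":
        del lines[edit.index]
    elif edit.op == "replace":
        lines[edit.index] = edit.text
    else:
        raise ValueError(f"unknown edit op: {edit.op}")
    return lines


def optimize(
    skill: List[str],
    propose: Callable[[List[str], Set[Edit]], Optional[Edit]],
    validate: Callable[[List[str]], float],
    epochs: int = 2,
    edit_budget: int = 2,  # assumed meaning: max edits tried per epoch
) -> Tuple[List[str], float]:
    best, best_score = list(skill), validate(skill)
    rejected: Set[Edit] = set()  # rejected-edit buffer, passed back to the proposer
    for _ in range(epochs):
        for _ in range(edit_budget):
            edit = propose(best, rejected)
            if edit is None:  # proposer has nothing new to offer
                break
            candidate = apply_edit(best, edit)
            score = validate(candidate)
            if score > best_score:  # strict improvement required
                best, best_score = candidate, score
            else:
                rejected.add(edit)
    return best, best_score


# Toy stand-ins for the optimizer model and the held-out validation set.
REQUIRED = ("cite sources", "show units")


def toy_validate(skill: List[str]) -> int:
    """Reward required phrases and lightly penalize length (integer score)."""
    text = "\n".join(skill).lower()
    return 100 * sum(phrase in text for phrase in REQUIRED) - len(skill)


def toy_propose(skill: List[str], rejected: Set[Edit]) -> Optional[Edit]:
    """Offer the first pooled line that is neither present nor already rejected."""
    pool = ["Always cite sources.", "Be friendly.", "Show units in every answer."]
    rejected_texts = {e.text for e in rejected}
    for text in pool:
        if text not in skill and text not in rejected_texts:
            return Edit("add", len(skill), text)
    return None


best_skill, best_score = optimize(["Answer the question."], toy_propose, toy_validate)
print("\n".join(best_skill))
```

In this toy run, the "Be friendly." edit is rejected because it lengthens the skill without raising the score, and the buffer keeps it from being proposed again. In a real setup the resulting lines would be written out as the skill file (for example best_skill.md) and supplied to the target model as-is.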
Who it’s for
This tool is designed for developers and researchers building AI agents (for example, with Claude Code, Codex, or Copilot) who want to optimize agent performance on specific tasks or benchmarks without expensive model fine-tuning.
Highlights
- Weight-Free Optimization: Improves agent performance without touching model weights.
- Zero Inference Overhead: The optimized skill artifact is a simple markdown file used at deployment.
- High Performance: Reports significant accuracy gains across multiple benchmarks and target models (e.g., GPT-5.5).
- Cross-Model Transfer: Optimized skills can transfer across different model scales and execution harnesses.
- Extensible Architecture: Supports multiple backends (OpenAI, Azure, Claude, Qwen, MiniMax) and allows for easy addition of new benchmarks.
Sources
- microsoft/SkillOpt