TuriX-CUA: a computer-use agent that automates desktop GUI actions across any application without requiring specific APIs
What it solves
TuriX is a computer-use agent that automates desktop actions across applications without needing app-specific APIs. Users can "talk to their computer" and have the AI carry out multi-step tasks directly on the GUI, such as booking flights, searching for information and creating documents, or moving data between different software.
How it works
The system uses a Vision Language Model (VLM) as its "brain" to interpret the screen and plan actions. It can be configured with different models (e.g., via Turix API, Ollama, or other providers) through a config.json file. The agent can be extended with "Skills" (markdown playbooks) that provide specific instructions for the planner to follow when executing certain types of tasks. It also supports the Model Context Protocol (MCP) to integrate with other agents like Claude for Desktop.
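The configuration and Skills mechanism described above can be illustrated with a minimal sketch. Everything here is an assumption for illustration, not TuriX's actual schema or code: the `brain` role key, the `provider`/`model`/`base_url` fields, the `keywords:` first line in a skill file, and the helper names `load_model_config`, `select_skills`, and `build_planner_prompt` are all hypothetical. The sketch only shows the general idea that the model is chosen from configuration rather than code, and that matching markdown playbooks are appended to the planner's instructions.

```python
from __future__ import annotations

import json
from dataclasses import dataclass


@dataclass
class ModelConfig:
    """Settings for one model role (hypothetical fields, not TuriX's schema)."""
    provider: str
    model: str
    base_url: str


def load_model_config(config_text: str, role: str = "brain") -> ModelConfig:
    """Parse config.json-style text and return the settings for one role."""
    raw = json.loads(config_text)[role]
    return ModelConfig(
        provider=raw["provider"],
        model=raw["model"],
        base_url=raw.get("base_url", ""),
    )


def select_skills(task: str, skills: dict[str, str]) -> list[str]:
    """Return names of skills whose keyword line shares a word with the task.

    Assumes (hypothetically) that each skill's first line looks like
    'keywords: flight booking travel'.
    """
    task_words = set(task.lower().split())
    chosen = []
    for name, markdown in skills.items():
        lines = markdown.splitlines()
        first_line = lines[0].lower() if lines else ""
        keywords = set(first_line.replace("keywords:", "").split())
        if task_words & keywords:
            chosen.append(name)
    return chosen


def build_planner_prompt(task: str, skills: dict[str, str], chosen: list[str]) -> str:
    """Combine the user's task with the text of the selected playbooks."""
    parts = [f"Task: {task}"]
    for name in chosen:
        parts.append(f"Skill '{name}':\n{skills[name]}")
    return "\n\n".join(parts)


# Illustrative inputs only; the model name is a placeholder.
EXAMPLE_CONFIG = json.dumps({
    "brain": {
        "provider": "ollama",
        "model": "some-vlm",
        "base_url": "http://localhost:11434",
    }
})

EXAMPLE_SKILLS = {
    "book-flight": "keywords: flight booking travel\n1. Open the browser.",
    "spreadsheet": "keywords: excel spreadsheet\n1. Open the file.",
}
```

Under these assumptions, switching to a different VLM means editing only the JSON text passed to `load_model_config`; the selection and prompt-building code stays unchanged, which is the property the "hot-swappable" claim refers to.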
Who it’s for
It is designed for personal and research use, targeting users who want to automate repetitive desktop workflows on macOS, Windows, and Linux.
Highlights
- API-Independent Automation: Operates on the GUI, meaning it can control any application a human can click.
- High Performance: Achieves a 64.2% success rate on the OSWorld benchmark and over 80% on a macOS-specific benchmark.
- Hot-Swappable Models: Allows users to easily change the underlying VLM policy without modifying code.
- Extensible Skills: Uses markdown-based playbooks to guide the agent's planning and execution.
- Cross-Platform Support: Available for macOS, Windows, and Linux.
Sources
- TurixAI/TuriX-CUA