Handy: a privacy-focused offline speech-to-text tool that pastes transcriptions directly into any active text field
What it solves
Handy is a privacy-focused, open-source speech-to-text application that allows users to transcribe spoken words directly into any active text field on their computer. It eliminates the need for cloud-based transcription services, ensuring that audio data remains local and private.
How it works
Users trigger transcription with a configurable keyboard shortcut or in push-to-talk mode. The application runs Voice Activity Detection (VAD) with Silero to filter out silence, then processes the remaining audio with local ML models. It supports multiple model options, including several sizes of Whisper (with GPU acceleration) and the CPU-optimized Parakeet V3 model, which detects the spoken language automatically.
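The flow described above (drop silent audio, then hand the rest to a local model) can be sketched roughly as follows. This is an illustration only, not Handy's code: Handy is a Rust/Tauri application and uses Silero for VAD, whereas this sketch substitutes a simple energy threshold so that it stays self-contained. The names `frame_energy`, `drop_silence`, `transcribe_recording`, and the `transcribe` callback are all hypothetical.

```python
from typing import Callable, List, Sequence

# One short chunk of mono audio samples, each in the range [-1.0, 1.0].
Frame = Sequence[float]


def frame_energy(frame: Frame) -> float:
    """Return the mean squared amplitude of a frame (0.0 if it is empty)."""
    if not frame:
        return 0.0
    return sum(s * s for s in frame) / len(frame)


def drop_silence(frames: List[Frame], threshold: float = 1e-3) -> List[Frame]:
    """Keep only frames whose energy exceeds the threshold.

    Assumption: a fixed energy threshold stands in for a real VAD.
    Silero VAD uses a neural network, not this heuristic.
    """
    return [f for f in frames if frame_energy(f) > threshold]


def transcribe_recording(
    frames: List[Frame],
    transcribe: Callable[[List[float]], str],
    threshold: float = 1e-3,
) -> str:
    """Filter silence, then pass the voiced samples to a local model.

    `transcribe` is a hypothetical callback representing whichever
    local model (for example Whisper or Parakeet) is selected.
    """
    voiced = drop_silence(frames, threshold)
    if not voiced:
        # Nothing but silence was recorded, so skip the model entirely.
        return ""
    samples = [s for frame in voiced for s in frame]
    return transcribe(samples)


if __name__ == "__main__":
    silence = [0.0] * 160
    speech = [0.5, -0.5] * 80
    # A fake model that only reports how much audio it received.
    fake_model = lambda samples: f"<{len(samples)} samples>"
    print(transcribe_recording([silence, speech, silence], fake_model))
```

Filtering before inference means the model never sees leading or trailing silence, which keeps the audio passed to it short. Here the fake model receives only the 160 voiced samples out of the 480 recorded.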
Who it’s for
It is designed for users seeking a free, private, and extensible speech-to-text tool that works entirely offline across Windows, macOS, and Linux.
Highlights
- Fully Local: All processing happens on the user's machine; no audio is sent to the cloud.
- Cross-Platform: Native support for Windows, macOS, and Linux.
- Flexible Model Support: Whisper (Small, Medium, Turbo, Large) and Parakeet V3 models.
- Extensible: Built as a Tauri application (Rust backend, React frontend) and designed to be easily forkable.
- System Integration: Integrates with Raycast on macOS and supports CLI flags for remote control.
Sources
- cjpais/Handy