Handy: a privacy-focused offline speech-to-text tool that pastes transcriptions directly into any active text field

What it solves

Handy is a privacy-focused, open-source speech-to-text application that transcribes speech directly into whichever text field is currently active on the user's computer. Because transcription runs locally, no cloud-based transcription service is needed, and audio data stays local and private.

How it works

Users trigger transcription with a configurable keyboard shortcut or in push-to-talk mode. Handy runs Voice Activity Detection (VAD) with Silero to filter out silence, then transcribes the remaining audio with a local ML model. Model options include Whisper in several sizes (with GPU acceleration) and Parakeet V3, which is CPU-optimized and detects the spoken language automatically.
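
The two-stage flow described above (drop silence, then hand the remaining audio to a local model) can be sketched roughly as follows. This is an illustrative sketch only, not Handy's code: a simple energy threshold stands in for Silero's neural VAD, and the names drop_silence, transcribe_locally, and FRAME_SIZE, the threshold value, and the model callback are all assumptions made for the example.

```python
from typing import Callable, List, Sequence

FRAME_SIZE = 480  # assumed: 30 ms frames at a 16 kHz sample rate


def frame_energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def drop_silence(samples: Sequence[float], threshold: float = 1e-4,
                 frame_size: int = FRAME_SIZE) -> List[float]:
    """Keep only the frames whose energy exceeds the threshold.

    An energy gate is a crude stand-in for a neural VAD such as Silero;
    it is not Silero's algorithm.
    """
    kept: List[float] = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        if frame and frame_energy(frame) > threshold:
            kept.extend(frame)
    return kept


def transcribe_locally(samples: Sequence[float],
                       model: Callable[[List[float]], str]) -> str:
    """Filter silence, then pass the remaining audio to a local model.

    `model` is a hypothetical callback representing Whisper or Parakeet.
    """
    speech = drop_silence(samples)
    if not speech:
        return ""  # nothing but silence: skip the model entirely
    return model(speech)


if __name__ == "__main__":
    silence = [0.0] * 960
    speech = [0.5, -0.5] * 240  # 480 samples of non-silent audio
    fake_model = lambda audio: f"<{len(audio)} samples of speech>"
    print(transcribe_locally(silence + speech + silence, fake_model))
```

Filtering before transcription means the model only sees frames that are likely to contain speech, so a recording that is pure silence never reaches the model.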

Who it’s for

It is designed for users seeking a free, private, and extensible speech-to-text tool that works entirely offline across Windows, macOS, and Linux.

Highlights

Fully Local: All processing happens on the user's machine; no audio is sent to the cloud.
Cross-Platform: Native support for Windows, macOS, and Linux.
Flexible Model Support: Whisper (Small, Medium, Turbo, Large) and Parakeet V3 models.
Extensible: Built as a Tauri application (Rust backend, React frontend) and designed to be easily forkable.
System Integration: Integrates with Raycast on macOS and supports CLI flags for remote control.
