Building an Offline Voice System for macOS

How I built a fully offline text-to-speech and speech-to-text system for macOS using Kokoro-82M and mlx-whisper: no cloud APIs, and audio starts streaming in under a second.

March 22, 2026 · 4 min · Victor Salles

Voice Automation

Offline voice I/O for macOS

A complete text-to-speech and speech-to-text system for macOS that runs entirely on-device. No cloud APIs, no subscriptions: just local AI models doing real work.

The problem: I spend hours reading and writing text on screen. I wanted a way to have my Mac read anything to me with a single hotkey, and transcribe audio without sending data to external servers.

What it does: Press ⌥S and whatever text you've selected (or copied) gets read aloud using Kokoro-82M, an 82-million-parameter TTS model running locally on Apple Silicon. The system automatically detects whether the text is Portuguese or English and picks the right voice. Audio starts streaming in under a second, with no temp files and no waiting for the full synthesis to finish. ...
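To make the Portuguese/English voice selection concrete, here is a minimal sketch of one way it could work. It is not the post's actual detector: the stopword lists, the function names `detect_language` and `pick_voice`, and the two voice IDs are all assumptions made for illustration, and a real setup might use a proper language-identification library instead.

```python
import re

# Hypothetical stopword lists: small sets of very common words that are
# unlikely to appear in the other language. Illustrative only.
PT_WORDS = {"de", "que", "não", "uma", "um", "para", "com", "os", "em",
            "é", "por", "mais", "você", "isso", "está", "são"}
EN_WORDS = {"the", "and", "of", "to", "is", "that", "it", "for", "with",
            "this", "was", "are", "you", "not", "be", "have"}

# Assumed voice identifiers; substitute whatever voices your TTS setup exposes.
VOICES = {"en": "af_heart", "pt": "pf_dora"}


def detect_language(text: str) -> str:
    """Return "pt" or "en" by counting stopword hits (ties default to "en")."""
    # [^\W\d_]+ matches runs of Unicode letters, so accented words stay whole.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    pt_hits = sum(1 for t in tokens if t in PT_WORDS)
    en_hits = sum(1 for t in tokens if t in EN_WORDS)
    return "pt" if pt_hits > en_hits else "en"


def pick_voice(text: str) -> str:
    """Map the detected language to a voice ID."""
    return VOICES[detect_language(text)]
```

A stopword count is cheap enough to run on every hotkey press and needs no extra model, which suits the under-a-second goal; its weakness is very short selections, where few or no stopwords appear and the default language wins.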

March 1, 2026 · 2 min · Victor Salles