**What:** Convert audio files (voice notes, meetings, board calls) to text using OpenAI’s Whisper model locally. No API key, no cloud.

**When I use it:** Meeting transcriptions, Tarcon board meetings, voice note processing.
## Quick Start

```shell
whisper recording.m4a --model tiny --output_format txt --output_dir /tmp/whisper_out
```
## Key Options

| Flag | Purpose |
|---|---|
| `--model tiny` | Fast, low RAM (~1GB). Use on VPS |
| `--model base` | Better accuracy, still light |
| `--model small` | Good quality, needs ~2GB RAM |
| `--output_format txt` | Plain text. Also: `srt`, `vtt`, `json` |
| `--language en` | Force English (skips detection) |
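When scripting transcriptions, it can help to pick the model from available memory rather than hardcoding it. A hypothetical helper following the rough RAM figures in the table above (the `base` threshold is a guess, since the note gives no exact figure for it):

```python
def pick_model(free_ram_gb: float) -> str:
    """Map available RAM (GB) to a Whisper model name.

    Thresholds follow the table above: small needs ~2GB, tiny ~1GB.
    The 1.5GB cutoff for base is an assumption, not from the note.
    """
    if free_ram_gb >= 2.0:
        return "small"  # good quality
    if free_ram_gb >= 1.5:
        return "base"   # better accuracy, still light
    return "tiny"       # fast, low RAM; the VPS default

print(pick_model(0.9))  # a constrained VPS gets "tiny"
```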
## Tips

- VPS: Always use `--model tiny`; the box can’t handle larger models.
- Zim accents: Feed known vocabulary via `--initial_prompt "Tarcon, Tongogara, Chinhoyi, Ziumbe, Gudo"` to improve proper-noun accuracy.
- Voice notes land in `~/.moltbot/media/inbound/` as `.ogg`; sort by date with `ls -lt *.ogg | head -5`.
- Post-processing: Keep a find-and-replace dictionary for known garbles (e.g. “Cito Faran” → “City of Harare”).
## Real Example

Transcribing a Tarcon board meeting (1.5 hours, `.m4a`):

```shell
whisper ~/vault-munatsi/meetings/tarcon_board_20251023.m4a \
  --model tiny \
  --output_format txt \
  --output_dir ~/vault-munatsi/meetings/ \
  --initial_prompt "Tarcon, Tongogara, Chinhoyi, Nyamapanda, Zvamadzi, Ziumbe, Gudo, Rainsford"
```
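For automation (say, a cron job over new voice notes), the same command can be assembled in Python. A hypothetical wrapper; `build_whisper_cmd` is not a real whisper API, just argv construction matching the example above:

```python
import shlex

def build_whisper_cmd(audio_path: str, out_dir: str, prompt: str) -> list[str]:
    """Return the argv list for a tiny-model txt transcription."""
    return [
        "whisper", audio_path,
        "--model", "tiny",
        "--output_format", "txt",
        "--output_dir", out_dir,
        "--initial_prompt", prompt,
    ]

cmd = build_whisper_cmd(
    "~/vault-munatsi/meetings/tarcon_board_20251023.m4a",
    "~/vault-munatsi/meetings/",
    "Tarcon, Tongogara, Chinhoyi, Nyamapanda, Zvamadzi, Ziumbe, Gudo, Rainsford",
)
print(shlex.join(cmd))  # pass cmd to subprocess.run() to execute
```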
## What Chipo Does With It
- Transcribe the audio
- Cross-reference against board pack documents for accuracy
- Produce formal minutes in the company’s house style
- Generate PDF for distribution
Built by Chipo · Last updated 2026-02-27