Skip to content

SOPs

Whisper Model Inventory

openai-whisper caches downloaded model weights at ~/.cache/whisper/ as .pt files. Each model is one file. The whisper CLI and the Python whisper.load_model(name) both read from this directory, so whatever is present is immediately usable — no install step beyond the initial download.

Audited 2026-04-19. Paths are ~/.cache/whisper/<name>.pt on each machine.

ModelSizeMacBook ProMac StudioNotes
tiny72 MBYesNoFastest, low accuracy. Debugging only.
base139 MBYesYesDecent default for English word timestamps.
small461 MBYesYesBetter accuracy, still fast on M-series.
medium1.4 GBYesYesEnglish near-human quality; slower.
large-v3-turbo1.5 GBYesYesWhisper Turbo — fastest large-quality model. Preferred for production transcription.
large-v32.9 GBYesNoHighest accuracy, slowest. Rarely needed over turbo.

Total cache size: MacBook Pro ~6.5 GB, Mac Studio ~3.5 GB.

Released by OpenAI October 2024. Same encoder as large-v3, much smaller decoder trained to match large-v3 output. Roughly 8x faster than large-v3 at near-identical quality for English. This is the right default when the task needs real-quality transcription (word timestamps, long-form audio, noisy environments).

In code: whisper.load_model("large-v3-turbo") or whisper.load_model("turbo") (alias).

Whisper itself is a Python package:

Terminal window
pip show openai-whisper # confirm installed
which whisper # confirm CLI on PATH

MacBook Pro: installed, CLI at ~/.local/bin/whisper. Mac Studio: models are cached but the whisper CLI is not on PATH — Python import-and-use still works if a project imports whisper directly, but the CLI is not wired up. Re-install if CLI usage is needed: pip install --user openai-whisper.

First use of a model auto-downloads it from OpenAI’s CDN to ~/.cache/whisper/. Either trigger it from Python:

import whisper
whisper.load_model("large-v3-turbo") # downloads if missing, otherwise loads from cache

…or via the CLI:

Terminal window
whisper --model large-v3-turbo some-audio.mp3

Download happens once per model per machine. No re-download on subsequent runs.

Models are large binary files. They do NOT belong in a git repo. Two valid ways to get a model onto another machine:

  1. Let the machine download it on first use — simplest, works offline from that point on.
  2. Copy the .pt file over SSH — faster if the other machine already has the bandwidth-heavy download on local network:
    Terminal window
    scp ~/.cache/whisper/large-v3-turbo.pt studio:~/.cache/whisper/

Do not symlink between machines or share the cache dir over cloud storage — the files are binary blobs that openai-whisper loads once into memory, and any lazy-loading cloud sync (Google Drive, iCloud Drive) will add startup latency or corrupt the read.

TaskRecommended
Quick word-boundary detection on English voice notesbase
Real transcription of recorded calls / podcastslarge-v3-turbo
Multi-language audiolarge-v3-turbo or large-v3
Running on the VPS or low-memory machinesmall (fits in 1 GB RAM with headroom)

Rule of thumb: start with base while iterating, upgrade to large-v3-turbo for the final pass. Going straight to large-v3 is rarely worth the 2x slowdown over turbo.

  • audio-slow --words mode uses Whisper for editor-precision word boundaries (~/apps/audio-slow/audio_slow.py).
  • youtube-auto-commenter and tms-ops use Whisper for transcription of captured audio.