Podcast Transcription
Generate transcripts for podcast episodes using local Whisper AI. All processing happens on your device.
Overview
After archiving a podcast episode, Social Archiver can generate a full transcript using OpenAI's Whisper speech recognition model running locally on your computer.
Why Local Processing?
This feature is designed with two considerations in mind:
Privacy: Your audio never leaves your device. All transcription happens locally using open-source tools, ensuring complete privacy for sensitive content.
No API Costs: Unlike cloud-based transcription services that charge per minute of audio, local Whisper is completely free to use once installed.
The trade-off is that you need to install additional tools and transcription speed depends on your computer's performance.
Requirements
You need one of the following speech recognition tools installed:
| Tool | CLI Included | Speed | Auto Model Download |
|---|---|---|---|
| faster-whisper | No (requires wrapper) | Fastest | ✓ Yes |
| openai-whisper | Yes | Moderate | ✓ Yes |
| whisper.cpp | Yes | Fast | ✗ Manual |
Recommendation
- Apple Silicon Mac (M1/M2/M3/M4): Use whisper.cpp for best performance (Metal GPU acceleration)
- Windows: Use openai-whisper for easiest setup (CLI included, no wrapper needed)
- Linux: Use faster-whisper for best CPU performance (auto model download)
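Before installing anything, you can check which of these CLIs are already on your PATH. A quick sanity check (works in bash or zsh; each command prints a path only when the tool is found):

```bash
# Prints the location of each installed tool; missing tools are simply not listed
command -v whisper whisper-cli faster-whisper
```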
Option 1: faster-whisper (Recommended)
CTranslate2-based implementation. Up to 4x faster than openai-whisper with lower memory usage. Models are automatically downloaded on first use.
- Repository: github.com/SYSTRAN/faster-whisper
- Models: Auto-downloaded to `~/.cache/huggingface/`
CLI Wrapper Required
faster-whisper is a Python library without a built-in CLI. You need to install our CLI wrapper script.
Windows Users
For Windows, we recommend using openai-whisper instead (see Option 2 below). It includes a CLI out of the box and requires no additional setup. If you still want to use faster-whisper on Windows, follow the Windows-specific instructions below.
macOS / Linux Installation
Step 1: Install the library
```bash
pip install faster-whisper
```

macOS Homebrew blocks system-wide pip installs (PEP 668). Create a dedicated virtual environment instead:

```bash
# Create venv and install faster-whisper
python3 -m venv ~/.local/share/faster-whisper-venv
~/.local/share/faster-whisper-venv/bin/pip install faster-whisper
```

Step 2: Install CLI wrapper
Download the wrapper script and save it to your PATH:
```bash
# Create bin directory if it doesn't exist
mkdir -p ~/.local/bin

# Download wrapper script
curl -o ~/.local/bin/faster-whisper \
  https://raw.githubusercontent.com/hyungyunlim/obsidian-social-archiver-releases/main/faster-whisper-cli.py

# Make it executable
chmod +x ~/.local/bin/faster-whisper
```

macOS Users: Update the shebang
If you installed faster-whisper in a venv (Step 1), you need to update the script's shebang to use the venv Python:
```bash
# Replace the first line of the script with the venv Python
sed -i '' '1s|.*|#!/Users/'$USER'/.local/share/faster-whisper-venv/bin/python|' ~/.local/bin/faster-whisper
```

Step 3: Add to PATH (if not already)
```bash
# Add to your shell config (~/.zshrc or ~/.bashrc)
export PATH="$HOME/.local/bin:$PATH"

# Reload shell
source ~/.zshrc  # or ~/.bashrc
```

Verify installation:

```bash
faster-whisper --version
```

Windows Installation
Step 1: Install the library
```powershell
pip install faster-whisper
```

Step 2: Install CLI wrapper
Open PowerShell and run:
```powershell
# First, find your Python Scripts path
python -c "import sys; print(sys.prefix + '\\Scripts')"

# Download the wrapper script to that folder (adjust the path below if yours differs)
Invoke-WebRequest -Uri "https://raw.githubusercontent.com/hyungyunlim/obsidian-social-archiver-releases/main/faster-whisper-cli.py" -OutFile "$env:LOCALAPPDATA\Programs\Python\Python311\Scripts\faster-whisper.py"
```

Step 3: Create a batch wrapper
Create a file named `faster-whisper.bat` in the same Scripts folder:

```bat
@echo off
python "%~dp0faster-whisper.py" %*
```

Or run this PowerShell command (note the double quotes, which PowerShell needs to interpret the `` `r`n `` line break):

```powershell
$scriptsPath = python -c "import sys; print(sys.prefix + '\\Scripts')"
Set-Content -Path "$scriptsPath\faster-whisper.bat" -Value "@echo off`r`npython `"%~dp0faster-whisper.py`" %*"
```

Verify installation:
```powershell
faster-whisper --version
```

Option 2: openai-whisper
The original Python implementation by OpenAI. Easy to install, works out of the box. Models are automatically downloaded on first use.
- Repository: github.com/openai/whisper
- Models: Auto-downloaded to `~/.cache/whisper/`
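Once installed (see the steps just below), transcription is a single command. A minimal sketch, where episode.mp3 stands in for any archived audio file:

```bash
# Transcribe with the small model; transcript files (.txt, .srt, .vtt, ...)
# are written to the current directory
whisper episode.mp3 --model small
```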
Installation:
```bash
# Requires Python 3.8+
pip install openai-whisper

# Or with pipx (recommended for isolation)
pipx install openai-whisper
```

Verify installation:

```bash
whisper --help
```

Option 3: whisper.cpp
High-performance C++ implementation. Best performance on Apple Silicon Macs with Metal GPU acceleration.
- Repository: github.com/ggerganov/whisper.cpp
- Models: huggingface.co/ggerganov/whisper.cpp
Manual Model Download Required
whisper.cpp requires you to manually download GGML model files before use.
Installation:
```bash
# macOS (using Homebrew)
brew install whisper-cpp

# Or build from source
git clone https://github.com/ggerganov/whisper.cpp
cd whisper.cpp
cmake -B build && cmake --build build --config Release
```

Download models:
```bash
# Create models directory
mkdir -p ~/whisper-models

# Download small model (recommended, 465MB)
curl -L -o ~/whisper-models/ggml-small.bin \
  https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small.bin

# Other models available:
# ggml-tiny.bin (74MB), ggml-base.bin (142MB)
# ggml-medium.bin (1.5GB), ggml-large-v3.bin (2.9GB)
```

The plugin searches for models in these locations:

- `~/whisper-models/`
- `~/.cache/whisper-cpp/`
- `~/whisper.cpp/models/`
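With the binary and a model in place, you can test transcription from the terminal before using the plugin. A minimal sketch, assuming the small model downloaded above (episode.mp3 is a placeholder; the ffmpeg step converts it to the 16 kHz WAV input whisper.cpp expects):

```bash
# Convert the audio to 16 kHz mono WAV, the format whisper.cpp expects
ffmpeg -i episode.mp3 -ar 16000 -ac 1 episode.wav

# Transcribe using the downloaded model
whisper-cli -m ~/whisper-models/ggml-small.bin -f episode.wav
```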
Verify installation:
```bash
whisper-cli --help
```

Plugin Settings
Configure transcription in Settings → Social Archiver → Transcription Settings:
| Setting | Description |
|---|---|
| Enable Whisper transcription | Toggle transcription feature on/off |
| Preferred Whisper variant | Choose which Whisper implementation to use |
| Preferred model | Select model size (tiny to large) |
| Default language | Auto-detect or specify language |
| Custom Whisper path | Override automatic binary detection |
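For the Custom Whisper path setting, an absolute path to the binary works best. On macOS/Linux you can look it up with `command -v` (the paths in the comments are typical examples, not guaranteed locations):

```bash
command -v whisper-cli      # e.g. /opt/homebrew/bin/whisper-cli (Homebrew on Apple Silicon)
command -v faster-whisper   # e.g. /Users/you/.local/bin/faster-whisper
```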
Choosing a Variant
If you have multiple Whisper variants installed, you can select which one to use:
- Auto-detect: On Apple Silicon Mac, tries whisper.cpp first; on other systems, tries faster-whisper first
- faster-whisper: Fastest CPU transcription and auto model download, but requires the CLI wrapper (see Option 1)
- openai-whisper: Original implementation, easy setup, auto model download
- whisper.cpp: Best for Apple Silicon Macs (Metal GPU acceleration), requires manual model download
Models Are Not Shared
Each variant uses a different model format. Models downloaded for one variant cannot be used by another:
| Variant | Model Format | Storage Location |
|---|---|---|
| faster-whisper | CTranslate2 | ~/.cache/huggingface/ |
| openai-whisper | PyTorch (.pt) | ~/.cache/whisper/ |
| whisper.cpp | GGML (.bin) | Manual location |
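A practical consequence: if you switch variants, the old variant's models stay on disk. To see how much space each store uses (macOS/Linux):

```bash
# Sizes of each variant's model store; directories that don't exist are skipped
du -sh ~/.cache/huggingface ~/.cache/whisper ~/whisper-models 2>/dev/null
```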
How It Works
- Archive a podcast: First, subscribe to a podcast or archive an episode with audio
- Open in Timeline: View the archived episode in the Timeline view
- Click transcribe button: Click the microphone icon on the podcast card
- Wait for processing: Transcription progress is shown in the button
- View transcript: Expandable transcript appears below the audio player
Transcript Features
Interactive Timestamps
Click any timestamp in the transcript to jump to that point in the audio. The current segment is highlighted during playback.
Search (Desktop)
Use the search box to find specific words or phrases in the transcript.
Collapse/Expand
Transcripts start collapsed to save space. Click to expand and view the full content.
Transcript Storage
Transcripts are saved directly in your markdown files under a ## Transcript section:
```markdown
## Transcript
[00:00] Welcome to the podcast...
[00:15] Today we're discussing...
```

Metadata is stored in the YAML frontmatter:

```yaml
transcriptionModel: small
transcriptionLanguage: en
transcriptionDuration: 1847.5
transcriptionTime: 2024-12-12T05:30:00.000Z
```

Model Selection
Choose a model based on your needs:
| Model | Size | Speed | Accuracy | Best For |
|---|---|---|---|---|
| tiny | 74MB | ~32x | Low | Quick previews |
| base | 142MB | ~16x | Fair | Short clips |
| small | 466MB | ~6x | Good | Default, balanced |
| medium | 1.5GB | ~2x | High | Important content |
| large | 2.9GB | ~1x | Best | Maximum accuracy |
Speed is relative to audio length (e.g., ~6x means 10 minutes of audio takes roughly 1.7 minutes to transcribe).
Recommendation
Start with the small model. It offers a good balance of speed and accuracy for most podcasts.
Troubleshooting
Transcribe Button Doesn't Appear
Cause: No Whisper tool detected in system PATH
Solution:
- Verify installation with `whisper --version` or `faster-whisper --version`
- Restart Obsidian after installing tools
- Check that the tool is in your system PATH
Transcription Fails or Times Out
Common causes:
- Insufficient memory for the selected model
- Corrupted audio file
- Very long audio (2+ hours)
Solutions:
- Try a smaller model (e.g., `tiny` or `base`)
- Verify the audio file plays correctly
- Check available disk space and memory
Poor Transcription Quality
Common causes:
- Background noise in audio
- Multiple speakers talking over each other
- Non-standard accents or technical jargon
Solutions:
- Use a larger model (`medium` or `large`)
- Specify the language in settings if auto-detection fails
Slow Transcription
Transcription speed depends on:
- Your CPU/GPU performance
- Selected model size
- Audio length
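To get a concrete number for your machine, time a short clip with each model you are considering. A sketch using the whisper.cpp CLI from Option 3 (sample.wav is a placeholder):

```bash
# Compare wall-clock time across model sizes on the same clip
time whisper-cli -m ~/whisper-models/ggml-tiny.bin -f sample.wav
time whisper-cli -m ~/whisper-models/ggml-small.bin -f sample.wav
```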
Performance Tips
- Apple Silicon Macs: Use `whisper.cpp` for best performance
- NVIDIA GPU: Use `faster-whisper` with CUDA support
- CPU only: Use `tiny` or `base` models for reasonable speed
Language Support
Whisper supports 99+ languages with automatic detection. For best results with non-English content:
- Let auto-detection identify the language first
- If detection fails, manually specify the language in plugin settings
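For reference, forcing a language with the openai-whisper CLI looks like this (a sketch; episode.mp3 is a placeholder and ko is the ISO code for Korean). The plugin's Default language setting plays the same role:

```bash
# Skip auto-detection and transcribe as Korean
whisper episode.mp3 --model small --language ko
```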
Privacy & Storage
Local Processing Only
All transcription happens locally on your device. Audio files are never uploaded to any server. The transcript is stored only in your Obsidian vault.
Disk Space
Transcripts are text-only and very small (typically under 100KB even for long podcasts). They won't significantly impact your vault size.