
Podcast Transcription

Generate transcripts for podcast episodes using local Whisper AI. All processing happens on your device.

Overview

After archiving a podcast episode, Social Archiver can generate a full transcript using OpenAI's Whisper speech recognition model running locally on your computer.

Why Local Processing?

This feature is designed with two considerations in mind:

  1. Privacy: Your audio never leaves your device. All transcription happens locally using open-source tools, ensuring complete privacy for sensitive content.

  2. No API Costs: Unlike cloud-based transcription services that charge per minute of audio, local Whisper is completely free to use once installed.

The trade-off is that you need to install additional tools and transcription speed depends on your computer's performance.

Requirements

You need one of the following speech recognition tools installed:

| Tool | CLI Included | Speed | Auto Model Download |
|------|--------------|-------|---------------------|
| faster-whisper | No (requires wrapper) | Fastest | ✓ Yes |
| openai-whisper | Yes | Moderate | ✓ Yes |
| whisper.cpp | Yes | Fast | ✗ Manual |

Recommendation

  • Apple Silicon Mac (M1/M2/M3/M4): Use whisper.cpp for best performance (Metal GPU acceleration)
  • Windows: Use openai-whisper for easiest setup (CLI included, no wrapper needed)
  • Linux: Use faster-whisper for best CPU performance (auto model download)

Option 1: faster-whisper

CTranslate2-based implementation. Up to 4x faster than openai-whisper with lower memory usage. Models are automatically downloaded on first use.

CLI Wrapper Required

faster-whisper is a Python library without a built-in CLI. You need to install our CLI wrapper script.

Windows Users

For Windows, we recommend using openai-whisper instead (see Option 2 below). It includes a CLI out of the box and requires no additional setup. If you still want to use faster-whisper on Windows, follow the Windows-specific instructions below.

macOS / Linux Installation

Step 1: Install the library

bash
# Linux: install directly with pip
pip install faster-whisper

bash
# macOS: Homebrew blocks system-wide pip installs (PEP 668),
# so create a dedicated virtual environment instead:
python3 -m venv ~/.local/share/faster-whisper-venv
~/.local/share/faster-whisper-venv/bin/pip install faster-whisper
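
Optionally, confirm the library imports before moving on. The venv path below matches the one created in the macOS step; on Linux with a plain pip install, use python3 instead.

bash
# Should print "ok" if the faster-whisper library installed correctly
~/.local/share/faster-whisper-venv/bin/python -c "import faster_whisper; print('ok')"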

Step 2: Install CLI wrapper

Download the wrapper script and save it to a directory on your PATH:

bash
# Create bin directory if it doesn't exist
mkdir -p ~/.local/bin

# Download wrapper script
curl -o ~/.local/bin/faster-whisper \
  https://raw.githubusercontent.com/hyungyunlim/obsidian-social-archiver-releases/main/faster-whisper-cli.py

# Make it executable
chmod +x ~/.local/bin/faster-whisper

macOS Users: Update the shebang

If you installed faster-whisper in a venv (Step 1), you need to update the script's shebang to use the venv Python:

bash
# Replace the first line of the script with the venv interpreter
sed -i '' "1s|.*|#!$HOME/.local/share/faster-whisper-venv/bin/python|" ~/.local/bin/faster-whisper

Step 3: Add to PATH (if not already)

bash
# Add to your shell config (~/.zshrc or ~/.bashrc)
export PATH="$HOME/.local/bin:$PATH"

# Reload shell
source ~/.zshrc  # or ~/.bashrc

Verify installation:

bash
faster-whisper --version
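
If the command is not found or fails, confirm the wrapper is the one on your PATH and that its first line points at the venv interpreter set up above:

bash
# Show where the shell finds the wrapper and which interpreter its shebang uses
command -v faster-whisper
head -1 ~/.local/bin/faster-whisper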

Windows Installation

Step 1: Install the library

powershell
pip install faster-whisper

Step 2: Install CLI wrapper

Open PowerShell and run:

powershell
# Find your Python Scripts folder
python -c "import sys; print(sys.prefix + '\\Scripts')"

# Download the wrapper into that folder (adjust the path below to match the output above)
Invoke-WebRequest -Uri "https://raw.githubusercontent.com/hyungyunlim/obsidian-social-archiver-releases/main/faster-whisper-cli.py" -OutFile "$env:LOCALAPPDATA\Programs\Python\Python311\Scripts\faster-whisper.py"

Step 3: Create a batch wrapper

Create a file named faster-whisper.bat in the same Scripts folder:

batch
@echo off
python "%~dp0faster-whisper.py" %*

Or run this PowerShell command:

powershell
# Passing an array to -Value writes each element as its own line
$scriptsPath = python -c "import sys; print(sys.prefix + '\\Scripts')"
Set-Content -Path "$scriptsPath\faster-whisper.bat" -Value @('@echo off', 'python "%~dp0faster-whisper.py" %*')

Verify installation:

powershell
faster-whisper --version

Option 2: openai-whisper

The original Python implementation by OpenAI. Easy to install, works out of the box. Models are automatically downloaded on first use.

Installation:

bash
# Requires Python 3.8+
pip install openai-whisper

# Or with pipx (recommended for isolation)
pipx install openai-whisper

Verify installation:

bash
whisper --help
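
As an optional end-to-end check outside the plugin, you can transcribe a file directly from the terminal. episode.mp3 below is a placeholder for any local audio file; the flags shown are standard openai-whisper options.

bash
# Transcribe with the small model and write a plain-text transcript to ./transcripts/
whisper episode.mp3 --model small --language en --output_format txt --output_dir ./transcripts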

Option 3: whisper.cpp

High-performance C++ implementation. Best performance on Apple Silicon Macs with Metal GPU acceleration.

Manual Model Download Required

whisper.cpp requires you to manually download GGML model files before use.

Installation:

bash
# macOS (using Homebrew)
brew install whisper-cpp

# Or build from source
git clone https://github.com/ggerganov/whisper.cpp
cd whisper.cpp
cmake -B build && cmake --build build --config Release

Download models:

bash
# Create models directory
mkdir -p ~/whisper-models

# Download small model (recommended, 465MB)
curl -L -o ~/whisper-models/ggml-small.bin \
  https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small.bin

# Other models available:
# ggml-tiny.bin (74MB), ggml-base.bin (142MB)
# ggml-medium.bin (1.5GB), ggml-large-v3.bin (2.9GB)

The plugin searches for models in these locations:

  • ~/whisper-models/
  • ~/.cache/whisper-cpp/
  • ~/whisper.cpp/models/
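
To confirm a downloaded model landed somewhere the plugin will look, list the directory you used:

bash
# The downloaded ggml-*.bin file should appear here
ls -lh ~/whisper-models/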

Verify installation:

bash
whisper-cli --help
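
For an end-to-end test from the terminal, convert a clip to the 16 kHz WAV input whisper.cpp expects and run it against the model downloaded above (episode.mp3 is a placeholder filename; ffmpeg must be installed):

bash
# Convert to 16 kHz mono WAV, whisper.cpp's expected input format
ffmpeg -i episode.mp3 -ar 16000 -ac 1 -c:a pcm_s16le episode.wav

# Transcribe with the small model; -otxt writes the transcript next to the input file
whisper-cli -m ~/whisper-models/ggml-small.bin -f episode.wav -otxt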

Plugin Settings

Configure transcription in Settings → Social Archiver → Transcription Settings:

| Setting | Description |
|---------|-------------|
| Enable Whisper transcription | Toggle transcription feature on/off |
| Preferred Whisper variant | Choose which Whisper implementation to use |
| Preferred model | Select model size (tiny to large) |
| Default language | Auto-detect or specify language |
| Custom Whisper path | Override automatic binary detection |

Choosing a Variant

If you have multiple Whisper variants installed, you can select which one to use:

  • Auto-detect: On Apple Silicon Mac, tries whisper.cpp first; on other systems, tries faster-whisper first
  • faster-whisper: Easiest setup, auto model download, great CPU performance
  • openai-whisper: Original implementation, easy setup, auto model download
  • whisper.cpp: Best for Apple Silicon Macs (Metal GPU acceleration), requires manual model download

Models Are Not Shared

Each variant uses a different model format. Models downloaded for one variant cannot be used by another:

| Variant | Model Format | Storage Location |
|---------|--------------|------------------|
| faster-whisper | CTranslate2 | ~/.cache/huggingface/ |
| openai-whisper | PyTorch (.pt) | ~/.cache/whisper/ |
| whisper.cpp | GGML (.bin) | Manual location |
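
If disk space is a concern, you can see how much each variant's model storage is using. The paths below come from the table above; directories that don't exist are simply skipped.

bash
# Report the size of each variant's model storage (absent paths are ignored)
du -sh ~/.cache/huggingface ~/.cache/whisper ~/whisper-models 2>/dev/null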

How It Works

  1. Archive a podcast: First, subscribe to a podcast or archive an episode with audio
  2. Open in Timeline: View the archived episode in the Timeline view
  3. Click transcribe button: Click the microphone icon on the podcast card
  4. Wait for processing: Transcription progress is shown in the button
  5. View transcript: Expandable transcript appears below the audio player

Transcript Features

Interactive Timestamps

Click any timestamp in the transcript to jump to that point in the audio. The current segment is highlighted during playback.

Search (Desktop)

Use the search box to find specific words or phrases in the transcript.

Collapse/Expand

Transcripts start collapsed to save space. Click to expand and view the full content.

Transcript Storage

Transcripts are saved directly in your markdown files under a ## Transcript section:

markdown
## Transcript

[00:00] Welcome to the podcast...

[00:15] Today we're discussing...

Metadata is stored in the YAML frontmatter:

yaml
transcriptionModel: small
transcriptionLanguage: en
transcriptionDuration: 1847.5
transcriptionTime: 2024-12-12T05:30:00.000Z

Model Selection

Choose a model based on your needs:

| Model | Size | Speed | Accuracy | Best For |
|-------|------|-------|----------|----------|
| tiny | 74MB | ~32x | Low | Quick previews |
| base | 142MB | ~16x | Fair | Short clips |
| small | 466MB | ~6x | Good | Default, balanced |
| medium | 1.5GB | ~2x | High | Important content |
| large | 2.9GB | ~1x | Best | Maximum accuracy |

Speed is relative to audio length (e.g., ~6x means 10 min of audio takes roughly 1.7 min to transcribe).

Recommendation

Start with the small model. It offers a good balance of speed and accuracy for most podcasts.

Troubleshooting

Transcribe Button Doesn't Appear

Cause: No Whisper tool detected in system PATH

Solution:

  1. Verify installation with whisper --help or faster-whisper --version
  2. Restart Obsidian after installing tools
  3. Check that the tool is in your system PATH (a quick check is shown below)
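
Run the following from a regular terminal; each command prints a path if the tool is discoverable and nothing otherwise. Note that GUI apps such as Obsidian may see a narrower PATH than your interactive shell.

bash
# Check which Whisper binaries are visible on PATH
command -v whisper
command -v faster-whisper
command -v whisper-cli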

Transcription Fails or Times Out

Common causes:

  • Insufficient memory for the selected model
  • Corrupted audio file
  • Very long audio (2+ hours)

Solutions:

  • Try a smaller model (e.g., tiny or base)
  • Verify the audio file plays correctly
  • Check available disk space and memory

Poor Transcription Quality

Common causes:

  • Background noise in audio
  • Multiple speakers talking over each other
  • Non-standard accents or technical jargon

Solutions:

  • Use a larger model (medium or large)
  • Specify the language in settings if auto-detection fails

Slow Transcription

Transcription speed depends on:

  • Your CPU/GPU performance
  • Selected model size
  • Audio length

Performance Tips

  • Apple Silicon Macs: Use whisper.cpp for best performance
  • NVIDIA GPU: Use faster-whisper with CUDA support
  • CPU only: Use tiny or base models for reasonable speed

Language Support

Whisper supports 99+ languages with automatic detection. For best results with non-English content:

  1. Let auto-detection identify the language first
  2. If detection fails, manually specify the language in plugin settings

Privacy & Storage

Local Processing Only

All transcription happens locally on your device. Audio files are never uploaded to any server. The transcript is stored only in your Obsidian vault.

Disk Space

Transcripts are text-only and very small (typically under 100KB even for long podcasts). They won't significantly impact your vault size.
