Content Extraction

Configure templates, providers, and costs for AI-powered content extraction.

How It Works

Inkwell uses AI to extract structured information from podcast transcripts:

Transcript → Template Selection → AI Extraction → Markdown Output

Each template defines what to extract (quotes, summary, concepts, etc.) and produces its own Markdown file.

Available Templates

Template	Description	Output
`summary`	Episode overview and key takeaways	`summary.md`
`quotes`	Notable quotes with speaker attribution	`quotes.md`
`key-concepts`	Main ideas and concepts discussed	`key-concepts.md`
`tools-mentioned`	Software, apps, products mentioned	`tools-mentioned.md`
`books-mentioned`	Books and resources referenced	`books-mentioned.md`

Template Selection

Automatic (Default)

Templates are auto-selected based on episode category:

Category	Templates
`tech`	summary, quotes, tools-mentioned
`business`	summary, quotes, key-concepts
`interview`	summary, quotes
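The category table above amounts to a simple lookup. The following is a minimal sketch of that idea, not Inkwell's actual implementation: the names `CATEGORY_TEMPLATES` and `select_templates` are hypothetical, and rejecting unknown categories is an assumption, since the fallback for other categories is not documented here.

```python
# Hypothetical lookup mirroring the category table; not Inkwell's real code.
CATEGORY_TEMPLATES = {
    "tech": ["summary", "quotes", "tools-mentioned"],
    "business": ["summary", "quotes", "key-concepts"],
    "interview": ["summary", "quotes"],
}


def select_templates(category, override=None):
    """Return the templates to run for an episode.

    An explicit list (as with --templates) takes precedence over the
    category defaults.
    """
    if override:
        return list(override)
    # Assumption: unknown categories raise; the real fallback is undocumented.
    if category not in CATEGORY_TEMPLATES:
        raise ValueError(f"unknown category: {category!r}")
    return list(CATEGORY_TEMPLATES[category])


print(select_templates("tech"))
```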

Manual Selection

Override the automatic selection with --templates:

# Specific templates
inkwell fetch URL --templates summary,quotes

# All templates
inkwell fetch URL --templates summary,quotes,key-concepts,tools-mentioned,books-mentioned

Category Override

Force a category to change which templates are auto-selected:

inkwell fetch URL --category tech

Providers

Inkwell supports two direct API providers for automatic extraction:

Gemini (Default)

Cost: ~$0.003 per template
Quality: Good for most use cases
Speed: Fast

Claude

Cost: ~$0.12 per template (about 40x Gemini's per-template cost)
Quality: Best for precision tasks
Speed: Moderate

Smart Selection (Default)

Inkwell automatically chooses a provider for each template:

Gemini: summary, key-concepts, tools-mentioned
Claude: quotes, books-mentioned (precision matters)
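The per-template split above can be sketched as a small routing function. This is an illustration only: `CLAUDE_TEMPLATES` and `choose_provider` are hypothetical names, and sending any unlisted template to Gemini is an assumption based on Gemini being the default provider.

```python
# Hypothetical routing rule based on the documented split; not Inkwell's real code.
CLAUDE_TEMPLATES = {"quotes", "books-mentioned"}  # precision-sensitive templates
VALID_PROVIDERS = ("gemini", "claude")


def choose_provider(template, forced=None):
    """Pick a provider for one template.

    A forced provider (as with --provider) applies to every template.
    """
    if forced is not None:
        if forced not in VALID_PROVIDERS:
            raise ValueError(f"unknown provider: {forced!r}")
        return forced
    # Assumption: templates not listed for Claude fall back to Gemini.
    return "claude" if template in CLAUDE_TEMPLATES else "gemini"


for name in ("summary", "quotes", "books-mentioned"):
    print(name, "->", choose_provider(name))
```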

Force Provider

# Use Gemini for everything (lowest cost)
inkwell fetch URL --provider gemini

# Use Claude for everything (highest quality)
inkwell fetch URL --provider claude

Local CLI users may instead make an explicit, non-fallback choice of --extractor codex or --extractor claude-code. See Local Codex Extraction and Local Claude Extraction. These backends never run in hosted workers and never enter automatic provider routing.

Costs

Cost Estimation

Check costs before processing:

inkwell fetch URL --dry-run

Output:

Estimated cost: $0.0090
  • summary (gemini): $0.0030
  • quotes (claude): $0.0045
  • key-concepts (gemini): $0.0015
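The total in the sample output is the sum of the per-template estimates. The sketch below reproduces that arithmetic and layout using the figures shown above; `format_estimate` is a hypothetical helper, and using `Decimal` is a design choice of this example to avoid floating-point rounding in small dollar amounts, not a statement about how Inkwell computes costs.

```python
from decimal import Decimal


def format_estimate(items):
    """Build a dry-run style report from (template, provider, cost) tuples."""
    costs = [(t, p, Decimal(str(c))) for t, p, c in items]
    total = sum((c for _, _, c in costs), Decimal("0"))
    lines = [f"Estimated cost: ${total:.4f}"]
    lines += [f"  • {t} ({p}): ${c:.4f}" for t, p, c in costs]
    return "\n".join(lines)


# Per-template figures copied from the sample output above.
report = format_estimate([
    ("summary", "gemini", "0.0030"),
    ("quotes", "claude", "0.0045"),
    ("key-concepts", "gemini", "0.0015"),
])
print(report)
```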

Typical Costs

Episode Size	Templates	Provider	Cost
30 min (~5k words)	3	Gemini	~$0.003
30 min (~5k words)	3	Claude	~$0.045
120 min (~20k words)	5	Gemini	~$0.012
120 min (~20k words)	5	Claude	~$0.180

Track Spending

inkwell costs

Output:

┌ Overall ─────────────────────┐
│ Total Operations:  15         │
│ Total Cost:        $0.0825    │
│                               │
│ By Provider:                  │
│   gemini    $0.0525           │
│   claude    $0.0300           │
│                               │
│ By Operation:                 │
│   extraction    $0.0825       │
└───────────────────────────────┘

Caching

Extractions are cached to save time and money.

How It Works

Cache key includes transcript hash, template name/version, provider, model, prompt hash, output schema version, and cache format version
Default cache duration: 30 days
Cache location: ~/.cache/inkwell/extractions/
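One way to picture a composite cache key is to hash all of the listed fields together, so that changing any one of them yields a different key. The sketch below assumes SHA-256 and a canonical JSON encoding; both are choices made for this example, and `extraction_cache_key` and its parameter names are hypothetical rather than Inkwell's actual API.

```python
import hashlib
import json


def _sha256(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def extraction_cache_key(*, transcript, template_name, template_version,
                         provider, model, prompt, schema_version,
                         cache_format_version):
    """Derive one key from every field the cache key is documented to include."""
    fields = {
        "transcript_hash": _sha256(transcript),
        "template_name": template_name,
        "template_version": template_version,
        "provider": provider,
        "model": model,
        "prompt_hash": _sha256(prompt),
        "schema_version": schema_version,
        "cache_format_version": cache_format_version,
    }
    # Sorted keys and fixed separators keep the encoding stable across runs.
    canonical = json.dumps(fields, sort_keys=True, separators=(",", ":"))
    return _sha256(canonical)


base = dict(transcript="hello world", template_name="summary",
            template_version=2, provider="gemini", model="example-model",
            prompt="Summarize the episode.", schema_version=1,
            cache_format_version=1)
key_v2 = extraction_cache_key(**base)
key_v3 = extraction_cache_key(**{**base, "template_version": 3})
print(key_v2 != key_v3)
```

Because the template version is part of the hashed fields, bumping it produces a new key, which is consistent with the documented invalidation on template updates.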

Cache Behavior

Scenario	Result
Same transcript, same template	Cache hit ($0)
Same transcript, different template	New extraction
Template version updated	Cache invalidated
Prompt, provider, model, or output schema changed	Cache invalidated

See Cache Behavior for transcript, extraction, and media/audio cache details.
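The hit and miss scenarios, together with the 30-day default duration, can be illustrated with a small in-memory stand-in. This is a sketch under stated assumptions: `ExtractionCache` is a hypothetical class, the real cache lives on disk, and treating an entry older than the duration as a miss is this example's reading of "cache duration".

```python
import time

CACHE_TTL_SECONDS = 30 * 24 * 60 * 60  # 30 days, the documented default


class ExtractionCache:
    """In-memory stand-in for the on-disk cache; keys are opaque strings."""

    def __init__(self, ttl=CACHE_TTL_SECONDS, clock=time.time):
        self._ttl = ttl
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._entries = {}

    def put(self, key, value):
        self._entries[key] = (self._clock(), value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None  # miss: a new extraction would be needed
        stored_at, value = entry
        if self._clock() - stored_at > self._ttl:
            del self._entries[key]  # expired entries count as misses
            return None
        return value


cache = ExtractionCache()
cache.put("transcript-abc:summary", "cached summary")
print(cache.get("transcript-abc:summary"))  # same transcript, same template
print(cache.get("transcript-abc:quotes"))   # same transcript, different template
```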

Skip Cache

Force fresh extraction:

inkwell fetch URL --skip-cache

Concurrency

Templates are extracted in parallel for faster processing:

Default: 5 concurrent extractions
Up to ~5x faster than sequential extraction when five templates run at once
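Bounded parallelism of this kind is commonly expressed with a semaphore. The sketch below assumes an asyncio-based design, which this page does not confirm; `extract_one` is a hypothetical placeholder for a provider call, and only the limit of 5 comes from the documentation.

```python
import asyncio

MAX_CONCURRENT = 5  # documented default


async def extract_one(template):
    """Hypothetical placeholder for a real provider call."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"{template}.md"


async def extract_all(templates, limit=MAX_CONCURRENT):
    """Run all extractions, with at most `limit` in flight at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(template):
        async with semaphore:
            return await extract_one(template)

    # gather returns results in the same order as its inputs.
    return await asyncio.gather(*(bounded(t) for t in templates))


outputs = asyncio.run(extract_all(["summary", "quotes", "key-concepts"]))
print(outputs)
```

With five or fewer templates, every extraction starts immediately, so total time approaches that of the slowest single call; this is where the best-case speedup over sequential processing comes from.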

Template Versioning

Templates include version numbers:

# templates/summary.yaml
name: summary
version: 2  # Incremented when prompt changes
expected_format: text

When a template is updated:

New version number invalidates cache
Next extraction uses updated prompt
Old cached results are ignored

Cost Optimization Tips

Use YouTube transcripts - They're free
Default to Gemini - About 1/40th of Claude's per-template cost
Select fewer templates - Only extract what you need
Leverage caching - Don't re-process unnecessarily
Check with --dry-run - Know costs before committing

Next Steps

Interview Mode - Capture personal insights
Templates Reference - Template details
Configuration - Cost limits and defaults