ADR-013: LLM Provider Abstraction

Date: 2025-11-07
Status: Accepted
Deciders: Phase 3 Team
Related: ADR-016, LLM Comparison Research

Context

Phase 3 requires LLM integration for content extraction from podcast transcripts. Multiple LLM providers are available (Claude, Gemini, GPT-4, local models), each with different:

- APIs and interfaces
- Pricing models
- Quality characteristics
- Structured output support
- Rate limits

We need to decide whether to:

1. Hard-code a single provider
2. Support multiple providers via abstraction
3. Build a pluggable provider system

Users have different priorities:

- Some prioritize quality (willing to pay for Claude)
- Some prioritize cost (prefer Gemini)
- Some want privacy (local models)
- Some want flexibility (switch providers)

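Decision

We will implement an abstract BaseExtractor interface with concrete implementations for multiple LLM providers, starting with Claude and Gemini.

Architecture

from abc import ABC, abstractmethod

# ExtractionTemplate and ExtractedContent are project types assumed to be defined elsewhere.

# Abstract base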
class BaseExtractor(ABC):
    @abstractmethod
    async def extract(
        self,
        template: ExtractionTemplate,
        transcript: str,
        metadata: dict,
    ) -> ExtractedContent:
        """Extract content using template and transcript"""
        pass

    @abstractmethod
    def estimate_cost(self, template: ExtractionTemplate, transcript_length: int) -> float:
        """Estimate extraction cost in USD"""
        pass

    @abstractmethod
    def supports_structured_output(self) -> bool:
        """Whether provider supports structured output (JSON mode)"""
        pass

# Concrete implementations
class ClaudeExtractor(BaseExtractor):
    """Claude API extractor"""
    ...

class GeminiExtractor(BaseExtractor):
    """Gemini API extractor"""
    ...

# Future
class LocalExtractor(BaseExtractor):
    """Local model extractor (Ollama, llama.cpp)"""
    ...

# Factory
class ExtractorFactory:
    @staticmethod
    def create(provider: str, api_key: str, **kwargs) -> BaseExtractor:
        if provider == "claude":
            return ClaudeExtractor(api_key, **kwargs)
        elif provider == "gemini":
            return GeminiExtractor(api_key, **kwargs)
        else:
            raise ValueError(f"Unknown provider: {provider}")
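
To make the interface concrete, here is a minimal sketch of a provider that satisfies it without calling any API. EchoExtractor, run_extraction, and the two dataclasses are hypothetical stand-ins introduced only for illustration; the real ExtractionTemplate and ExtractedContent types may carry different fields.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class ExtractionTemplate:
    """Hypothetical stand-in for the project's template type."""
    name: str
    prompt: str


@dataclass
class ExtractedContent:
    """Hypothetical stand-in for the project's result type."""
    template_name: str
    content: str


class BaseExtractor(ABC):
    """Same three abstract methods as the interface in this ADR."""

    @abstractmethod
    async def extract(self, template, transcript, metadata) -> ExtractedContent: ...

    @abstractmethod
    def estimate_cost(self, template, transcript_length) -> float: ...

    @abstractmethod
    def supports_structured_output(self) -> bool: ...


class EchoExtractor(BaseExtractor):
    """Hypothetical offline provider: returns the first sentence of the transcript."""

    async def extract(self, template, transcript, metadata):
        first_sentence = transcript.split(".")[0].strip()
        return ExtractedContent(template.name, first_sentence)

    def estimate_cost(self, template, transcript_length):
        return 0.0  # no API call, so no cost

    def supports_structured_output(self):
        return False


def run_extraction(extractor: BaseExtractor, template, transcript) -> ExtractedContent:
    """Drive the async extract() call from synchronous code such as a CLI command."""
    return asyncio.run(extractor.extract(template, transcript, metadata={}))


result = run_extraction(
    EchoExtractor(),
    ExtractionTemplate(name="summary", prompt="Summarize the episode."),
    "Hello world. More text follows.",
)
```

Because callers depend only on BaseExtractor, swapping EchoExtractor for a real provider should not require changes at the call site.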

Configuration

# Global default
default_llm_provider: claude

# Provider API keys (via environment or config)
anthropic_api_key: ${ANTHROPIC_API_KEY}
google_api_key: ${GOOGLE_API_KEY}

# Provider-specific settings
llm_providers:
  claude:
    model: claude-sonnet-4-5-20250929
    default_temperature: 0.3
    default_max_tokens: 2000
  gemini:
    model: gemini-2.0-flash-exp
    default_temperature: 0.3
    default_max_tokens: 2000
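
The ${ANTHROPIC_API_KEY} style placeholders need to be resolved when the config is loaded. How Inkwell does this is not specified here; the following is one possible sketch that operates on an already-parsed config dictionary. The function names expand_env and expand_config are hypothetical.

```python
import os
import re
from typing import Any, Mapping, Optional

# Matches ${VAR_NAME} placeholders as used in the config above.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace each ${VAR} in value, failing loudly if VAR is not set."""
    env = os.environ if env is None else env

    def _substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in env:
            raise ValueError(f"Environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(_substitute, value)


def expand_config(node: Any, env: Optional[Mapping[str, str]] = None) -> Any:
    """Walk a parsed config (dicts, lists, scalars) and expand every string."""
    if isinstance(node, dict):
        return {key: expand_config(val, env) for key, val in node.items()}
    if isinstance(node, list):
        return [expand_config(val, env) for val in node]
    if isinstance(node, str):
        return expand_env(node, env)
    return node


raw_config = {
    "default_llm_provider": "claude",
    "anthropic_api_key": "${ANTHROPIC_API_KEY}",
    "llm_providers": {"claude": {"default_temperature": 0.3}},
}
config = expand_config(raw_config, env={"ANTHROPIC_API_KEY": "sk-test"})
```

Raising on an unset variable, rather than leaving the placeholder in place, keeps a literal "${...}" string from being sent to a provider as a key.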

Template-Level Override

# A template can specify a preferred provider
name: quotes
model_preference: claude  # Use Claude for this template

# Or let the system decide (a separate template)
name: summary
model_preference: auto  # Use default provider

CLI Override

# Use specific provider
inkwell fetch "podcast" --latest --llm-provider gemini

# Use default
inkwell fetch "podcast" --latest
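
With three places to set a provider (global default, template preference, CLI flag), some precedence rule is needed. This ADR does not state one, so the sketch below assumes CLI flag first, then a non-"auto" template preference, then the global default. The function name resolve_provider is hypothetical.

```python
from typing import Optional


def resolve_provider(
    cli_provider: Optional[str],
    template_preference: Optional[str],
    default_provider: str,
) -> str:
    """Pick the provider for one extraction.

    Assumed precedence (not specified by the ADR):
    CLI flag > template model_preference > default_llm_provider.
    """
    if cli_provider:
        return cli_provider
    if template_preference and template_preference != "auto":
        return template_preference
    return default_provider
```

Under this assumption, `--llm-provider gemini` would override a template that prefers Claude, which matches the usual expectation that an explicit command-line flag wins.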

Alternatives Considered

Alternative 1: Hard-code Claude Only

Pros:

- Simplest implementation
- Best quality (Claude excels at extraction)
- No abstraction overhead
- Fewer edge cases

Cons:

- No flexibility for users
- Vendor lock-in
- Higher costs for all users
- Can't use free Gemini tier
- No path to local models

Rejected because: Users need cost/quality trade-offs

Alternative 2: Hard-code Gemini Only

Pros:

- Cheapest option
- Generous free tier
- Good for experimentation
- Large context window

Cons:

- Lower quality extractions
- Less consistent structured output
- Professional users need better quality
- Quote accuracy issues

Rejected because: Quality matters for production use

Alternative 3: User Must Choose (No Default)

Pros:

- Forces conscious choice
- No opinionated defaults
- Clear cost implications

Cons:

- Poor user experience (friction)
- Requires understanding of providers
- Extra configuration burden
- Decision paralysis

Rejected because: Defaults should work out-of-box

Alternative 4: Always Ask at Runtime

Pros:

- User control every time
- Cost visibility

Cons:

- Annoying for repeated use
- Breaks automation
- Poor CLI UX

Rejected because: Too much friction

Rationale

Why Abstraction?

1. User Flexibility: Different users have different priorities
   - Budget-conscious → Gemini
   - Quality-focused → Claude
   - Privacy-focused → Local models (future)
2. Template-Specific Needs: Different templates have different requirements
   - Quotes → Claude (precision critical)
   - Summary → Either (both work well)
   - Experiments → Gemini (cheap iteration)
3. Future-Proofing: New providers will emerge
   - GPT-4 structured output
   - Local models (Ollama, llama.cpp)
   - Specialized models
4. Cost Optimization: Smart provider selection
   - Auto-route based on template complexity
   - Fallback to cheaper provider if quality sufficient
   - Batch processing with cost-effective provider
5. Vendor Independence: Avoid lock-in
   - Provider pricing changes
   - Service availability
   - API changes
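
Cost-based selection depends on each provider's estimate_cost being comparable. The sketch below shows one way such an estimate could be computed and used to pick the cheapest provider. Everything here is an assumption: the prices are placeholders rather than real provider rates, the four-characters-per-token ratio is a rough heuristic, and transcript_length is taken to be a character count.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Pricing:
    """USD per one million tokens. The values used below are placeholders."""
    input_per_million: float
    output_per_million: float


# Assumption: roughly four characters per token for English text.
CHARS_PER_TOKEN = 4


def estimate_cost(pricing: Pricing, transcript_length: int,
                  prompt_length: int, max_output_tokens: int) -> float:
    """Estimate the USD cost of one extraction from character counts."""
    input_tokens = (transcript_length + prompt_length) / CHARS_PER_TOKEN
    total = (input_tokens * pricing.input_per_million
             + max_output_tokens * pricing.output_per_million)
    return total / 1_000_000


def cheapest_provider(prices: Dict[str, Pricing], transcript_length: int,
                      prompt_length: int, max_output_tokens: int) -> str:
    """Return the name of the provider with the lowest estimated cost."""
    return min(
        prices,
        key=lambda name: estimate_cost(
            prices[name], transcript_length, prompt_length, max_output_tokens
        ),
    )


# Placeholder prices for illustration only, not real provider rates.
EXAMPLE_PRICES = {
    "provider_a": Pricing(input_per_million=2.0, output_per_million=10.0),
    "provider_b": Pricing(input_per_million=0.5, output_per_million=2.0),
}
```

The estimate assumes the response uses the full max_output_tokens budget, so it is an upper bound rather than an expected cost. Real routing would also have to weigh quality, which this sketch ignores.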

Why Not Full Plugin System?

Considered but deferred:

- Full plugin architecture with external providers
- User-written provider plugins
- Dynamic provider loading

Reasons:

- Over-engineering for current needs
- Security risk (running user code)
- Complexity not justified yet
- Can add later if needed

Current approach:

- Built-in providers (Claude, Gemini)
- Clean interface for adding more
- Extensible without plugins
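
One way to keep built-in providers easy to extend without dynamic plugin loading is an in-process registry that replaces the if/elif chain in the factory. This is a sketch of that alternative, not the design adopted by this ADR; register_provider and create_extractor are hypothetical names, and the extractor classes are empty stand-ins.

```python
from typing import Callable, Dict, Type


class BaseExtractor:
    """Stand-in base class; the real one is the ABC defined in this ADR."""

    def __init__(self, api_key: str, **kwargs):
        self.api_key = api_key
        self.options = kwargs


# Maps provider name to extractor class. Only code shipped with the
# application registers here, so no user-supplied code is loaded.
_REGISTRY: Dict[str, Type[BaseExtractor]] = {}


def register_provider(name: str) -> Callable[[Type[BaseExtractor]], Type[BaseExtractor]]:
    """Class decorator that records a built-in provider under the given name."""
    def decorator(cls: Type[BaseExtractor]) -> Type[BaseExtractor]:
        _REGISTRY[name] = cls
        return cls
    return decorator


@register_provider("claude")
class ClaudeExtractor(BaseExtractor):
    """Stand-in for the Claude API extractor."""


@register_provider("gemini")
class GeminiExtractor(BaseExtractor):
    """Stand-in for the Gemini API extractor."""


def create_extractor(provider: str, api_key: str, **kwargs) -> BaseExtractor:
    """Look up the provider and construct it, listing known names on failure."""
    try:
        extractor_cls = _REGISTRY[provider]
    except KeyError:
        known = ", ".join(sorted(_REGISTRY))
        raise ValueError(f"Unknown provider: {provider} (known: {known})") from None
    return extractor_cls(api_key, **kwargs)
```

Adding a provider then means adding one decorated class; the factory function itself does not change.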

Consequences

Positive

- ✅ User Choice: Users select cost/quality trade-off
- ✅ Flexibility: Easy to add new providers
- ✅ Testing: Can mock providers for tests
- ✅ Cost Control: Users can optimize costs
- ✅ Future-Proof: Ready for new models

Negative

- ❌ Complexity: More code than single provider
- ❌ Maintenance: Must update multiple integrations
- ❌ Testing Burden: Test all providers
- ❌ Documentation: Explain provider differences
- ❌ Configuration: More settings to manage

Mitigations

1. Good Defaults: Claude default works well
2. Clear Docs: Explain when to use which provider
3. Shared Code: Abstract common logic
4. Comprehensive Tests: Mock all providers
5. Provider Recommendations: Guide users to right choice

Implementation Plan

Phase 1: Core Abstraction (Unit 4)

- Define BaseExtractor interface
- Implement ClaudeExtractor
- Implement GeminiExtractor
- Create ExtractorFactory

Phase 2: Configuration (Unit 4)

- Add provider configuration to config schema
- Environment variable support
- Template-level override
- CLI flag support

Phase 3: Testing (Unit 4-5)

- Mock extractor for tests
- Test each provider implementation
- Integration tests with real APIs
- Cost estimation tests

Phase 4: Documentation (Unit 4)

- Provider comparison guide
- Configuration examples
- When to use which provider
- Cost optimization tips

Validation

Success Criteria

- ✅ Can switch providers via config
- ✅ Templates can specify preferred provider
- ✅ CLI can override provider
- ✅ Costs estimated accurately per provider
- ✅ Tests pass with mocked providers
- ✅ Clear error messages when API key missing
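
The last criterion, a clear error when an API key is missing, could be met with a small check before constructing an extractor. The sketch below is one possible shape; MissingAPIKeyError and require_api_key are hypothetical names, while the environment variable names come from the Configuration section.

```python
import os
from typing import Mapping, Optional

# Environment variable names taken from the Configuration section.
API_KEY_ENV_VARS = {
    "claude": "ANTHROPIC_API_KEY",
    "gemini": "GOOGLE_API_KEY",
}


class MissingAPIKeyError(RuntimeError):
    """Hypothetical error type raised when a provider has no API key."""


def require_api_key(provider: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key for provider, or raise with an actionable message."""
    env = os.environ if env is None else env
    if provider not in API_KEY_ENV_VARS:
        raise ValueError(f"Unknown provider: {provider}")
    var_name = API_KEY_ENV_VARS[provider]
    key = env.get(var_name, "").strip()
    if not key:
        raise MissingAPIKeyError(
            f"No API key found for provider '{provider}'. "
            f"Set the {var_name} environment variable or choose another provider."
        )
    return key
```

Naming the exact variable in the message tells the user what to set, which is the difference between this and an opaque authentication failure from the provider's API.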

Testing Strategy

import os

import pytest

# Unit tests with mock (async tests assume the pytest-asyncio plugin)
@pytest.mark.asyncio
async def test_extract_with_claude(mock_anthropic):
    extractor = ClaudeExtractor(api_key="test")
    result = await extractor.extract(template, transcript, metadata)
    assert result.content is not None

# Integration tests (marked slow)
@pytest.mark.integration
@pytest.mark.skipif(not os.getenv("ANTHROPIC_API_KEY"), reason="API key required")
@pytest.mark.asyncio
async def test_real_claude_extraction():
    extractor = ClaudeExtractor(api_key=os.getenv("ANTHROPIC_API_KEY"))
    result = await extractor.extract(sample_template, sample_transcript, {})
    assert result.confidence > 0.8
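
Code that consumes an extractor can be tested without any provider SDK by substituting a mock that honors the same three methods. The sketch below uses only the standard library; make_mock_extractor and extract_all are hypothetical helpers, and the result object is a simple stand-in for ExtractedContent.

```python
import asyncio
from types import SimpleNamespace
from unittest.mock import AsyncMock, MagicMock


def make_mock_extractor(content: str) -> MagicMock:
    """Build a mock with the BaseExtractor surface: async extract, sync helpers."""
    extractor = MagicMock()
    # extract() is async in the interface, so it must be an AsyncMock.
    extractor.extract = AsyncMock(
        return_value=SimpleNamespace(content=content, confidence=1.0)
    )
    extractor.estimate_cost.return_value = 0.0
    extractor.supports_structured_output.return_value = True
    return extractor


async def extract_all(extractor, templates, transcript):
    """Hypothetical caller under test: run every template against one transcript."""
    return [await extractor.extract(t, transcript, {}) for t in templates]


mock_extractor = make_mock_extractor("stubbed content")
results = asyncio.run(
    extract_all(mock_extractor, ["quotes", "summary"], "transcript text")
)
```

The mock records each awaited call, so a test can assert both on the returned content and on how many times the provider would have been invoked.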

References

- ADR-016: API Provider Abstraction - Provider abstraction for API access
- LLM Extraction Comparison - Provider comparison research
- Structured Extraction Patterns - How to prompt each provider

Revision History

- 2025-11-07: Initial decision (Phase 3 Unit 1)