Template Format Evaluation: YAML vs Alternatives¶

Date: 2025-11-07 Author: Phase 3 Research Status: Complete Related: Phase 3 Plan

Overview¶

This document evaluates different formats for defining extraction templates. Templates control how content is extracted from transcripts and must be human-readable, validatable, and extensible. We compare YAML, TOML, JSON, and Python dataclasses.

Requirements for Template Format¶

Functional Requirements¶

Human Readable: Non-developers should be able to create templates
Comments Support: Document template purpose and variables
Multi-line Strings: Prompts can be long and complex
Validation: Schema validation for correctness
Version Control Friendly: Diffs should be readable
Extensible: Easy to add new fields without breaking old templates

Non-Functional Requirements¶

Parsing Speed: Fast template loading
Error Messages: Clear validation errors
Editor Support: Syntax highlighting, autocomplete
Standard Library: Preferably no exotic dependencies
Ecosystem: Good tooling and examples

Format Comparison¶

1. YAML (YAML Ain't Markup Language)¶

Example Template¶

# Summary extraction template
name: summary
version: "1.0"
description: Generate a comprehensive episode summary

system_prompt: |
  You are an expert podcast analyst. Your task is to create
  a clear, concise summary of the podcast episode.

  Focus on:
  - Main topics discussed
  - Key takeaways
  - Notable insights

user_prompt_template: |
  Please summarize the following podcast transcript.

  Podcast: {{ metadata.podcast_name }}
  Episode: {{ metadata.episode_title }}
  Duration: {{ metadata.duration }}

  Transcript:
  {{ transcript }}

  Provide a 2-3 paragraph summary followed by 3-5 key takeaways.

expected_format: markdown
max_tokens: 2000
temperature: 0.3

# Category-specific configuration
applies_to:
  - all

priority: 0

Pros¶

✅ Highly Readable: Clear, clean syntax ✅ Comments: Native support with # ✅ Multi-line Strings: Excellent support with | and > ✅ No Quotes Required: Simple values don't need quotes ✅ Wide Adoption: Used by many tools (Docker, Kubernetes, GitHub Actions) ✅ Editor Support: Excellent syntax highlighting and validation ✅ Complex Structures: Easy nesting and lists

Cons¶

❌ Indentation Sensitivity: Whitespace matters (can cause errors) ❌ Type Ambiguity: no becomes False, 1.0 might be string or number ❌ Security: yaml.unsafe_load() can execute code (mitigated with safe_load) ❌ Parsing Complexity: More complex parser than JSON ❌ Duplicate Keys: Silently overwrites (YAML spec allows it)

Use Cases¶

✅ Configuration files (most common use) ✅ CI/CD pipelines (GitHub Actions, GitLab CI) ✅ Kubernetes manifests ✅ Human-edited files

Validation Example¶

import yaml
from pydantic import BaseModel

# Load and validate
with open("template.yaml") as f:
    data = yaml.safe_load(f)
    template = ExtractionTemplate(**data)  # Pydantic validation

2. TOML (Tom's Obvious Minimal Language)¶

Example Template¶

# Summary extraction template
name = "summary"
version = "1.0"
description = "Generate a comprehensive episode summary"

system_prompt = """
You are an expert podcast analyst. Your task is to create
a clear, concise summary of the podcast episode.

Focus on:
- Main topics discussed
- Key takeaways
- Notable insights
"""

user_prompt_template = """
Please summarize the following podcast transcript.

Podcast: {{ metadata.podcast_name }}
Episode: {{ metadata.episode_title }}
Duration: {{ metadata.duration }}

Transcript:
{{ transcript }}

Provide a 2-3 paragraph summary followed by 3-5 key takeaways.
"""

expected_format = "markdown"
max_tokens = 2000
temperature = 0.3

applies_to = ["all"]
priority = 0

Pros¶

✅ Type Safety: Explicit types (strings, ints, floats, booleans) ✅ Comments: Native support with # ✅ Multi-line Strings: Triple quotes """ ✅ No Indentation Issues: Uses [sections] instead ✅ Simple Parser: Unambiguous syntax ✅ Growing Adoption: Python pyproject.toml, Rust Cargo.toml

Cons¶

❌ Verbosity: Requires quotes for all strings ❌ Limited Nesting: Awkward for deep hierarchies ❌ Less Familiar: Not as widely known as YAML/JSON ❌ Editor Support: Improving but not universal ❌ Complex Lists: Array of tables syntax is verbose

Use Cases¶

✅ Python projects (pyproject.toml) ✅ Rust projects (Cargo.toml) ✅ Configuration with strict types

Validation Example¶

import tomli  # Python 3.11+ has tomllib in stdlib
from pydantic import BaseModel

with open("template.toml", "rb") as f:
    data = tomli.load(f)
    template = ExtractionTemplate(**data)

3. JSON (JavaScript Object Notation)¶

Example Template¶

{
  "name": "summary",
  "version": "1.0",
  "description": "Generate a comprehensive episode summary",
  "system_prompt": "You are an expert podcast analyst. Your task is to create\na clear, concise summary of the podcast episode.\n\nFocus on:\n- Main topics discussed\n- Key takeaways\n- Notable insights",
  "user_prompt_template": "Please summarize the following podcast transcript.\n\nPodcast: {{ metadata.podcast_name }}\nEpisode: {{ metadata.episode_title }}\nDuration: {{ metadata.duration }}\n\nTranscript:\n{{ transcript }}\n\nProvide a 2-3 paragraph summary followed by 3-5 key takeaways.",
  "expected_format": "markdown",
  "max_tokens": 2000,
  "temperature": 0.3,
  "applies_to": ["all"],
  "priority": 0
}

Pros¶

✅ Standardized: RFC 8259 spec, universal support ✅ Fast Parsing: Very efficient parsers ✅ No Ambiguity: Strict syntax, no edge cases ✅ Editor Support: Universal syntax highlighting ✅ Validation: JSON Schema standard ✅ Language Agnostic: Works everywhere

Cons¶

❌ No Comments: Biggest limitation for templates ❌ Verbose: Requires quotes for keys and string values ❌ Multi-line Strings: Awkward with \n escapes ❌ Trailing Commas: Not allowed (causes errors) ❌ Human Readability: Less readable than YAML/TOML

Use Cases¶

✅ API responses ✅ Configuration generated by machines ✅ Data interchange ✅ Strict validation needed

Validation Example¶

import json
from pydantic import BaseModel

with open("template.json") as f:
    data = json.load(f)
    template = ExtractionTemplate(**data)

4. Python Dataclasses / Pydantic¶

Example Template¶

# templates/summary.py
from extraction.models import ExtractionTemplate

summary_template = ExtractionTemplate(
    name="summary",
    version="1.0",
    description="Generate a comprehensive episode summary",
    system_prompt="""
    You are an expert podcast analyst. Your task is to create
    a clear, concise summary of the podcast episode.

    Focus on:
    - Main topics discussed
    - Key takeaways
    - Notable insights
    """,
    user_prompt_template="""
    Please summarize the following podcast transcript.

    Podcast: {{ metadata.podcast_name }}
    Episode: {{ metadata.episode_title }}
    Duration: {{ metadata.duration }}

    Transcript:
    {{ transcript }}

    Provide a 2-3 paragraph summary followed by 3-5 key takeaways.
    """,
    expected_format="markdown",
    max_tokens=2000,
    temperature=0.3,
    applies_to=["all"],
    priority=0,
)

Pros¶

✅ Type Safety: Full Python type checking ✅ IDE Support: Autocomplete, refactoring, type hints ✅ Comments: Python docstrings and # comments ✅ Validation: Built-in with Pydantic ✅ Programmatic: Can compute values, import modules ✅ No Parsing: Native Python objects

Cons¶

❌ Requires Python Knowledge: Non-developers blocked ❌ Security Risk: Running user Python code is dangerous ❌ Harder to Edit: Need proper Python setup ❌ Version Control: Diffs less clear than declarative formats ❌ Not Portable: Can't easily share across languages

Use Cases¶

✅ Built-in templates (shipped with tool) ✅ Developer-only templates ✅ Complex logic (computed values)

Validation Example¶

# Already validated at definition time
from templates.summary import summary_template

Head-to-Head Comparison¶

Readability (1-10 scale)¶

Format	Score	Notes
YAML	9	Clean, minimal syntax
TOML	7	More verbose but clear
JSON	5	Quoted keys/values, escaped newlines
Python	8	Familiar to developers

User-Friendliness (1-10 scale)¶

Format	Score	Notes
YAML	9	Easy for non-developers
TOML	7	Requires learning
JSON	4	Painful for multi-line strings
Python	3	Requires Python knowledge

Validation Support (1-10 scale)¶

Format	Score	Notes
YAML	8	Pydantic + PyYAML
TOML	8	Pydantic + tomli
JSON	10	JSON Schema standard
Python	10	Native Pydantic

Editor Support (1-10 scale)¶

Format	Score	Notes
YAML	10	Universal support
TOML	7	Growing support
JSON	10	Universal support
Python	10	Best IDE support

Safety (1-10 scale)¶

Format	Score	Notes
YAML	7	safe_load required
TOML	9	Very safe
JSON	10	Completely safe
Python	3	Code execution risk

Decision Matrix¶

Weighted Scoring¶

Criteria	Weight	YAML	TOML	JSON	Python
Human Readable	25%	9	7	5	8
User-Friendly	25%	9	7	4	3
Comments Support	15%	10	10	0	10
Multi-line Strings	15%	10	9	3	10
Validation	10%	8	8	10	10
Safety	10%	7	9	10	3
Total	100%	8.8	7.9	4.6	6.7

Winner: YAML (8.8/10)

Real-World Template Examples¶

YAML Template (Recommended)¶

# tools-mentioned.yaml
name: tools-mentioned
version: "1.0"
description: Extract tools, frameworks, and libraries mentioned in tech podcasts

category: tech

system_prompt: |
  You are a technical expert analyzing a technology podcast.
  Extract all tools, frameworks, libraries, and technologies mentioned.

  For each tool, provide:
  - Name
  - Category (language, framework, library, tool, service)
  - Context (how it was discussed)
  - Timestamp (if mentioned)

user_prompt_template: |
  Analyze this tech podcast transcript and extract all mentioned tools.

  Transcript:
  {{ transcript }}

  Return a JSON array of tools with: name, category, context, timestamp.

expected_format: json
output_schema:
  type: object
  properties:
    tools:
      type: array
      items:
        type: object
        required: [name, category]
        properties:
          name:
            type: string
          category:
            type: string
            enum: [language, framework, library, tool, service, other]
          context:
            type: string
          timestamp:
            type: string
            pattern: "^\\d+:\\d+$"

applies_to:
  - tech
  - programming

priority: 5
model_preference: claude
max_tokens: 2000
temperature: 0.2

Comparison: Same Template in JSON¶

{
  "name": "tools-mentioned",
  "version": "1.0",
  "description": "Extract tools, frameworks, and libraries mentioned in tech podcasts",
  "category": "tech",
  "system_prompt": "You are a technical expert analyzing a technology podcast.\nExtract all tools, frameworks, libraries, and technologies mentioned.\n\nFor each tool, provide:\n- Name\n- Category (language, framework, library, tool, service)\n- Context (how it was discussed)\n- Timestamp (if mentioned)",
  "user_prompt_template": "Analyze this tech podcast transcript and extract all mentioned tools.\n\nTranscript:\n{{ transcript }}\n\nReturn a JSON array of tools with: name, category, context, timestamp.",
  "expected_format": "json",
  "output_schema": {
    "type": "object",
    "properties": {
      "tools": {
        "type": "array",
        "items": {
          "type": "object",
          "required": ["name", "category"],
          "properties": {
            "name": {"type": "string"},
            "category": {
              "type": "string",
              "enum": ["language", "framework", "library", "tool", "service", "other"]
            },
            "context": {"type": "string"},
            "timestamp": {
              "type": "string",
              "pattern": "^\\d+:\\d+$"
            }
          }
        }
      }
    }
  },
  "applies_to": ["tech", "programming"],
  "priority": 5,
  "model_preference": "claude",
  "max_tokens": 2000,
  "temperature": 0.2
}

YAML is clearly more readable and maintainable.

Edge Cases and Gotchas¶

YAML Edge Cases¶

1. Type Coercion

# Problem: These all become booleans
no: false  # Becomes False
yes: true  # Becomes True
on: true   # Becomes True
off: false # Becomes False

# Solution: Quote strings
no: "no"
yes: "yes"

2. Duplicate Keys

# Problem: Silently overwrites
name: first
name: second  # Wins, first is lost

# Solution: Use linter to detect

3. Indentation

# Problem: Wrong indentation breaks parsing
prompt: |
  Line 1
   Line 2  # Extra space breaks it

# Solution: Use consistent indentation (2 or 4 spaces)

Mitigation Strategies¶

Use yaml.safe_load() - Prevents code execution
Validate with Pydantic - Catch type errors early
YAML Linter - Pre-commit hook with yamllint
Schema Validation - Define expected structure
Good Documentation - Provide templates and examples

Recommendation¶

Primary Format: YAML¶

Rationale: 1. ✅ Best user experience - Non-developers can create templates 2. ✅ Comments support - Critical for documentation 3. ✅ Multi-line strings - Perfect for prompts 4. ✅ Wide adoption - Familiar to most developers 5. ✅ Excellent tooling - Linters, validators, editor support 6. ✅ Validated safely - Pydantic + safe_load

Mitigations: - Use yaml.safe_load() for security - Pydantic validation for correctness - YAML linting in pre-commit hooks - Clear documentation and examples - Template validation CLI command

Hybrid Approach¶

Built-in templates: Python (type-safe, version-controlled) User templates: YAML (user-friendly, editable)

# Load built-in template
from inkwell.templates.default import summary

# Load user template
loader.load_template("my-custom-template")  # Loads from ~/.config/inkwell/templates/

Implementation Plan¶

1. Template Loading¶

def load_template(path: Path) -> ExtractionTemplate:
    """Load and validate template from YAML file"""
    with open(path) as f:
        data = yaml.safe_load(f)

    # Pydantic validation
    template = ExtractionTemplate(**data)

    return template

2. Template Validation CLI¶

# Validate user template
inkwell template validate my-template.yaml

# Output:
# ✓ Template 'my-template' is valid
# - System prompt: 145 characters
# - User prompt template: 234 characters
# - Expected format: json
# - Output schema: valid

3. Template Creation Helper¶

# Create template from interactive prompt
inkwell template create tools-mentioned

# Guides user through:
# - Name, description
# - Category (optional)
# - System prompt
# - User prompt template
# - Expected format
# - Output schema (optional)

Conclusion¶

YAML is the clear winner for user-defined extraction templates: - Best balance of readability and functionality - Excellent comment and multi-line string support - Wide tooling and editor support - Safe with proper loading and validation

Implementation: 1. Use YAML for all user-editable templates 2. Validate with Pydantic models 3. Provide good documentation and examples 4. Add CLI tools for validation and creation 5. Use pre-commit hooks to catch errors early

Template Format Evaluation: YAML vs Alternatives¶

Overview¶

Requirements for Template Format¶

Functional Requirements¶

Non-Functional Requirements¶

Format Comparison¶

1. YAML (YAML Ain't Markup Language)¶

Example Template¶

Pros¶

Cons¶

Use Cases¶

Validation Example¶

2. TOML (Tom's Obvious Minimal Language)¶

Example Template¶

Pros¶

Cons¶

Use Cases¶

Validation Example¶

3. JSON (JavaScript Object Notation)¶

Example Template¶

Pros¶

Cons¶

Use Cases¶

Validation Example¶

4. Python Dataclasses / Pydantic¶

Example Template¶

Pros¶

Cons¶

Use Cases¶

Validation Example¶

Head-to-Head Comparison¶

Readability (1-10 scale)¶

User-Friendliness (1-10 scale)¶

Validation Support (1-10 scale)¶

Editor Support (1-10 scale)¶

Safety (1-10 scale)¶

Decision Matrix¶

Weighted Scoring¶

Real-World Template Examples¶

YAML Template (Recommended)¶

Comparison: Same Template in JSON¶

Edge Cases and Gotchas¶

YAML Edge Cases¶

Mitigation Strategies¶

Recommendation¶

Primary Format: YAML¶

Hybrid Approach¶

Implementation Plan¶

1. Template Loading¶

2. Template Validation CLI¶

3. Template Creation Helper¶

Conclusion¶

References¶