Retrieval starts with how you break documents into pieces (chunking) and how you represent those pieces numerically (embeddings). These decisions determine what your RAG system can and can't find.
Documents need to be split into chunks that are small enough for retrieval to stay precise but large enough to carry useful context.
- By structure: Split at H2 or H3 boundaries so each section becomes a chunk. Works great for structured documents like your meeting notes (which have headers for topics, action items, etc.).
- Fixed-size: Split every N tokens with M tokens of overlap. Simple, and works for any document, but you'll cut through sentences and ideas.
- Semantic: Use the LLM or embeddings to identify topic boundaries. More accurate, but slower and more expensive.
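The fixed-size strategy above is easy to sketch. This is a minimal illustration that treats whitespace-separated words as "tokens" (a real pipeline would use the embedding model's own tokenizer); the function name and defaults are my own:

```python
def chunk_fixed(tokens, size=300, overlap=50):
    """Split a token sequence into windows of `size` tokens,
    each sharing `overlap` tokens with the previous window."""
    step = size - overlap
    return [tokens[i:i + size]
            for i in range(0, max(len(tokens) - overlap, 1), step)]

# Whitespace words stand in for real tokens here.
words = "one two three four five six seven eight nine ten".split()
for chunk in chunk_fixed(words, size=4, overlap=1):
    print(" ".join(chunk))
# Note how the boundary words appear in two adjacent chunks:
# that overlap is what keeps a sentence cut at a boundary retrievable.
```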
For your Obsidian vault:
Your meeting notes already have structure — YAML frontmatter, headers, sections. Chunking by section (H2) is the obvious choice. Each chunk gets metadata from the frontmatter (date, project, attendees).
```
Meeting: "2025-12-12 Pricing strategy Comfama"

Chunk 1: { section: "Discussion",   content: "...", date: "2025-12-12", project: "Comfama" }
Chunk 2: { section: "Action Items", content: "...", date: "2025-12-12", project: "Comfama" }
Chunk 3: { section: "Decisions",    content: "...", date: "2025-12-12", project: "Comfama" }
```
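A section-based chunker for notes like this can be sketched in a few lines. The frontmatter parsing here is deliberately naive (simple `key: value` lines only; real code would use PyYAML or python-frontmatter), and the function name is mine:

```python
import re

def chunk_by_section(note: str):
    """Split a Markdown note into one chunk per H2 section,
    attaching key: value pairs from the YAML frontmatter as metadata."""
    meta = {}
    body = note
    if note.startswith("---"):
        # Naive frontmatter parse: assumes flat `key: value` lines.
        fm, body = note.split("---", 2)[1:3]
        for line in fm.strip().splitlines():
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    chunks = []
    for part in re.split(r"^## ", body, flags=re.M)[1:]:
        header, _, content = part.partition("\n")
        chunks.append({"section": header.strip(),
                       "content": content.strip(),
                       **meta})
    return chunks

note = (
    "---\ndate: 2025-12-12\nproject: Comfama\n---\n"
    "## Discussion\nWe talked pricing.\n"
    "## Action Items\n- Follow up\n"
)
chunk_by_section(note)
```

Every chunk inherits the note's metadata, which is what later lets you filter retrieval by project or date instead of relying on vector similarity alone.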
Embeddings convert text into vectors — arrays of numbers that capture semantic meaning. Similar texts have similar vectors.
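To make "similar vectors" concrete: the standard comparison is cosine similarity, the cosine of the angle between two vectors. A toy sketch with made-up 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Invented toy vectors, just to show the shape of the comparison.
pricing = [0.9, 0.1, 0.2]
cost    = [0.8, 0.2, 0.3]
weather = [0.1, 0.9, 0.4]

cosine_similarity(pricing, cost)     # high: related topics
cosine_similarity(pricing, weather)  # much lower: unrelated
```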
Popular embedding models:
- text-embedding-3-small (OpenAI): Cheap, good enough for most cases, 1536 dimensions
- text-embedding-3-large (OpenAI): Better quality, 3072 dimensions, 2x the cost
- voyage-3 (Voyage AI): Strong on code and technical content
- nomic-embed-text, bge-small: Free, run on your machine

Vector stores:
For your stack, Supabase pgvector is the obvious choice — no new service, you already have the client set up.
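The pgvector side boils down to two SQL statements, shown here as strings you'd run in the Supabase SQL editor. The table and column names are illustrative; `vector(1536)` matches text-embedding-3-small, and `<=>` is pgvector's cosine-distance operator (smaller means more similar):

```python
# Illustrative schema for storing meeting-note chunks with their metadata.
SETUP_SQL = """
create extension if not exists vector;

create table chunks (
  id           bigserial primary key,
  section      text,
  project      text,
  meeting_date date,
  content      text,
  embedding    vector(1536)
);
"""

# Nearest-neighbor query: order by cosine distance to the query embedding.
QUERY_SQL = """
select content, section, project
from chunks
order by embedding <=> %(query_embedding)s
limit 5;
"""
```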
| Small chunks (100-200 tokens) | Large chunks (500-1000 tokens) |
|---|---|
| More precise retrieval | More context per chunk |
| Risk losing surrounding context | Risk retrieving irrelevant text |
| More chunks to search | Fewer chunks, faster search |
| Better for specific facts | Better for narrative/reasoning |
For meeting notes: 300-500 tokens per chunk is a good starting point. Each section of a meeting typically falls in this range naturally.
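To sanity-check whether your sections actually land in that range, a rough rule of thumb of about four characters per token for English is enough (the exact count depends on the tokenizer); both helper names here are mine:

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate (~4 characters per token in English);
    use the embedding model's tokenizer when you need exact counts."""
    return max(1, len(text) // 4)

def flag_outliers(chunks, low=300, high=500):
    """Return chunks whose estimated token count falls outside [low, high],
    i.e. candidates for merging (too small) or further splitting (too big)."""
    return [c for c in chunks
            if not low <= estimate_tokens(c["content"]) <= high]
```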
Time to consolidate what you learned.