Retrieval & Reranking
Find the right documents and rank them accurately
🔑 Key Concepts
- Hybrid search — Combine BM25 (keyword match) + vector search (semantic match). Most vector DBs support this natively.
- Cohere Rerank — Cross-encoder that scores query-document pairs more accurately than embedding similarity alone. Often the single biggest quality upgrade to a retrieval pipeline.
- Top-K tuning — Retrieve 20-50 candidates, then rerank down to the top 3-5 for the LLM context. Passing more than about 5 chunks tends to dilute the signal.
- MMR (Maximal Marginal Relevance) — Balances relevance with diversity, which prevents retrieving 5 near-identical chunks.
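
To make the MMR idea concrete, here is a minimal sketch in plain Python. It assumes documents and the query are already embedded as lists of floats and uses cosine similarity for both relevance and redundancy. The names `cosine`, `mmr`, and `lambda_mult` are illustrative choices, not taken from any particular library.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mmr(
    query_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
    k: int = 5,
    lambda_mult: float = 0.7,  # assumed default; 1.0 = pure relevance
) -> List[int]:
    """Return indices of up to k documents picked by Maximal Marginal Relevance."""
    relevance = [cosine(query_vec, d) for d in doc_vecs]
    selected: List[int] = []
    remaining = list(range(len(doc_vecs)))

    while remaining and len(selected) < k:

        def score(i: int) -> float:
            # Redundancy = similarity to the closest already-selected document.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance[i] - (1.0 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)

    return selected


if __name__ == "__main__":
    query = [1.0, 1.0]
    docs = [[1.0, 0.9], [1.0, 0.92], [0.2, 1.0]]  # docs 0 and 1 are near-duplicates
    print(mmr(query, docs, k=2, lambda_mult=1.0))  # relevance only
    print(mmr(query, docs, k=2, lambda_mult=0.5))  # relevance traded against diversity
```

With `lambda_mult=1.0` the two near-duplicate vectors are both selected. Lowering it to 0.5 swaps the second pick for the less similar document, which is the behavior the MMR bullet describes.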
💡 Practice: Try implementing each concept yourself before moving on. Reading about RAG and building RAG are very different things.
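
As one possible starting point for that practice, the sketch below wires hybrid search and top-K tuning together. It assumes you already have two ranked lists of document IDs (one from BM25, one from vector search) and fuses them with reciprocal rank fusion (RRF), which is one common fusion method and not the only way vector DBs implement hybrid search. The constant `k=60` is a commonly used RRF default, not something specified above. `score_pair` is a hypothetical stand-in for a real cross-encoder reranker such as Cohere Rerank, and `toy_overlap_score` is a placeholder that only counts shared words.

```python
from typing import Callable, Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists of doc IDs; each doc scores the sum of 1 / (k + rank)."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def retrieve_then_rerank(
    query: str,
    docs: Dict[str, str],
    bm25_ranking: Sequence[str],
    vector_ranking: Sequence[str],
    score_pair: Callable[[str, str], float],  # hypothetical reranker hook
    candidates: int = 20,
    final_k: int = 3,
) -> List[str]:
    """Fuse two rankings, keep a wide candidate set, rerank, return the top few."""
    fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])[:candidates]
    reranked = sorted(fused, key=lambda d: score_pair(query, docs[d]), reverse=True)
    return reranked[:final_k]


def toy_overlap_score(query: str, text: str) -> float:
    """Placeholder scorer: number of distinct words shared by query and text."""
    return float(len(set(query.lower().split()) & set(text.lower().split())))


if __name__ == "__main__":
    docs = {
        "a": "cats sleep a lot",
        "b": "how to train a dog",
        "c": "dog training tips for puppies",
        "d": "stock market news",
    }
    bm25 = ["a", "b", "c"]    # assumed output of a keyword search
    vector = ["c", "a", "d"]  # assumed output of a vector search
    print(reciprocal_rank_fusion([bm25, vector]))
    print(retrieve_then_rerank("dog training tips", docs, bm25, vector,
                               toy_overlap_score, candidates=3, final_k=2))
```

Swapping `toy_overlap_score` for a call to a real reranker keeps the rest of the funnel unchanged: a wide candidate set goes in, and only a handful of chunks reach the LLM.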