Retrieval & Reranking

Find the right documents and rank them accurately

🔑 Key Concepts

Hybrid search — Combine BM25 (keyword match) with vector search (semantic match), then merge the two result lists into one ranking. Many vector DBs support this natively.
Cohere Rerank — Cross-encoder that reads the query and each document together and scores the pair, which is usually more accurate than comparing separately computed embeddings. Often the single biggest quality upgrade.
Top-K tuning — Retrieve 20-50 candidates, then rerank down to the top 3-5 for the LLM context. Passing more than about 5 chunks tends to dilute the signal.
MMR (Maximal Marginal Relevance) — Balances relevance with diversity by penalizing candidates that are too similar to chunks already selected. Prevents retrieving 5 near-identical chunks.
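
To make two of the concepts above concrete, here is a minimal sketch in plain Python. It assumes reciprocal rank fusion (RRF) as the way to merge the BM25 and vector result lists, which is one common choice and not the only one, and it implements MMR over cosine similarity. The function names, the `k=60` constant, and the `lambda_mult` parameter are illustrative assumptions, not any particular library's API. The cross-encoder rerank step is left out because it depends on an external service.

```python
import math


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of doc ids into one list, best first.

    Assumption: each doc earns 1 / (k + rank) per list it appears in;
    k=60 is a commonly used default, not a tuned value.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by fused score (descending), then by id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr(query_vec, doc_vecs, top_k=3, lambda_mult=0.7):
    """Greedy Maximal Marginal Relevance over a dict of {doc_id: vector}.

    lambda_mult=1.0 ranks by relevance only; lower values penalize
    candidates that resemble documents already selected.
    """
    selected = []
    remaining = list(doc_vecs)
    while remaining and len(selected) < top_k:

        def score(doc_id):
            relevance = cosine(query_vec, doc_vecs[doc_id])
            redundancy = max(
                (cosine(doc_vecs[doc_id], doc_vecs[s]) for s in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1.0 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

In a full pipeline, the fused list would typically be cut to the 20-50 candidates mentioned above, passed to the reranker, and only then trimmed to the final 3-5 chunks. MMR can be applied before or after reranking, depending on where near-duplicates hurt most.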

💡 Practice: Try implementing each concept yourself before moving on. Reading about RAG and building RAG are very different things.