Introduction
In our previous articles, we covered basic RAG, retrieve-and-rerank, and validation techniques. This article introduces Hybrid RAG, which combines vector search with keyword-based search to improve retrieval accuracy and robustness.
To follow along, you should be familiar with the concepts covered in previous articles. The code changes focus on enhancing the retrieval pipeline by implementing hybrid search while maintaining the same validation framework.
All relevant code changes will be contained in a single commit in the GitHub repository.
Series Overview
This is part of the RAG Cookbook series:
- Introduction to RAG
- Retrieve and rerank RAG
- RAG validation (RAGProbe)
- Hybrid RAG (This Article)
- Graph RAG
- Multi-modal RAG
- Agentic RAG (Router)
- Agentic RAG (Multi-agent)
Hybrid RAG Explained
Hybrid RAG combines dense retrieval (vector search) with sparse retrieval (keyword/BM25 search) to leverage the strengths of both approaches. This combination provides:
- Semantic Understanding: Vector search captures conceptual relationships
- Exact Matching: Keyword search catches specific terms and phrases
- Improved Robustness: Multiple retrieval methods reduce single-point failures
Why Hybrid Search
Single-method retrieval faces several limitations:
- Vector Search Limitations:
  - May miss exact keyword matches
  - Semantic drift in edge cases
  - Computationally intensive
- Keyword Search Limitations:
  - Misses semantic relationships
  - Sensitive to vocabulary mismatch
  - Limited understanding of context
Hybrid search addresses these issues by combining both approaches.
Search Components
The hybrid approach consists of three main components:
- Dense Retrieval:
  - Uses vector embeddings
  - Captures semantic relationships
  - Handles conceptual queries
- Sparse Retrieval:
  - Implements keyword matching
  - Catches exact matches
  - Handles specific terms
- Result Fusion:
  - Combines both result sets
  - Deduplicates matches
  - Reranks final results
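To make the sparse side of the components above more concrete, here is a minimal in-memory BM25 scorer. It is an illustrative sketch, not the implementation used in this series: the `Doc` type, the `tokenize` helper, and the default parameters `k1 = 1.2` and `b = 0.75` are assumptions chosen for the example.

```typescript
// Illustrative BM25 scorer over a small in-memory corpus (hypothetical helper names).
type Doc = { id: string; text: string };

// Lowercase and split on anything that is not a letter or digit.
const tokenize = (s: string): string[] =>
  s.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);

function bm25Scores(query: string, docs: Doc[], k1 = 1.2, b = 0.75): Map<string, number> {
  const tokenized = docs.map((d) => tokenize(d.text));
  const avgLen =
    tokenized.reduce((sum, t) => sum + t.length, 0) / Math.max(docs.length, 1);

  // Document frequency: how many documents contain each term at least once.
  const df = new Map<string, number>();
  tokenized.forEach((tokens) => {
    new Set(tokens).forEach((term) => df.set(term, (df.get(term) ?? 0) + 1));
  });

  const queryTerms = Array.from(new Set(tokenize(query)));
  const scores = new Map<string, number>();
  docs.forEach((doc, i) => {
    const tokens = tokenized[i];
    let score = 0;
    for (const term of queryTerms) {
      const tf = tokens.filter((t) => t === term).length;
      if (tf === 0) continue; // term absent: contributes nothing
      const n = df.get(term) ?? 0;
      // Rare terms get a higher inverse document frequency.
      const idf = Math.log(1 + (docs.length - n + 0.5) / (n + 0.5));
      // Term frequency saturates via k1 and is normalized by document length via b.
      score += (idf * tf * (k1 + 1)) / (tf + k1 * (1 - b + b * (tokens.length / avgLen)));
    }
    scores.set(doc.id, score);
  });
  return scores;
}
```

A document that contains a rare query term, such as an error code or product name, receives a positive score, while documents without any query term score zero. This is the exact-match behavior that dense retrieval can miss.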
Implementation
Our implementation extends the existing RAG system with hybrid search capabilities:
1. Vector Search
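// Dense pass: return the 15 records whose embeddings are closest to the query embedding.
const vectorResults = await index.query({
  vector: queryEmbedding,
  topK: 15,
  includeMetadata: true, // metadata is needed later to read the stored text
});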
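For intuition about what the vector query above does, the following sketch ranks records by cosine similarity with a brute-force scan. The metric is an assumption (the index could equally be configured for dot product or Euclidean distance), and `VectorRecord` and `topKBySimilarity` are hypothetical names; a real vector index uses approximate nearest-neighbor search rather than scanning every record.

```typescript
// Brute-force top-K by cosine similarity (illustrative only).
type VectorRecord = { id: string; values: number[] };

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same dimension");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Cosine similarity is undefined for a zero vector; return 0 by convention here.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function topKBySimilarity(
  query: number[],
  records: VectorRecord[],
  k: number,
): { id: string; score: number }[] {
  return records
    .map((r) => ({ id: r.id, score: cosineSimilarity(query, r.values) }))
    .sort((a, b) => b.score - a.score) // highest similarity first
    .slice(0, k);
}
```

Note the zero-vector branch: a query vector of all zeros has no direction, so every record is equally (un)related to it under this metric.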
2. Keyword Search
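// Keyword pass: the zero vector carries no semantic signal, so the metadata
// filter alone decides which records come back; their scores are not meaningful.
// Caveat: this assumes the vector store supports a `$contains` substring filter.
// If `index` is a Pinecone index (the call shape suggests it, but that is an
// assumption), `$contains` is not among its documented filter operators.
const keywordResults = await index.query({
  vector: new Array(1536).fill(0), // length must equal the index dimension
  topK: 15,
  includeMetadata: true,
  filter: {
    // Matches the whole lowercased query as one substring, not individual terms.
    text: { $contains: query.toLowerCase() },
  },
});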
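The filter above tests the entire query string as a single substring, so a query whose words appear in the text in a different order will not match. The sketch below contrasts that behavior with term-level matching applied client-side to already retrieved matches. The `Match` shape (with the text stored under `metadata.text`), the helper names, and the `minOverlap` threshold are assumptions for illustration.

```typescript
// Client-side keyword matching over retrieved matches (illustrative sketch).
type Match = { id: string; score?: number; metadata?: { text?: string } };

const toTerms = (s: string): string[] =>
  s.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);

// Whole-query substring test, comparable to a "contains" filter on the full query.
function containsWholeQuery(text: string, query: string): boolean {
  return text.toLowerCase().includes(query.toLowerCase());
}

// Fraction of distinct query terms that occur in the text, regardless of order.
function termOverlap(text: string, query: string): number {
  const textTerms = new Set(toTerms(text));
  const queryTerms = Array.from(new Set(toTerms(query)));
  if (queryTerms.length === 0) return 0;
  const hits = queryTerms.filter((t) => textTerms.has(t)).length;
  return hits / queryTerms.length;
}

// Keep matches whose text covers at least `minOverlap` of the query terms.
function keywordFilter(matches: Match[], query: string, minOverlap = 0.5): Match[] {
  return matches.filter(
    (m) => termOverlap(m.metadata?.text ?? "", query) >= minOverlap,
  );
}
```

For example, the text "The index supports hybrid search" fails the whole-query test for "hybrid search index" but covers all three query terms. Whether strict phrase matching or term overlap is preferable depends on the queries your system receives.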
3. Result Combination
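// Merge both result lists and deduplicate by record id.
// For an id found in both lists, the Map keeps the later (keyword) entry,
// so its score comes from the keyword query rather than the vector query.
const allMatches = [...vectorResults.matches, ...keywordResults.matches];
const uniqueMatches = Array.from(
  new Map(allMatches.map((match) => [match.id, match])).values(),
);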
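The Map-based merge above deduplicates but does not reward a record for appearing in both result lists. One common alternative is reciprocal rank fusion (RRF), sketched below. This is a swapped-in technique for illustration, not the method used in the repository; the function name and the constant `k = 60` (a conventional default) are assumptions.

```typescript
// Reciprocal rank fusion: score(id) = sum over lists of 1 / (k + rank), with rank starting at 1.
type Ranked = { id: string };

function reciprocalRankFusion<T extends Ranked>(
  lists: T[][],
  k = 60,
): { item: T; score: number }[] {
  const fused = new Map<string, { item: T; score: number }>();
  for (const list of lists) {
    list.forEach((item, index) => {
      const contribution = 1 / (k + index + 1);
      const existing = fused.get(item.id);
      if (existing) {
        // Seen in an earlier list: keep the first item, accumulate the score.
        existing.score += contribution;
      } else {
        fused.set(item.id, { item, score: contribution });
      }
    });
  }
  // Highest fused score first.
  return Array.from(fused.values()).sort((a, b) => b.score - a.score);
}
```

Because RRF uses only rank positions, it sidesteps the problem that scores from the two queries are not on a comparable scale. Passing the vector list first means that, for a duplicate id, the vector match (with its similarity score and metadata) is the one retained.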
4. System Architecture
The implementation follows these key principles:
- Parallel Processing:
  - Concurrent vector and keyword searches
  - Efficient resource utilization
  - Optimized response times
- Result Fusion:
  - Smart deduplication
  - Score normalization
  - Weighted combination
- Quality Control:
  - Relevance scoring
  - Result diversity
  - Context optimization
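The parallel processing and result fusion principles above can be sketched as follows. This is a minimal example under stated assumptions, not the repository's code: the `Scored` type, min-max normalization as the normalization method, the `alpha` weight, and the injected `denseSearch` and `sparseSearch` functions are all hypothetical.

```typescript
// Concurrent retrieval plus normalized, weighted score fusion (illustrative sketch).
type Scored = { id: string; score: number };

// Min-max normalize scores to [0, 1]; a list with identical scores maps to 1.
function normalize(results: Scored[]): Scored[] {
  if (results.length === 0) return [];
  const scores = results.map((r) => r.score);
  const min = Math.min(...scores);
  const max = Math.max(...scores);
  return results.map((r) => ({
    id: r.id,
    score: max === min ? 1 : (r.score - min) / (max - min),
  }));
}

// alpha weights the dense score; (1 - alpha) weights the sparse score.
function weightedFusion(dense: Scored[], sparse: Scored[], alpha = 0.5): Scored[] {
  const combined = new Map<string, number>();
  for (const r of normalize(dense)) {
    combined.set(r.id, (combined.get(r.id) ?? 0) + alpha * r.score);
  }
  for (const r of normalize(sparse)) {
    combined.set(r.id, (combined.get(r.id) ?? 0) + (1 - alpha) * r.score);
  }
  return Array.from(combined.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Start both searches before awaiting either, so they run concurrently.
async function hybridSearch(
  denseSearch: () => Promise<Scored[]>,
  sparseSearch: () => Promise<Scored[]>,
  alpha = 0.5,
): Promise<Scored[]> {
  const [dense, sparse] = await Promise.all([denseSearch(), sparseSearch()]);
  return weightedFusion(dense, sparse, alpha);
}
```

With `Promise.all`, total latency is roughly that of the slower search rather than the sum of both. One trade-off of min-max normalization is that the lowest-scoring hit in each list contributes zero to the combined score, so `alpha` and the normalization method are worth tuning against your validation set.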
Conclusions
Hybrid RAG can improve retrieval quality by combining the strengths of vector and keyword search: semantic matching for conceptual queries and exact matching for specific terms. While this adds some complexity (a second query path plus a fusion step), the gains in robustness and accuracy can make it a valuable enhancement for production systems.
The next article will explore Graph RAG, which adds relationship-aware retrieval to our system.