Hybrid Search for RAG: BM25 + Vector Search Explained
Video 6 of 9 · 8:12
Chapters
- 0:00 The problem with pure vector search
- 0:45 How BM25 keyword search works
- 1:45 How vector semantic search works
- 2:40 Combining both with RRF
Transcript
Auto generated by YouTube. Click any timestamp to jump to that moment.
- 0:03 "What does error code E412 mean?"
- 0:07 Your vector database returns five
- 0:09 articles about error handling best
- 0:11 practices. Helpful articles. None of
- 0:14 them mention E412.
- 0:16 The actual answer is sitting right there
- 0:18 in your docs. But vector search missed
- 0:21 it because it searched by meaning, not
- 0:24 by the exact code. This is the blind
- 0:26 spot of semantic search. It understands
- 0:29 meaning but fumbles exact matches.
- 0:32 Product names, SKUs, error codes,
- 0:35 numbers. The fix is hybrid
- 0:38 search. Combine vector search with
- 0:40 keyword search and get the best of both.
- 0:44 Keyword search uses an algorithm called
- 0:49 BM25. Here is how it works. Take the query
- 0:52 error code E412.
- 0:55 BM25 breaks it into individual terms:
- 0:59 error, code, E412.
- 1:03 For each term, it asks two questions.
- 1:06 How often does this term appear in this
- 1:08 document? That is term frequency. And
- 1:12 how rare is this term across all
- 1:14 documents? That is inverse document
- 1:18 frequency. Common words like "error" appear
- 1:20 everywhere. So they get a low IDF score.
- 1:24 But E412 only appears in one document,
- 1:27 so it gets a high IDF score. BM25
- 1:32 multiplies frequency by rarity. The
- 1:35 document that literally contains E412
- 1:38 scores highest. No understanding of
- 1:41 meaning required, just exact text
- 1:44 matching. That is what makes keyword
- 1:46 search perfect for specific identifiers.
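To make the term-frequency and rarity idea concrete, here is a minimal BM25 sketch in plain Python. The three documents are invented for illustration, the `k1` and `b` values are commonly used defaults rather than anything stated in the video, and the IDF variant is one of several in use. A real system would rely on a search engine's built-in BM25 rather than this toy.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split on anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with a basic BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Rare terms get a high IDF, common terms a low one.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency, dampened and normalized by document length.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy corpus (hypothetical): only the last document contains "E412".
docs = [
    "Error handling best practices for web services",
    "How to log an error and retry safely",
    "Error code E412 means the precondition failed",
]
scores = bm25_scores("error code E412", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(docs[best])
```

The word "error" appears in all three documents, so it contributes little to any score. "E412" appears in only one, so that document wins by a wide margin.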
- 1:50 Vector search works differently. Instead
- 1:53 of matching exact words, it converts
- 1:56 text into numerical embeddings that
- 1:58 capture meaning. "How do I get a refund?"
- 2:01 and "return policy for my purchase"
- 2:03 have zero overlapping keywords, but
- 2:06 they mean the same thing. Vector search
- 2:09 knows that because both sentences map to
- 2:11 nearby points in embedding space. The
- 2:14 strength of vector search is
- 2:16 understanding intent. Synonyms,
- 2:19 even different languages
- 2:21 expressing the same concept all land
- 2:24 near each other. But ask for SKU 7829
- 2:27 and vector search treats it as a random
- 2:29 string. It has no idea that exact match
- 2:33 matters. Hybrid search runs both
- 2:36 paths in parallel. Same query goes
- 2:38 to BM25 and to the vector index at the
- 2:42 same time. Each returns its own ranked
- 2:44 list. The keyword path finds exact
- 2:47 matches. The vector path finds semantic
- 2:50 matches. Now you have two lists of
- 2:52 results. Some documents appear in both
- 2:55 lists, some only in one. The question
- 2:58 is how do you merge them into a single
- 3:00 ranked list? You need a fusion
- 3:02 algorithm. The most common one is
- 3:05 reciprocal rank fusion or RRF. Let me
- 3:08 show you exactly how it works. If you
- 3:11 want to learn how to do this yourself, I
- 3:14 run free live sessions every Friday at
- 3:17 noon Eastern. Scan the QR code on screen
- 3:21 to join. Would love to see you there.
- 3:25 Here is the RRF formula. Score equals 1
- 3:29 over K plus rank. K is a constant, usually
- 3:34 60. That is it. One line of math. Let me
- 3:38 walk through a real example. Three
- 3:40 documents: doc A, doc B, doc C. Keyword
- 3:45 search ranks them. A is first, B is
- 3:47 second, C is third. Vector search ranks
- 3:50 them differently. C is first, A is
- 3:53 second, B is third. Now apply RRF to
- 3:56 each document. Doc A keyword rank one
- 4:00 score is 1 over 60 + 1. So 1 over 61
- 4:04 is 0.0164.
- 4:07 Vector rank two score is 1 over 62 which
- 4:10 is 0.0161.
- 4:13 Add them together. Doc A total is 0.0325.
- 4:18 Doc B keyword rank two gives 1 over 62, 0.0161.
- 4:24 Vector rank three gives 1 over 63, 0.0159.
- 4:29 Total 0.0320.
- 4:32 Doc C keyword rank three gives 1 over 63,
- 4:35 0.0159.
- 4:38 Vector rank one gives 1 over 61, 0.0164.
- 4:43 Total 0.0323.
- 4:46 Final ranking by RRF score: doc A first
- 4:49 at 0.0325,
- 4:51 doc C second at 0.0323,
- 4:54 doc B third at 0.0320.
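The worked example above can be reproduced in a few lines of Python. This is a minimal sketch: the function name `rrf_fuse` is made up for illustration, and it uses K = 60 as stated in the walkthrough.

```python
def rrf_fuse(rankings, k=60):
    """Merge ranked lists of document IDs with reciprocal rank fusion."""
    scores = {}
    for ranking in rankings:
        # Ranks are 1-based: the top result contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


keyword_ranking = ["A", "B", "C"]  # keyword search: A first, B second, C third
vector_ranking = ["C", "A", "B"]   # vector search: C first, A second, B third

fused = rrf_fuse([keyword_ranking, vector_ranking])
for doc_id, score in fused:
    print(f"doc {doc_id}: {score:.4f}")
# doc A: 0.0325
# doc C: 0.0323
# doc B: 0.0320
```

Only ranks go into the formula, never the raw BM25 or similarity scores, so the two paths do not need to produce comparable numbers.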
- 4:58 Notice what happened. Doc A ranked high
- 5:00 in both lists and came out on top. Doc C
- 5:03 was first in vector but third in
- 5:05 keyword, so it landed second. The formula
- 5:09 rewards documents that rank well across
- 5:11 both methods. A document that is number
- 5:13 one in one list but missing from the
- 5:15 other will still score lower than a
- 5:17 document that ranks in the top three of
- 5:19 both. Let me show you two queries where
- 5:23 one approach fails and the other
- 5:25 succeeds. Query one, cancel subscription
- 5:29 for plan Pro-500.
- 5:32 Keyword search finds the exact document
- 5:34 because it matches Pro-500, literally.
- 5:38 Vector search returns generic
- 5:40 cancellation guides that never mention
- 5:42 Pro-500. Hybrid search returns the
- 5:46 right document because the keyword path
- 5:48 found it. Query two, how do I undo a
- 5:52 purchase? Keyword search finds
- 5:54 nothing useful because no document
- 5:57 contains the word undo. Vector search
- 6:00 understands that undo a purchase means
- 6:02 refund and finds the refund policy.
- 6:06 Hybrid search returns the refund policy
- 6:09 because the vector path understood the
- 6:11 intent. Two queries, two different
- 6:14 failure modes. Hybrid search handles
- 6:16 both because it always has two shots at
- 6:19 finding the right answer.
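The two-query scenario can be sketched end to end with stubbed retrievers. Everything here is hypothetical: the canned result lists stand in for a real BM25 index and a real vector index, the document IDs are invented, and `hybrid_search` is just an illustrative name. The point is only the shape of the pipeline: run both paths at the same time, then fuse with RRF.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical canned results standing in for real indexes.
KEYWORD_RESULTS = {
    "cancel subscription for plan Pro-500": ["pro500_cancellation"],
    "how do I undo a purchase": [],  # no document contains "undo"
}
VECTOR_RESULTS = {
    "cancel subscription for plan Pro-500": ["generic_cancel_guide", "billing_faq"],
    "how do I undo a purchase": ["refund_policy", "billing_faq"],
}


def keyword_search(query):
    return KEYWORD_RESULTS.get(query, [])


def vector_search(query):
    return VECTOR_RESULTS.get(query, [])


def rrf_fuse(rankings, k=60):
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def hybrid_search(query, k=60):
    # Send the same query down both paths at the same time.
    with ThreadPoolExecutor(max_workers=2) as pool:
        keyword_future = pool.submit(keyword_search, query)
        vector_future = pool.submit(vector_search, query)
        rankings = [keyword_future.result(), vector_future.result()]
    return [doc_id for doc_id, _ in rrf_fuse(rankings, k)]


print(hybrid_search("cancel subscription for plan Pro-500"))
print(hybrid_search("how do I undo a purchase"))
```

In this toy setup the Pro-500 document is found only by the keyword path and the refund policy only by the vector path, yet both reach the fused list. Note that a document found by a single path ties with the other path's top hit under plain RRF, which is one reason weighting the two paths can matter.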
- 6:22 You do not have to give both methods
- 6:24 equal weight. Most hybrid search systems
- 6:27 let you set an alpha parameter. Alpha 0.5
- 6:31 means equal weight. Keyword and vector
- 6:33 each contribute half. Shift alpha toward
- 6:36 1.0 for more vector weight. This works
- 6:39 well when your users ask natural
- 6:41 language questions and rarely use
- 6:43 specific identifiers. Shift alpha
- 6:46 toward 0.0 for more keyword weight.
- 6:49 This is better when your corpus has lots
- 6:51 of codes, IDs, and technical terms that
- 6:54 must match exactly. In practice, start
- 6:57 at 0.5 and test with real queries. Track
- 7:01 which queries return bad results and
- 7:03 adjust. Most teams end up between 0.3
- 7:07 and 0.7 depending on their data. Here is
- 7:11 the full picture. Keyword search matches
- 7:14 exact terms. Vector search matches
- 7:17 meaning. Hybrid search runs both in
- 7:20 parallel and merges the results with
- 7:22 RRF. You get coverage for exact
- 7:25 matches and semantic intent in a
- 7:28 single query. Start with equal weights,
- 7:31 tune based on your data and ship it.
- 7:34 Most major vector databases support
- 7:36 hybrid search out of the box. Pinecone,
- 7:39 Qdrant, Elasticsearch. Turn it on
- 7:44 and your retrieval gets better.
- 7:46 That's the full picture. If
- 7:49 you want to go deeper, join my free live
- 7:52 session this Friday at noon Eastern on
- 7:54 I walk through this hands-on,
- 7:57 answer questions, and show you how to
- 7:59 build it yourself. Scan the QR code to
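The alpha weighting described earlier can be sketched as a weighted blend of the two score sets. This is an assumption about how alpha is applied: the video does not give a formula, and real systems differ in how they normalize and combine scores. Here each path's scores are min-max normalized to the 0 to 1 range, then mixed so that alpha 1.0 is pure vector and alpha 0.0 is pure keyword. The function names, document IDs, and scores are all invented for illustration.

```python
def min_max(scores):
    """Rescale a {doc_id: score} mapping to the 0..1 range."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def weighted_hybrid(keyword_scores, vector_scores, alpha=0.5):
    """Blend scores: alpha weights the vector path, 1 - alpha the keyword path."""
    kw = min_max(keyword_scores) if keyword_scores else {}
    vec = min_max(vector_scores) if vector_scores else {}
    fused = {}
    for doc_id in set(kw) | set(vec):
        # A document missing from one path contributes 0 for that path.
        fused[doc_id] = alpha * vec.get(doc_id, 0.0) + (1 - alpha) * kw.get(doc_id, 0.0)
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Hypothetical raw scores for the query "error code E412".
keyword_scores = {"error_e412": 7.2, "error_guide": 1.1}
vector_scores = {"error_guide": 0.82, "retry_howto": 0.80, "error_e412": 0.55}

keyword_heavy = weighted_hybrid(keyword_scores, vector_scores, alpha=0.2)
vector_heavy = weighted_hybrid(keyword_scores, vector_scores, alpha=0.8)
print(keyword_heavy[0][0])  # error_e412
print(vector_heavy[0][0])   # error_guide
```

With the same inputs, shifting alpha from 0.2 to 0.8 flips the top result from the exact-match document to the semantically similar one, which is why testing alpha against real queries is worth the effort.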