Learn RAG from Scratch

Hybrid Search for RAG: BM25 + Vector Search Explained

Video 6 of 9 · 8:12

Chapters

  • 0:00 The problem with pure vector search
  • 0:45 How BM25 keyword search works
  • 1:45 How vector semantic search works
  • 2:40 Combining both with RRF

Transcript

Auto-generated by YouTube.

  1. 0:03 "What does error code E412 mean?
  2. 0:07 Vector database returns five
  3. 0:09 about error handling best
  4. 0:11 practices. Helpful articles." None of
  5. 0:14 them mention E412.
  6. 0:16 The actual answer is sitting right there
  7. 0:18 in your docs. But vector search missed
  8. 0:21 it because it searched by meaning, not
  9. 0:24 the exact code. This is the blind
  10. 0:26 spot of semantic search. It understands
  11. 0:29 but fumbles exact matches.
  12. 0:32 names, SKUs, error codes,
  13. 0:35 numbers. The fix is hybrid
  14. 0:38 search. Combine vector search with
  15. 0:40 keyword search and get the best of both.
  16. 0:44 Keyword search uses an algorithm called
  17. 0:49 BM25. Here is how it works. Take the query
  18. 0:52 code E412.
  19. 0:55 breaks it into individual terms.
  20. 0:59 code E412.
  21. 1:03 For each term, it asks two questions.
  22. 1:06 How often does this term appear in this
  23. 1:08 document? That is term frequency. And
  24. 1:12 how rare is this term across all
  25. 1:14 documents? That is inverse document
  26. 1:18 frequency. Common words like error appear
  27. 1:20 So they get a low IDF score.
  28. 1:24 E412 only appears in one document,
  29. 1:27 so it gets a high IDF score. BM25
  30. 1:32 frequency by rarity. The
  31. 1:35 document that literally contains E412
  32. 1:38 highest. No understanding of
  33. 1:41 required, just exact text
  34. 1:44 That is what makes keyword
  35. 1:46 search perfect for specific identifiers.
  36. 1:50 search works differently. Instead
  37. 1:53 of matching exact words, it converts
  38. 1:56 into numerical embeddings that
  39. 1:58 meaning. How do I get a refund
  40. 2:01 return policy for my purchase
  41. 2:03 zero overlapping keywords, but
  42. 2:06 they mean the same thing. Vector search
  43. 2:09 that because both sentences map to
  44. 2:11 points in embedding space. The
  45. 2:14 of vector search is
  46. 2:16 intent. Synonyms,
  47. 2:19 even different languages
  48. 2:21 the same concept all land
  49. 2:24 near each other. But ask for SKU 7829
  50. 2:27 and vector search treats it as a random
  51. 2:29 It has no idea that exact match
  52. 2:33 Hybrid search runs both
  53. 2:36 in parallel. Same query goes
  54. 2:38 to BM25 and to the vector index at the
  55. 2:42 same time. Each returns its own ranked
  56. 2:44 list. The keyword path finds exact
  57. 2:47 matches. The vector path finds semantic
  58. 2:50 matches. Now you have two lists of
  59. 2:52 Some documents appear in both
  60. 2:55 some only in one. The question
  61. 2:58 is how do you merge them into a single
  62. 3:00 list? You need a fusion
  63. 3:02 The most common one is
  64. 3:05 reciprocal rank fusion, or RRF. Let me
  65. 3:08 show you exactly how it works. If you
  66. 3:11 want to learn how to do this yourself, I
  67. 3:14 free live sessions every Friday at
  68. 3:17 noon Eastern. Scan the QR code on screen
  69. 3:21 to join. Would love to see you there.
  70. 3:25 Here is the RRF formula. Score equals 1
  71. 3:29 over K plus rank. K is a constant, usually
  72. 3:34 60. That is it. One line of math. Let me
  73. 3:38 walk through a real example. Three
  74. 3:40 documents: doc A, doc B, doc C. Keyword
  75. 3:45 search ranks them. A is first, B is
  76. 3:47 second, C is third. Vector search ranks
  77. 3:50 them differently. C is first, A is
  78. 3:53 second, B is third. Now apply RRF to
  79. 3:56 each document. Doc A keyword rank one
  80. 4:00 score is 1 over 60 + 1. So 1 over 61
  81. 4:04 which is 0.0164.
  82. 4:07 Vector rank two score is 1 over 62 which
  83. 4:10 is 0.0161.
  84. 4:13 Add them together. Doc A total is
  85. 4:18 0.0325. Doc B keyword rank two gives 1 over 62
  86. 4:24 Vector rank three gives 1 over 63
  87. 4:29 0.0320.
  88. 4:32 Doc C keyword rank 3 gives 1 over 63
  89. 4:35 0.0159
  90. 4:38 Vector rank one gives 1 over 61 0.0164
  91. 4:43 0.0323
  92. 4:46 ranking by RRF score doc A first
  93. 4:49 at 0.0325
  94. 4:51 doc C second at 0.0323
  95. 4:54 doc B third at 0.0320.
  96. 4:58 what happened. Doc A ranked high
  97. 5:00 in both lists and came out on top. Doc C
  98. 5:03 first in vector but third in
  99. 5:05 keyword, so it landed second. The formula
  100. 5:09 documents that rank well across
  101. 5:11 methods. A document that is number
  102. 5:13 one in one list but missing from the
  103. 5:15 other will still score lower than a
  104. 5:17 document that ranks in the top three of
  105. 5:19 both. Let me show you two queries where
  106. 5:23 one approach fails and the other
  107. 5:25 Query one, cancel subscription
  108. 5:29 plan Pro-500.
  109. 5:32 Keyword search finds the exact document
  110. 5:34 because it matches Pro-500, literally.
  111. 5:38 Vector search returns generic
  112. 5:40 guides that never mention
  113. 5:42 Pro-500. Hybrid search returns the
  114. 5:46 document because the keyword path
  115. 5:48 it. Query two, how do I undo a
  116. 5:52 purchase? Keyword search finds
  117. 5:54 nothing useful because no document
  118. 5:57 the word undo. Vector search
  119. 6:00 that undo a purchase means
  120. 6:02 and finds the refund policy.
  121. 6:06 Hybrid search returns the refund policy
  122. 6:09 because the vector path understood the
  123. 6:11 Two queries, two different
  124. 6:14 failure modes. Hybrid search handles
  125. 6:16 both because it always has two shots at
  126. 6:19 the right answer.
  127. 6:22 You do not have to give both methods
  128. 6:24 equal weight. Most hybrid search systems
  129. 6:27 let you set an alpha parameter. Alpha 0.5
  130. 6:31 equal weight. Keyword and vector
  131. 6:33 contribute half. Shift alpha toward
  132. 6:36 1.0 for more vector weight. This works
  133. 6:39 when your users ask natural
  134. 6:41 language questions and rarely use
  135. 6:43 identifiers. Shift alpha
  136. 6:46 toward 0.0 for more keyword weight.
  137. 6:49 This is better when your corpus has lots
  138. 6:51 of codes, IDs, and technical terms that
  139. 6:54 match exactly. In practice, start
  140. 6:57 0.5 and test with real queries. Track
  141. 7:01 which queries return bad results and
  142. 7:03 Most teams end up between 0.3
  143. 7:07 and 0.7 depending on their data. Here is
  144. 7:11 the full picture. Keyword search matches
  145. 7:14 terms. Vector search matches
  146. 7:17 Hybrid search runs both in
  147. 7:20 parallel and merges the results with
  148. 7:22 RRF. You get coverage for exact
  149. 7:25 and semantic intent in a
  150. 7:28 query. Start with equal weights,
  151. 7:31 based on your data and ship it.
  152. 7:34 Every major vector database supports
  153. 7:36 hybrid search out of the box. Pinecone,
  154. 7:39 Qdrant, Elasticsearch. Turn it on
  155. 7:44 and your retrieval gets better
  156. 7:46 That's the full picture. If
  157. 7:49 you want to go deeper, join my free live
  158. 7:52 session this Friday at noon Eastern on
  159. 7:54 I walk through this hands-on,
  160. 7:57 questions, and show you how to
  161. 7:59 it yourself. Scan the QR code to
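
Code sketch: BM25 scoring (0:45 chapter)

The keyword-search chapter describes BM25 as weighing how often a term appears in a document against how rare it is across all documents. The sketch below is one common textbook form of that idea, not necessarily what any particular search engine runs. The parameters `k1` and `b`, the smoothed IDF formula, the whitespace tokenizer, and the three sample documents are all assumptions added for illustration; none of them come from the video.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with a basic BM25 formula.

    k1 and b are conventional defaults, assumed here rather than taken
    from the video.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            # Rare terms get a high IDF, common terms a low one.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Term frequency, dampened by k1 and normalized by document length.
            tf = term_freq[term]
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        scores.append(score)
    return scores


# Hypothetical corpus: only the last document contains the exact code.
docs = [
    "error handling best practices for services",
    "how to log an error and retry",
    "error code e412 means the payment token expired",
]
scores = bm25_scores("error code E412", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(best, [round(s, 3) for s in scores])
```

In this toy corpus the word "error" appears in every document, so it contributes little, while "e412" appears in only one and dominates that document's score.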

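Code sketch: reciprocal rank fusion (2:40 chapter)

The worked example at 3:38 can be reproduced in a few lines. This is a minimal sketch that assumes 1-based ranks and K = 60, as in the transcript; the function name `rrf_fuse` and the list-of-lists input shape are illustrative choices, not an API from any specific database.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists: each document earns 1 / (k + rank) per list it appears in."""
    scores = defaultdict(float)
    for ranked_docs in rankings:
        # Ranks start at 1, matching the example in the transcript.
        for rank, doc_id in enumerate(ranked_docs, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


keyword_ranking = ["A", "B", "C"]  # keyword search: A first, B second, C third
vector_ranking = ["C", "A", "B"]   # vector search: C first, A second, B third

fused = rrf_fuse([keyword_ranking, vector_ranking])
for doc_id, score in fused:
    print(f"doc {doc_id}: {score:.4f}")
# doc A: 0.0325
# doc C: 0.0323
# doc B: 0.0320
```

A document that appears in only one list simply collects one term instead of two, which is why a single first-place finish (1/61, about 0.0164) scores below a document ranked third in both lists (2/63, about 0.0317).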
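
Code sketch: alpha weighting (6:22 onward)

The transcript describes alpha as a dial between keyword weight (toward 0.0) and vector weight (toward 1.0). The video does not say how a given database applies alpha internally, so the sketch below assumes one plausible scheme: min-max normalize each method's raw scores, then take a weighted sum. The function names, document IDs, and score values are all hypothetical.

```python
def min_max_normalize(scores):
    """Rescale raw scores to the 0..1 range so the two methods are comparable."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - low) / (high - low) for doc_id, s in scores.items()}


def hybrid_scores(keyword_scores, vector_scores, alpha=0.5):
    """Blend the two score sets: alpha=1.0 is pure vector, alpha=0.0 is pure keyword."""
    kw = min_max_normalize(keyword_scores)
    vec = min_max_normalize(vector_scores)
    # A document missing from one method contributes 0 from that method.
    return {
        doc_id: alpha * vec.get(doc_id, 0.0) + (1 - alpha) * kw.get(doc_id, 0.0)
        for doc_id in set(kw) | set(vec)
    }


def top_doc(scores):
    return max(scores, key=scores.get)


# Hypothetical raw scores for a query like "cancel subscription plan Pro-500".
keyword_scores = {"pro500_doc": 7.2, "cancel_guide": 1.1}
vector_scores = {"cancel_guide": 0.83, "pro500_doc": 0.74, "billing_faq": 0.65}

balanced = hybrid_scores(keyword_scores, vector_scores, alpha=0.5)
vector_only = hybrid_scores(keyword_scores, vector_scores, alpha=1.0)
keyword_only = hybrid_scores(keyword_scores, vector_scores, alpha=0.0)
print(top_doc(balanced), top_doc(vector_only), top_doc(keyword_only))
```

With these made-up numbers, the balanced and keyword-only settings surface the document that literally mentions the plan, while the vector-only setting prefers the generic cancellation guide, which is the trade-off the transcript suggests testing against real queries.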