Learn RAG from Scratch

RAG Multi-Query, HyDE & Fusion (Complete Guide)

Video 8 of 9 · 6:31

Chapters

  • 0:00 The problem with vague questions
  • 0:50 Multi-query: the hero technique
  • 2:00 Multi-query before and after
  • 2:50 RAG-Fusion, decomposition, step-back, HyDE

Transcript

Auto-generated by YouTube. Click any timestamp to jump to that moment.

0:03 You wire up the search. Then a user types, "How do I set up OAuth?" and the system returns the billing FAQ. Not because your embeddings are bad, but because the question is too vague. OAuth could mean OAuth tokens, session tokens, API keys, or SSO. Your vector database doesn't know which one the user means, so it guesses, and it guesses wrong. This is the most common failure in RAG: the user's question doesn't match how your documents describe the same concept. Query translation fixes this. Instead of searching with the vague question directly, you transform it first into queries that actually match your stored documents.
0:47 Multi-query is the single most useful query translation technique. Here is how it works. A user asks, "How do I set up OAuth?" The LLM takes that vague question and rewrites it into three specific versions. Version one: what authentication protocols does the API support, such as OAuth 2 or API keys? Version two: how do users log in and receive session tokens? Version three: what is the step-by-step process for configuring SSO with a third-party provider? Each version targets a different aspect of authentication, and each one retrieves different documents. Version one pulls the API reference. Version two pulls the login flow guide. Version three pulls the SSO walkthrough. You merge all three result sets and deduplicate any overlapping documents. Now you have comprehensive coverage that no single query could have produced. The vague question retrieved the wrong document; the three specific queries retrieved exactly the right ones.
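The rewrite-retrieve-merge-deduplicate flow described above can be sketched in a few lines of Python. The LLM call and the vector search are stubbed with canned data here: the rewrites mirror the three versions from the example, and the document IDs and keyword index are made up purely for illustration.

```python
# Multi-query sketch: rewrite a vague question into specific queries,
# retrieve for each rewrite, then merge and deduplicate the results.

def rewrite_query(question: str) -> list[str]:
    # In practice: one LLM call prompted to produce N specific rewrites.
    # Stubbed with the three rewrites from the example above.
    return [
        "What authentication protocols does the API support, such as OAuth 2 or API keys?",
        "How do users log in and receive session tokens?",
        "What is the step-by-step process for configuring SSO with a third-party provider?",
    ]

def vector_search(query: str, k: int = 3) -> list[str]:
    # In practice: embed the query and search your vector database.
    # Stubbed with a keyword lookup over made-up document IDs.
    fake_index = {
        "OAuth 2": ["api-auth-reference", "oauth-setup-guide"],
        "session tokens": ["login-flow-guide", "oauth-setup-guide"],
        "SSO": ["sso-walkthrough", "api-auth-reference"],
    }
    for keyword, docs in fake_index.items():
        if keyword in query:
            return docs[:k]
    return []

def multi_query_retrieve(question: str) -> list[str]:
    seen, merged = set(), []
    for q in rewrite_query(question):
        for doc_id in vector_search(q):
            if doc_id not in seen:  # deduplicate overlapping hits
                seen.add(doc_id)
                merged.append(doc_id)
    return merged

print(multi_query_retrieve("How do I set up OAuth?"))
# → ['api-auth-reference', 'oauth-setup-guide', 'login-flow-guide', 'sso-walkthrough']
```

With real components, `rewrite_query` is a single prompt that asks for N alternative phrasings, and `vector_search` is your existing retrieval call; the merge logic stays the same.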
1:55 Let's see the actual difference. Without query translation, you search "how do I set up OAuth" and the vector database returns your billing FAQ as the top match, the pricing page at 0.68 similarity, and the general overview. None of these are what the user needs. With multi-query, the same question becomes three targeted searches. The top results include the OAuth setup guide at 0.94, the session management docs at 0.91, and the SSO configuration walkthrough. Relevant documents instead of noise. The retrieval quality jumps from broken to production ready. That is why multi-query should be your default query translation strategy. It is simple to implement, works with any LLM, and consistently improves retrieval for vague questions.

2:56 If you want to learn how to do this yourself, I run free live sessions every Friday at noon Eastern. Scan the QR code on screen to join. Would love to see you there.
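The chapter list also names RAG-Fusion. It extends multi-query by merging the per-query result lists with reciprocal rank fusion (RRF) instead of plain deduplication, so documents that rank highly across several rewrites rise to the top. A minimal sketch, with illustrative document IDs:

```python
# Reciprocal rank fusion: each ranked list votes 1 / (k + rank) for each
# document it contains; documents near the top of several lists win.
def reciprocal_rank_fusion(ranked_lists: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Toy ranked lists from three rewritten queries:
lists = [
    ["oauth-setup-guide", "api-auth-reference"],
    ["oauth-setup-guide", "login-flow-guide"],
    ["sso-walkthrough", "oauth-setup-guide"],
]
print(reciprocal_rank_fusion(lists))
```

Here `oauth-setup-guide` appears in all three lists, so it outranks documents that appear only once; `k = 60` is the conventional damping constant from the original RRF formulation.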
3:13 HyDE flips the approach. Instead of rewriting the question, you generate a fake answer. The user asks, "How do I set up OAuth?" The LLM writes a hypothetical answer, something like, "To configure OAuth, first register your application in the OAuth dashboard, then generate client credentials and implement the token exchange flow." That answer is not real, but it reads like your actual documentation. You embed that fake answer and search for real documents with similar embeddings. The intuition is simple: a hypothetical answer is closer in embedding space to the real answer than the original vague question was. HyDE works best when your documents follow consistent formatting. If your docs are inconsistent, the hypothetical answer might not match any real document style.
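The HyDE flow can be sketched the same way. The LLM call is stubbed with the hypothetical answer from the example, and the toy bag-of-words embedding and two-document corpus are illustrative stand-ins for a real embedding model and vector store:

```python
# HyDE sketch: generate a hypothetical answer, embed it, and search with
# that embedding instead of the question's.

def generate_hypothetical_answer(question: str) -> str:
    # In practice: one LLM call, e.g. "Write a passage answering: {question}"
    return ("To configure OAuth, first register your application in the "
            "OAuth dashboard, then generate client credentials and "
            "implement the token exchange flow.")

def embed(text: str) -> list[float]:
    # Toy bag-of-words embedding; substitute a real embedding model.
    vocab = ["oauth", "register", "credentials", "token", "billing", "pricing"]
    words = [w.strip(".,") for w in text.lower().split()]
    return [float(words.count(v)) for v in vocab]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

docs = {
    "oauth-setup-guide": "Register your OAuth application, create credentials, and exchange the token.",
    "billing-faq": "Billing and pricing questions about invoices.",
}

def hyde_search(question: str) -> str:
    # Embed the fake answer, not the question, then find the closest doc.
    query_vec = embed(generate_hypothetical_answer(question))
    return max(docs, key=lambda d: cosine(query_vec, embed(docs[d])))

print(hyde_search("How do I set up OAuth?"))
# → oauth-setup-guide
```

The hypothetical answer shares vocabulary with the setup guide but not the billing FAQ, which is exactly the answer-to-answer similarity the technique relies on.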
4:07 Some questions are not vague. They are complex. "Compare the authentication methods and recommend the best one for a multi-tenant SaaS app" is actually three questions: What authentication methods exist? What are the trade-offs of each? Which one fits a multi-tenant SaaS app? Decomposition breaks the complex question into sub-questions. Each sub-question gets answered independently, then the results get combined into a final answer. Step-back prompting takes the opposite approach. Instead of breaking the question down, it zooms out. "How do I set up OAuth" becomes "What are the general principles of web application authentication?" The broader question retrieves background context that helps the LLM reason about the specific question. Both techniques handle edge cases that multi-query misses. Use decomposition for complex multi-part questions. Use step-back when the user needs background context to get a good answer.
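Decomposition and step-back can both be sketched as thin wrappers around the same answer step. Everything below is stubbed: the sub-questions and the step-back question are the ones from the example, and `answer` stands in for a retrieve-then-generate call.

```python
# Decomposition and step-back prompting, sketched with stubbed LLM calls.

def decompose(question: str) -> list[str]:
    # In practice: one LLM call asking for independent sub-questions.
    return [
        "What authentication methods exist?",
        "What are the trade-offs of each?",
        "Which one fits a multi-tenant SaaS app?",
    ]

def answer(question: str, context: str = "") -> str:
    # In practice: retrieve documents for `question` (plus any extra
    # context), then ask the LLM. Stubbed with a placeholder string.
    return f"[answer to: {question}]"

def decomposition_answer(question: str) -> str:
    # Answer each sub-question independently, then combine.
    partials = [answer(sub) for sub in decompose(question)]
    return answer(question, context="\n".join(partials))

def step_back(question: str) -> str:
    # In practice: one LLM call that abstracts the question.
    return "What are the general principles of web application authentication?"

def step_back_answer(question: str) -> str:
    # Retrieve broad background first, then answer the specific question.
    background = answer(step_back(question))
    return answer(question, context=background)

print(decomposition_answer(
    "Compare the authentication methods and recommend one for a multi-tenant SaaS app"))
```

The structural difference is visible in the code: decomposition fans out to several narrower calls and merges, while step-back makes one broader call first and feeds it in as context.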
5:13 Here is how to decide which technique to use. Start with multi-query as your default. It handles the most common failure, vague questions, and it is the simplest to implement. If your documents have consistent formatting and your users ask short questions, add HyDE; the hypothetical answer approach works well with clean, structured docs. For domains where users ask complex multi-part questions, add decomposition: break the question down, answer each part, combine the results. And when users need deep background context to get a useful answer, use step-back prompting. In practice, most production systems start with multi-query and only add the others when they see specific failure patterns. Start simple. Measure what fails. Add complexity only where it helps.
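The decision guide above can be collapsed into a small routing heuristic. The signals and thresholds here are illustrative, not tuned values; a real system would route based on measured failure patterns.

```python
# Illustrative router matching the decision guide: multi-query by
# default, HyDE for short questions over structured docs, decomposition
# for multi-part questions, step-back when background is needed.
def choose_technique(question: str,
                     docs_are_structured: bool = False,
                     needs_background: bool = False) -> str:
    if needs_background:
        return "step-back"
    multi_part = (" and " in question
                  or question.count("?") > 1
                  or "compare" in question.lower())
    if multi_part:
        return "decomposition"
    if docs_are_structured and len(question.split()) <= 6:
        return "hyde"
    return "multi-query"  # the default

print(choose_technique("How do I set up OAuth?"))
# → multi-query
```

This matches the advice in the transcript: everything falls through to multi-query unless a specific signal argues for one of the other techniques.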
6:05 That's the full picture. If you want to go deeper, join my free live session this Friday at noon Eastern, where I walk through this hands-on, answer questions, and show you how to build it yourself. Scan the QR code to join.

Want the next one in your inbox?

Join 1,000+ Product Managers getting one deep dive every Friday.