
How multi-hop reasoning works: decompose, retrieve, synthesize

A deep dive into the chain-of-thought architecture that lets RenBase answer questions no single document can answer alone.

RenBase Team
reasoning architecture retrieval RAG

Most document retrieval systems work in a straight line: you ask a question, the system finds the closest passage, and an LLM writes a response. This works for simple, direct queries. It breaks down as soon as the answer depends on more than one passage.

Real knowledge work is rarely a single lookup. “What does our SLA say about the penalty if the Meridian deployment is delayed past the Q2 milestone and we already invoked the force majeure clause last year?” That question requires reading two different sections of two different documents and reasoning about their interaction.

That’s multi-hop reasoning. Here’s how RenBase does it.

Step 1: Decompose

When a query arrives, a planner model breaks it into sub-questions, each one answerable from a single source. The original question above becomes:

  • What does the SLA say about delay penalties past the Q2 milestone?
  • Was a force majeure clause invoked in the Meridian contract? When?
  • Does the force majeure invocation affect the penalty calculation?

Decomposition is the hardest part. The planner has to understand the implicit structure of the question before any retrieval context is available. We fine-tune the planner on a mix of legal, technical, and business document corpora.
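To make the planner's output concrete, here is a minimal sketch of how a decomposed query could be parsed and validated before retrieval starts. It assumes the planner returns a JSON array of strings. That format, the `SubQuestion` class, and `parse_plan` are illustrative names, not RenBase's actual interface.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SubQuestion:
    """One sub-question from the planner (hypothetical structure)."""
    index: int
    text: str


def parse_plan(raw: str) -> list[SubQuestion]:
    """Parse planner output, assumed here to be a JSON array of strings."""
    items = json.loads(raw)
    if not isinstance(items, list) or not items:
        raise ValueError("planner must return a non-empty JSON array")
    plan = []
    for i, item in enumerate(items):
        # Reject empty or non-string entries rather than retrieving on them.
        if not isinstance(item, str) or not item.strip():
            raise ValueError(f"sub-question {i} is empty or not a string")
        plan.append(SubQuestion(index=i, text=item.strip()))
    return plan


raw_plan = json.dumps([
    "What does the SLA say about delay penalties past the Q2 milestone?",
    "Was a force majeure clause invoked in the Meridian contract? When?",
    "Does the force majeure invocation affect the penalty calculation?",
])
for sub_question in parse_plan(raw_plan):
    print(sub_question.index, sub_question.text)
```

Validating the plan up front means a malformed decomposition fails fast instead of producing a retrieval run on empty queries.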

Step 2: Retrieve iteratively

Each sub-question is treated as an independent retrieval query. RenBase uses two retrieval mechanisms in parallel:

Semantic search over dense embeddings finds passages that are semantically close to the sub-question, even if the exact vocabulary differs.

Knowledge graph traversal over our Neo4j graph finds related entities and their connections. This is especially useful when the answer isn’t in the text but in the relationship between concepts.

Results from both are merged, re-ranked by relevance, and deduplicated.
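The merge step can be sketched as follows. This post does not specify how RenBase scores merged results, so the example substitutes reciprocal rank fusion, a common way to combine ranked lists; the function name, the passage ids, and the constant `k = 60` are assumptions made for illustration.

```python
from collections import defaultdict


def fuse_results(semantic: list[str], graph: list[str], k: int = 60) -> list[str]:
    """Merge two ranked lists of passage ids with reciprocal rank fusion.

    A passage returned by both retrievers appears once in the output
    (deduplication) and accumulates score from both lists (re-ranking).
    """
    scores: dict[str, float] = defaultdict(float)
    for ranked in (semantic, graph):
        seen = set()
        for rank, passage_id in enumerate(ranked, start=1):
            if passage_id in seen:  # ignore repeats within a single list
                continue
            seen.add(passage_id)
            scores[passage_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for a stable order.
    return sorted(scores, key=lambda pid: (-scores[pid], pid))


# Hypothetical passage ids, not real RenBase output.
semantic_hits = ["sla-4.2", "sla-7.1", "msa-12"]
graph_hits = ["msa-12", "sla-4.2", "amend-3"]
print(fuse_results(semantic_hits, graph_hits))
```

Passages found by both retrievers rise to the top because they collect score from each list, while passages found by only one retriever are kept but ranked lower.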

Step 3: Synthesize

Once all sub-questions have evidence, a synthesis model assembles them into a single coherent response. It traces every claim back to its source passage and flags any contradictions found across documents.

The output is never just a summary. It’s a structured answer with citations at the claim level.
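One way to picture a claim-level answer is as a small data structure in which every claim carries its own citations. The sketch below is an assumption about shape only: the class names, fields, and placeholder text are hypothetical and do not describe RenBase's response schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Citation:
    document: str
    passage: str


@dataclass
class Claim:
    text: str
    citations: list[Citation] = field(default_factory=list)


@dataclass
class Answer:
    claims: list[Claim]
    contradictions: list[str] = field(default_factory=list)  # flagged conflicts

    def uncited_claims(self) -> list[Claim]:
        """Claims that cannot be traced back to any source passage."""
        return [c for c in self.claims if not c.citations]

    def render(self) -> str:
        """Format each claim on its own line, followed by its sources."""
        lines = []
        for n, claim in enumerate(self.claims, start=1):
            refs = "; ".join(f"{c.document}, {c.passage}" for c in claim.citations)
            lines.append(f"{n}. {claim.text} [{refs or 'no source'}]")
        return "\n".join(lines)


# Placeholder content for illustration only.
answer = Answer(claims=[
    Claim("Placeholder claim drawn from the SLA.", [Citation("SLA", "passage 1")]),
    Claim("Placeholder claim with no source yet."),
])
print(answer.render())
print(len(answer.uncited_claims()))
```

Keeping citations on the claim rather than on the answer as a whole is what makes it possible to spot an unsupported sentence mechanically.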

Step 4: Verify

Before the answer reaches you, our critic agent checks it for:

  • whether the answer follows from the cited passages
  • whether all sub-questions were addressed
  • whether any cited passages contradict each other

If any check fails, the critic triggers a re-retrieval loop with corrected queries. This self-correction loop is why RenBase answers tend to be reliable even on ambiguous or edge-case questions.
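The control flow of such a loop might look like the sketch below. The three callables stand in for the generation pipeline, the critic, and the query rewriter; their names and signatures are hypothetical, and the `max_rounds` cap is an assumption added so the loop always terminates.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Draft:
    text: str
    failed_checks: list[str]


def answer_with_verification(
    question: str,
    generate: Callable[[str], str],
    critique: Callable[[str, str], list[str]],
    rewrite_query: Callable[[str, list[str]], str],
    max_rounds: int = 3,  # assumed cap, not stated in this post
) -> Draft:
    """Generate an answer, then retry with a revised query while checks fail."""
    query = question
    draft = Draft(text="", failed_checks=["not run"])
    for _ in range(max_rounds):
        text = generate(query)
        failures = critique(question, text)
        draft = Draft(text, failures)
        if not failures:
            break  # every check passed
        query = rewrite_query(question, failures)
    return draft


# Stub components so the loop can be run end to end.
def stub_generate(query: str) -> str:
    return "complete" if "retry" in query else "partial"


def stub_critique(question: str, text: str) -> list[str]:
    return [] if text == "complete" else ["not all sub-questions addressed"]


def stub_rewrite(question: str, failures: list[str]) -> str:
    return f"{question} (retry: {'; '.join(failures)})"


result = answer_with_verification(
    "example question", stub_generate, stub_critique, stub_rewrite
)
print(result.text, result.failed_checks)
```

If the cap is reached with checks still failing, the last draft is returned with its failures attached, so the caller can decide whether to show it.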

Why this matters in practice

A legal team using RenBase to review 200 contracts doesn’t need to read each one. They ask: “Which contracts have liability caps below €100k?” That question requires reading a specific clause in each of 200 documents. Multi-hop reasoning handles the decomposition, retrieval, and synthesis automatically, returning a table of contracts with their cap amounts and source references.

The same architecture works for engineering post-mortems, compliance audits, and research synthesis. Tasks that used to require an analyst to read across a whole document corpus can now start from a single API call.

Frequently asked questions

What is multi-hop reasoning in document retrieval?

Multi-hop reasoning is the ability to answer a question that requires evidence from multiple documents or passages. Instead of matching a query to a single chunk, the system decomposes the question, retrieves evidence iteratively, and synthesizes a final answer.

How does RenBase differ from standard RAG?

Standard RAG retrieves the top-k chunks closest to the query and feeds them to an LLM. RenBase decomposes complex questions into sub-questions, retrieves evidence for each independently, synthesizes a single sourced answer, and then has a critic agent verify that answer and check the cited passages for contradictions.

Can RenBase answer questions that span multiple documents?

Yes. Multi-hop reasoning is specifically designed for cross-document questions, like comparing liability caps across several contracts, or correlating a policy with its implementation log.
