
The critic agent: how RenBase verifies its own answers

Before you see an answer, a second model checks it for faithfulness, completeness, and contradictions. Here's how the critic agent works.

RenBase Team

LLMs hallucinate. This is a known property of the technology, not a bug that will be patched in the next release. The question for any system built on top of LLMs is: what do you do about it?

Our answer is the critic agent, a second model whose sole job is to review answers before they reach you.

What the critic checks

After the synthesis model produces an answer, the critic agent receives:

  1. The original question
  2. The draft answer
  3. All source passages that were retrieved

It runs three checks:

Faithfulness

For every factual claim in the answer, the critic verifies that the claim is supported by at least one cited passage. If a claim appears in the answer but cannot be traced to any source, the critic marks it as unsupported.

Unsupported claims trigger a re-retrieval loop: the system fetches additional passages and attempts to either verify the claim or remove it from the answer.
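
A minimal sketch of the faithfulness check and one round of re-retrieval. This is an illustration, not RenBase's implementation: `Claim`, `overlap_support`, `unsupported_claims`, `verify_or_remove` and `fetch_more` are hypothetical names, and the word-overlap test is only a toy stand-in for the verification model.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class Claim:
    text: str
    cited_ids: list[str]  # ids of the passages the draft cites for this claim


def _words(text: str) -> set[str]:
    return {w.strip(".,").lower() for w in text.split()}


def overlap_support(claim: str, passage: str, threshold: float = 0.6) -> bool:
    """Toy stand-in for the verifier: share of claim words found in the passage."""
    claim_words = _words(claim)
    if not claim_words:
        return False
    return len(claim_words & _words(passage)) / len(claim_words) >= threshold


def unsupported_claims(claims: list[Claim], passages: dict[str, str],
                       supports: Callable[[str, str], bool] = overlap_support) -> list[Claim]:
    """Return the claims that none of their cited passages supports."""
    return [
        c for c in claims
        if not any(supports(c.text, passages[pid]) for pid in c.cited_ids if pid in passages)
    ]


def verify_or_remove(claims: list[Claim], passages: dict[str, str],
                     fetch_more: Callable[[str], dict[str, str]],
                     supports: Callable[[str, str], bool] = overlap_support) -> list[Claim]:
    """One re-retrieval round: verify each unsupported claim or drop it."""
    failing = unsupported_claims(claims, passages, supports)
    kept = []
    for claim in claims:
        if claim not in failing:
            kept.append(claim)
            continue
        extra = fetch_more(claim.text)  # assumed to return {passage id: text}
        hits = [pid for pid, text in extra.items() if supports(claim.text, text)]
        if hits:  # verified by a newly fetched passage: keep it and cite that passage
            passages.update({pid: extra[pid] for pid in hits})
            kept.append(Claim(claim.text, claim.cited_ids + hits))
    return kept
```

In a real system `supports` would call the verification model, and `fetch_more` would issue a new retrieval query for the unsupported claim.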

Completeness

The critic checks whether all parts of the original question have been addressed. For decomposed questions (multi-hop queries), it verifies that each sub-question has a corresponding answer.

If a sub-question is unanswered, usually because retrieval found no relevant passages, the critic adds a note to the answer: “No relevant information found in your knowledge base for: [sub-question].”
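
The completeness check can be sketched in a few lines. The score definition here (the share of sub-questions that received an answer) is an assumption, since the post does not say how the score is computed, and `check_completeness` is a hypothetical name.

```python
from __future__ import annotations

from typing import Optional

NOTE = "No relevant information found in your knowledge base for: {}"


def check_completeness(sub_answers: dict[str, Optional[str]]) -> tuple[float, list[str]]:
    """Return (score, notes) for a mapping of sub-question -> answer (None if unanswered)."""
    missing = [question for question, answer in sub_answers.items() if not answer]
    # Assumed scoring: fraction of sub-questions that have an answer.
    score = 1.0 if not sub_answers else 1 - len(missing) / len(sub_answers)
    notes = [NOTE.format(question) for question in missing]
    return score, notes
```

For a two-part question where only one part was answered, this returns a score of 0.5 and one note naming the unanswered part.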

Contradiction detection

Sometimes two source passages say different things about the same fact. This happens frequently with contracts that have been amended, or policies that have been updated without the old version being deleted.

The critic checks for inter-source contradictions. When it finds one, instead of silently choosing the more recent or higher-confidence passage, it surfaces both in the answer with their respective sources, flagging the contradiction explicitly.
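
One way to picture contradiction detection: group the values that different sources give for the same fact, and flag any fact with more than one distinct value. The sketch below assumes facts have already been extracted from the passages, which is the hard part and is not shown. `Fact`, `find_contradictions` and `render_flag` are hypothetical names.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    key: str     # what the fact is about, e.g. "liability cap"
    value: str   # the value this source states
    source: str  # document or passage the value came from


def find_contradictions(facts):
    """Map each fact key to its facts when the sources give different values."""
    by_key = defaultdict(list)
    for fact in facts:
        by_key[fact.key].append(fact)
    return {
        key: group for key, group in by_key.items()
        if len({fact.value for fact in group}) > 1
    }


def render_flag(key, facts):
    """Surface every conflicting value with its source instead of picking one."""
    options = "; ".join(f"{fact.value} ({fact.source})" for fact in facts)
    return f"Sources disagree on {key}: {options}"
```

With an original contract and a later amendment that state different liability caps, both values appear in the flag, each next to the document it came from.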

When the critic disagrees

If the critic finds significant issues (more than one unsupported claim, or a completeness score below the threshold), it rejects the draft answer and triggers a new retrieval-synthesis cycle with modified queries.

In practice, the critic rejects the first draft on about 12% of queries. The second pass resolves the issues in about 90% of those cases. The rest, about 1% of all queries, return a “low confidence” answer with explicit caveats rather than a confident synthesised response.

We think this is the right trade-off. A system that tells you it’s uncertain is more useful than one that confidently gives you the wrong answer.
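
The rejection rule and retry cycle described in this section can be sketched as a small control loop. The threshold of 0.8 and the limit of two passes are assumptions chosen for illustration. `Review`, `rejects`, `answer_with_critic`, `synthesize` and `critique` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Review:
    unsupported: int     # number of unsupported claims in the draft
    completeness: float  # 0.0 to 1.0


def rejects(review, threshold=0.8):
    """Reject on more than one unsupported claim or a completeness score below threshold."""
    return review.unsupported > 1 or review.completeness < threshold


def answer_with_critic(question, synthesize, critique, max_passes=2):
    """Run retrieval-synthesis passes until the critic accepts or passes run out."""
    draft = None
    for attempt in range(max_passes):
        # attempt > 0 signals that the queries should be modified for this pass
        draft = synthesize(question, attempt)
        if not rejects(critique(question, draft)):
            return {"answer": draft, "confidence": "normal", "passes": attempt + 1}
    # Still rejected after the last pass: return the draft marked as low confidence.
    return {"answer": draft, "confidence": "low", "passes": max_passes}
```

The caller is expected to attach the explicit caveats whenever `confidence` comes back as `"low"`.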

Why a separate model

We could run the critic checks as a second pass using the same synthesis model. We don’t, for two reasons.

First, the synthesis model has a systematic bias toward the answer it just produced. Asking it to critique its own output is less reliable than asking a separate model with no attachment to the draft.

Second, the critic needs to be fast. We run a smaller, fine-tuned verification model specifically for the three checks above. It adds ~200ms to end-to-end latency, a worthwhile trade for the reliability improvement.

Mandatory citations

Every claim in a RenBase answer is linked to a specific passage in a specific document. This is not optional. The synthesis model is instructed to produce no claims without citations, and the critic will reject answers that include uncited content.

This means you can always trace an answer back to its source and verify that the source actually says what the answer claims.
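
A rough sketch of how uncited content could be detected. The citation marker format (`[document#passage]`, for example `[msa.pdf#12]`) is invented for this example, and the sentence splitting is deliberately naive. `uncited_sentences` is a hypothetical name.

```python
import re

# Hypothetical marker format: [document#passage], e.g. [msa.pdf#12]
CITATION = re.compile(r"\[[\w.-]+#\d+\]")


def uncited_sentences(answer: str) -> list:
    """Return the sentences of an answer that carry no citation marker."""
    # Naive split: break after ., ! or ? when followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    return [s for s in sentences if not CITATION.search(s)]
```

An answer passes this check only when the returned list is empty.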

Frequently asked questions

What is the critic agent in RenBase?

The critic agent is a secondary model that reviews every answer before it is returned to the user. It checks whether the answer is faithful to the cited sources, whether all parts of the question are addressed, and whether any cited sources contradict each other.

Does the critic agent prevent hallucinations?

The critic agent significantly reduces hallucinations by verifying that every claim in the answer is supported by a cited passage. If a claim cannot be verified, the critic flags it and triggers a re-retrieval loop.

How does the critic agent handle contradictions between sources?

When the critic detects contradicting sources, for example two documents that give different liability cap figures, it flags the contradiction in the answer and presents both figures with their respective sources rather than silently choosing one.
