Hallucination Detection
TruthVouch detects hallucinations — false or unsupported claims by LLMs — by comparing AI responses against your verified knowledge base. When hallucinations are detected, TruthVouch alerts you immediately with severity levels and suggested corrections.
The Detection Pipeline
Hallucination detection follows a 6-stage process:
Stage 1: Query Generation
For each truth nugget in your knowledge base, TruthVouch generates queries:
Example:
- Truth Nugget: “Founded in 2023”
- Generated Query: “When was TruthVouch founded?”
The system generates 3-5 query variants:
- Direct: “What is the founding year of TruthVouch?”
- Indirect: “Tell me about TruthVouch’s history”
- Factoid: “In what year did TruthVouch launch?”
- Comparative: “Was TruthVouch founded before or after 2024?”
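The variant generation above can be sketched with simple templates. This is an illustrative sketch only; the function name and templates are assumptions, not TruthVouch's actual implementation:

```python
# Illustrative sketch: template-based query-variant generation
# for a truth nugget. Templates and names are assumptions.

def generate_queries(entity: str, attribute: str) -> list[str]:
    """Produce direct, indirect, and factoid query variants."""
    return [
        f"What is the {attribute} of {entity}?",  # direct
        f"Tell me about {entity}'s history",      # indirect
        f"In what year did {entity} launch?",     # factoid
    ]

queries = generate_queries("TruthVouch", "founding year")
print(queries[0])  # → "What is the founding year of TruthVouch?"
```

In practice the variants would be generated per nugget type (facts, dates, attributes), but the output shape is the same: several phrasings probing one verified fact.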
Stage 2: LLM Querying
Each generated query is sent to the monitored LLM providers (ChatGPT, Claude, Gemini, etc.):
```
Query: "When was TruthVouch founded?"
    ↓
LLM Response: "TruthVouch was founded in 2024"
```

Stage 3: Entity Extraction
Response text is parsed to extract factual claims:
```
Response: "TruthVouch was founded in 2024"
    ↓
Extracted: entity="TruthVouch", relation="founded", value="2024"
```

Uses Named Entity Recognition (NER) and relation extraction models.
Stage 4: Semantic Comparison
The extracted claim is compared to your truth nuggets using semantic analysis:
```
Truth: "Founded in 2023"
LLM Claim: "Founded in 2024"
Result: CONTRADICTION ✗
```

The system evaluates three possible relationships:
- Match: Claim aligns with your verified truth
- Neutral: Claim neither confirms nor contradicts
- Contradiction: Claim directly conflicts with truth
Stage 5: Severity Assessment
The comparison result is mapped to alert severity levels based on confidence:
```
✓ Matches Truth     → No alert
⚠️ Partially Aligns → Warning alert
✗ Contradicts       → Critical alert
```

You can tune detection sensitivity in Dashboard → Settings → Detection Thresholds to match your risk tolerance (Standard, Strict, or Permissive presets).
Stage 6: Alerting
Based on the severity and your alert rules, TruthVouch dispatches an alert:
```
HALLUCINATION detected
├─ Provider: ChatGPT
├─ Severity: Critical
├─ Claim: "Founded in 2024"
├─ Truth: "Founded in 2023"
├─ Confidence: 99%
└─ Action: Send alert, prepare correction
```

Accuracy & Performance
TruthVouch achieves 94%+ detection accuracy across diverse claim types including factoids, entity attributes, relationships, and comparative statements. The system is calibrated to prioritize finding hallucinations while minimizing false positives, giving you confidence in every alert.
Detection Methods
Continuous Monitoring: TruthVouch automatically monitors your AI interactions against truth nuggets on a configurable schedule, checking all supported LLM providers.
On-Demand Verification: Manually verify specific claims or LLM responses at any time through the dashboard or API.
Scheduled Audits: Run periodic cross-checks on defined truth nugget categories (e.g., pricing, product features) to maintain trust posture.
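As a rough sketch of what an on-demand verification call might send, here is a hypothetical request payload. The field names and structure are assumptions; consult the TruthVouch API reference for the real schema:

```python
import json

# Hypothetical sketch of an on-demand verification payload.
# Field names are assumptions, not the documented API schema.

def build_verify_request(provider: str, claim: str) -> str:
    payload = {
        "provider": provider,   # which monitored LLM produced the claim
        "claim": claim,         # the statement to verify
        "mode": "on_demand",    # vs. continuous or scheduled
    }
    return json.dumps(payload)

body = build_verify_request("ChatGPT", "TruthVouch was founded in 2024")
```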
Handling Edge Cases
Negations
TruthVouch correctly handles negated claims:
```
Truth: "TruthVouch is not free"
LLM: "TruthVouch costs money"
NLI: ENTAILMENT (semantically equivalent)
Result: ✓ Correct
```

Paraphrasing
TruthVouch detects when the LLM paraphrases a truth nugget:
```
Truth: "Supports 9+ AI models"
LLM: "Compatible with more than 8 LLM providers"
NLI: ENTAILMENT (meaning preserved)
Result: ✓ Correct
```

Context Dependency
Understands context-dependent statements:
```
Truth: "EU AI Act compliance available on Business plan"
LLM: "TruthVouch offers EU AI Act compliance"
NLI: NEUTRAL (context missing, could be true but vague)
Result: ⚠️ Warning (incomplete, needs review)
```

Temporal Claims
Handles time-sensitive information:
```
Truth: "Pricing updated January 2024"
LLM: "TruthVouch costs $349/month"
When checked in June: Previous price may have changed
System: Rechecks with current truth nuggets
```

Alert Details
Each hallucination alert includes:
- Severity: Critical, High, Medium, or Low
- Confidence: How certain the detection is
- Original Claim: What the LLM said
- Verified Truth: What you’ve verified as correct
- Suggested Correction: Auto-generated accurate response
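The fields above can be modeled as a simple record. The class itself is illustrative; TruthVouch's actual alert objects may differ:

```python
from dataclasses import dataclass

# Illustrative sketch: the fields a hallucination alert carries.
# Field names mirror the list above; the class is an assumption.

@dataclass
class HallucinationAlert:
    severity: str               # Critical, High, Medium, or Low
    confidence: float           # detection certainty, 0.0-1.0
    original_claim: str         # what the LLM said
    verified_truth: str         # what you've verified as correct
    suggested_correction: str   # auto-generated accurate response

alert = HallucinationAlert(
    severity="Critical",
    confidence=0.99,
    original_claim="Founded in 2024",
    verified_truth="Founded in 2023",
    suggested_correction="TruthVouch was founded in 2023.",
)
```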
Limitations
TruthVouch hallucination detection has known limitations:
- Subjective Claims: Opinion-based statements are difficult to verify
- Temporal Sensitivity: Time-dependent facts require frequent updates
- Context: Some claims require broader context to evaluate
- Ambiguous Truth: If your truth nuggets are vague, detection is harder
- Domain Knowledge: Very specialized domains may have lower accuracy
Best Practices
Maintain Fresh Truth Nuggets: Review and update quarterly. Stale nuggets reduce detection accuracy.
Be Specific: Define clear, measurable truth nuggets. “Founded in 2023” detects better than “an established company.”
Monitor Alerts: Review low-confidence detections regularly to tune your detection sensitivity.
Adjust Thresholds: Use dashboard settings to balance sensitivity. Stricter = fewer alerts but might miss subtle hallucinations. Standard balances detection and false positives.
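One way to picture the presets is as confidence thresholds a detection must clear before alerting. The numeric values below are assumptions, not documented defaults:

```python
# Illustrative sketch: sensitivity presets as confidence thresholds.
# Values are assumptions for demonstration only.

PRESET_THRESHOLDS = {
    "Permissive": 0.70,  # alerts readily; more alerts, more false positives
    "Standard":   0.85,  # balances detection and false positives
    "Strict":     0.95,  # alerts only on high-confidence contradictions
}

def should_alert(preset: str, confidence: float) -> bool:
    return confidence >= PRESET_THRESHOLDS[preset]

print(should_alert("Strict", 0.99))  # → True
```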
Next Steps
- Truth Nuggets: Learn how to create and organize your knowledge base
- Alerts & Corrections: Configure alert channels and auto-correction
- Dashboard: Monitor hallucination trends and detections in real-time