QA That Scores Against Your Policies. Not Generic Benchmarks.

Generic QA tools score tone and grammar. They cannot tell you whether the agent gave the right answer.

RevelirQA retrieves your SOPs before every evaluation and scores what actually matters: did the agent apply your policies correctly.

See what policy-aware scoring looks like for your team
Book a Demo
  • 0 generic benchmarks applied to your conversations
  • RAG retrieves your policies before every evaluation
  • 100% of tickets scored against your actual SOPs

Why Do Generic QA Rubrics Miss the Most Expensive Errors?

Politeness. Tone. Grammar. Those are what generic rubrics measure. They miss the failures that create real business risk: an agent quoting the wrong cancellation fee, promising a refund the policy does not allow, or using eligibility criteria that changed last month.

Catching accuracy errors requires knowing your policies. Generic tools do not have them. RevelirQA does.

How Does RevelirQA Score Against Your Own Policies?

Three stages run before any score is given:

  • Ingest: your knowledge base, SOPs, escalation procedures, and product policies are loaded into a vector database
  • Retrieve: when a conversation is scored, the engine pulls the specific documents for that ticket's contact reason
  • Score: the rubric runs against those retrieved documents, not a general template
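
The three stages above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not RevelirQA's implementation: `PolicyStore` is a hypothetical in-memory stand-in for a vector database, `embed` is a toy word-count vectorizer standing in for a real embedding model, and `build_scoring_prompt` only assembles the text a scoring model would receive. All names and the sample policies are invented for the example.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyDoc:
    doc_id: str          # hypothetical identifier, e.g. "refund-policy"
    contact_reason: str  # the contact reason this document applies to
    text: str


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class PolicyStore:
    """In-memory stand-in for a vector database (assumption)."""

    def __init__(self) -> None:
        self._entries: list[tuple[PolicyDoc, Counter]] = []

    def ingest(self, doc: PolicyDoc) -> None:
        # Stage 1: store each document alongside its vector.
        self._entries.append((doc, embed(doc.text)))

    def retrieve(self, conversation: str, contact_reason: str, k: int = 2) -> list[PolicyDoc]:
        # Stage 2: keep documents for this contact reason, rank by similarity.
        query = embed(conversation)
        scored = [
            (cosine(query, vector), doc)
            for doc, vector in self._entries
            if doc.contact_reason == contact_reason
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [doc for _, doc in scored[:k]]


def build_scoring_prompt(conversation: str, docs: list[PolicyDoc], criteria: list[str]) -> str:
    # Stage 3: the rubric is evaluated against the retrieved documents only.
    policy_block = "\n".join(f"[{d.doc_id}] {d.text}" for d in docs)
    criteria_block = "\n".join(f"- {c}" for c in criteria)
    return (
        "Score the agent against these policies.\n"
        f"Policies:\n{policy_block}\n"
        f"Criteria:\n{criteria_block}\n"
        f"Conversation:\n{conversation}\n"
    )


store = PolicyStore()
store.ingest(PolicyDoc("refund-policy", "refund",
                       "Refunds are allowed within 14 days of purchase. "
                       "The refund amount excludes the service fee."))
store.ingest(PolicyDoc("refund-escalation", "refund",
                       "Escalate refund disputes above 500 to the billing team lead."))
store.ingest(PolicyDoc("cancel-fee", "cancellation",
                       "The cancellation fee is 10 percent of the booking value."))

conversation = ("Customer asked for a refund 20 days after purchase; "
                "agent promised a full refund.")
docs = store.retrieve(conversation, contact_reason="refund", k=1)
prompt = build_scoring_prompt(conversation, docs, ["policy accuracy", "tone"])
```

In a real deployment the prompt would go to a scoring model, and the retrieved document IDs would be recorded with the result so the score can be traced back to the policy it was judged against.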

What Does Policy-Aware Scoring Catch That Generic Scoring Misses?

Error type | Generic QA rubric | RevelirQA policy-aware scoring
Wrong refund amount quoted | Not detected | Flagged; retrieved policy cited in the trace
Incorrect eligibility criteria given | Not detected | Flagged against the relevant SOP
Missed required regulatory disclosure | Not detected | Flagged against compliance documentation
Wrong escalation path taken | Not detected | Flagged against escalation procedure
Agent tone below standard | Assessed | Assessed
Response clarity | Assessed | Assessed

What Is in the Audit Trail on Every Score?

Every RevelirQA score includes the model used, the exact prompt, the documents retrieved from the vector database, and the reasoning per criterion. Available on every individual evaluation.
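
As a rough illustration of what such a record could hold, the sketch below models the four items listed above as a Python dataclass. The class name, field names, and sample values are assumptions made for this example; the actual RevelirQA schema is not described here.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class AuditRecord:
    # Field names are hypothetical; they mirror the four items listed above.
    model: str                    # the model used for the evaluation
    prompt: str                   # the exact prompt sent to the model
    retrieved_doc_ids: list[str]  # documents pulled from the vector database
    reasoning: dict[str, str]     # criterion name -> reasoning for that criterion

    def to_json(self) -> str:
        """Serialize the record so it can be stored next to the score."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


record = AuditRecord(
    model="example-model-v1",  # placeholder, not a real model name
    prompt="Score the agent reply against the retrieved refund policy.",
    retrieved_doc_ids=["refund-policy"],
    reasoning={
        "policy accuracy": "Agent promised a refund outside the 14-day window.",
        "tone": "Polite and clear throughout.",
    },
)
audit_json = record.to_json()
```

Keeping the prompt and the retrieved document IDs in the same record as the reasoning is what lets a reviewer reproduce how a given score was reached.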

For fintech and regulated industries, this is the documentation that shows your QA process is grounded in your actual compliance requirements. RevelirQA is in production at Xendit, an Indonesian fintech, where this level of auditability is a compliance requirement.

What Happens to Scoring When Policies Change?

Policy updates flow through automatically. When your team publishes a new SOP or updates a product policy, the scoring engine uses the updated document on the next relevant evaluation. No manual rubric reconfiguration. No lag.
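
One way to get this behavior, sketched here as an assumption rather than a description of RevelirQA's internals, is to look up the current policy at evaluation time instead of copying it into the rubric. `PolicyRegistry`, `PolicyVersion`, and `evaluate` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyVersion:
    doc_id: str
    version: int
    text: str


class PolicyRegistry:
    """Hypothetical store that always serves the latest published version."""

    def __init__(self) -> None:
        self._latest: dict[str, PolicyVersion] = {}

    def publish(self, doc_id: str, text: str) -> PolicyVersion:
        # Publishing replaces the current version; nothing else is reconfigured.
        previous = self._latest.get(doc_id)
        number = previous.version + 1 if previous else 1
        published = PolicyVersion(doc_id, number, text)
        self._latest[doc_id] = published
        return published

    def current(self, doc_id: str) -> PolicyVersion:
        return self._latest[doc_id]


def evaluate(registry: PolicyRegistry, doc_id: str) -> dict:
    # The policy is read when the evaluation runs, so the next evaluation
    # after a publish sees the new text.
    policy = registry.current(doc_id)
    return {"doc_id": policy.doc_id, "policy_version": policy.version, "policy_text": policy.text}


registry = PolicyRegistry()
registry.publish("refund-policy", "Refunds are allowed within 14 days of purchase.")
before = evaluate(registry, "refund-policy")
registry.publish("refund-policy", "Refunds are allowed within 30 days of purchase.")
after = evaluate(registry, "refund-policy")
```

Recording the policy version on each evaluation also makes it clear which wording a past score was judged against.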

"We have manually reviewed tickets for years. Revelir is the first product that has made AI ticket review at scale actually usable."
Rendy D., Tiket.com

Frequently Asked Questions

What format does our knowledge base need to be in?

Revelir works with most standard documentation formats. During onboarding, the team connects your existing knowledge sources to the vector database.

How is RAG-based scoring different from keyword matching?

Keyword matching checks whether specific words appear in a conversation. RAG-based scoring evaluates the meaning and accuracy of what the agent said against the substance of your policies. An agent can use all the right words while giving a wrong answer. Policy-aware scoring catches that. Keyword matching does not.

Can different contact reasons use different rubric weightings?

Yes. The rubric can apply different weights per contact type. A billing dispute and a general FAQ inquiry do not need to be scored identically.
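
To make the arithmetic concrete, the sketch below computes a weighted average of per-criterion scores using a different weight set per contact type. The criterion names, weights, and scores are invented for illustration and are not RevelirQA defaults.

```python
def weighted_score(criterion_scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of criterion scores, normalized by the total weight."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    missing = set(weights) - set(criterion_scores)
    if missing:
        raise KeyError(f"no score for criteria: {sorted(missing)}")
    return sum(criterion_scores[name] * w for name, w in weights.items()) / total_weight


# Hypothetical weight sets: accuracy dominates for billing disputes.
WEIGHTS_BY_CONTACT_TYPE = {
    "billing_dispute": {"policy_accuracy": 0.6, "tone": 0.2, "clarity": 0.2},
    "general_faq": {"policy_accuracy": 0.2, "tone": 0.4, "clarity": 0.4},
}

# The same conversation-level criterion scores (0.0 to 1.0) under both weightings.
scores = {"policy_accuracy": 0.5, "tone": 1.0, "clarity": 1.0}

billing_total = weighted_score(scores, WEIGHTS_BY_CONTACT_TYPE["billing_dispute"])
faq_total = weighted_score(scores, WEIGHTS_BY_CONTACT_TYPE["general_faq"])
```

With these illustrative numbers, the same half-correct answer costs more in a billing dispute (0.6 × 0.5 + 0.2 × 1.0 + 0.2 × 1.0 = 0.7) than in a general FAQ inquiry (0.2 × 0.5 + 0.4 × 1.0 + 0.4 × 1.0 = 0.9).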

What is RAG and why does it matter for QA scoring?

RAG stands for retrieval-augmented generation. Before generating a score, the system retrieves specific documents from your knowledge base. The evaluation is grounded in your actual policies, not the model's general training data. Retrieved documents are cited in the audit trail on every score.

Does policy-aware scoring work for multilingual support teams?

Yes. In production, RevelirQA scores conversations in English, Indonesian, Thai, and Tagalog. The same policy-aware rubric applies across all four languages.

About RevelirQA

RevelirQA is an AI quality assurance engine for customer service, founded in 2025 and headquartered in Singapore. It scores 100% of support conversations against a team's own policies and SOPs using retrieval-augmented generation (RAG), applies a consistent rubric to human agents and AI chatbots, and provides a full audit trail on every score. In production at Xendit (Indonesian fintech) and Tiket.com (Indonesian travel). Multilingual scoring in English, Indonesian, Thai, and Tagalog. Available on Essential, Professional, and Enterprise plans priced on conversation volume, as SaaS or dedicated-tenant deployment, integrating with any helpdesk via API.
