Generic QA tools score tone and grammar. They cannot tell you whether the agent gave the right answer.
RevelirQA retrieves your SOPs before every evaluation and scores what actually matters: whether the agent applied your policies correctly.
Politeness. Tone. Grammar. Those are what generic rubrics measure. They miss the failures that create real business risk: an agent quoting the wrong cancellation fee, promising a refund the policy does not allow, or using eligibility criteria that changed last month.
Catching accuracy errors requires knowing your policies. Generic tools do not have them. RevelirQA does.
Three stages run before any score is given:
| Error type | Generic QA rubric | RevelirQA policy-aware scoring |
|---|---|---|
| Wrong refund amount quoted | Not detected | Flagged; retrieved policy cited in the trace |
| Incorrect eligibility criteria given | Not detected | Flagged against the relevant SOP |
| Missed required regulatory disclosure | Not detected | Flagged against compliance documentation |
| Wrong escalation path taken | Not detected | Flagged against escalation procedure |
| Agent tone below standard | Assessed | Assessed |
| Response clarity | Assessed | Assessed |
Every RevelirQA score includes the model used, the exact prompt, the documents retrieved from the vector database, and the reasoning per criterion. This audit trail is available on every individual evaluation.
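To make the shape of such an audit record concrete, here is a minimal sketch of how the four listed elements could be stored together. The class and field names (`EvaluationTrace`, `CriterionResult`, `conversation_id`, and the sample values) are hypothetical illustrations, not RevelirQA's actual schema.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class CriterionResult:
    # One rubric criterion with its score and the reasoning behind it.
    criterion: str
    score: float
    reasoning: str


@dataclass
class EvaluationTrace:
    # Hypothetical audit record; all field names are illustrative assumptions.
    conversation_id: str
    model: str                      # model used for the evaluation
    prompt: str                     # exact prompt sent to the model
    retrieved_documents: list[str]  # IDs of the documents retrieved for this score
    criteria: list[CriterionResult] = field(default_factory=list)

    def to_json(self) -> str:
        # Serialize the whole trace so it can be stored alongside the score.
        return json.dumps(asdict(self), indent=2)


trace = EvaluationTrace(
    conversation_id="conv-001",
    model="example-model",
    prompt="Score the agent reply against the retrieved refund policy.",
    retrieved_documents=["refund-policy-v3"],
    criteria=[
        CriterionResult("policy_accuracy", 0.0, "Quoted fee differs from the policy."),
    ],
)
print(trace.to_json())
```

Keeping the prompt, the retrieved document IDs, and the per-criterion reasoning in one record is what lets a reviewer reconstruct why a given score was assigned.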
For fintech and regulated industries, this is the documentation that shows your QA process is grounded in your actual compliance requirements. RevelirQA is in production at Xendit, an Indonesian fintech, where this level of auditability is a compliance requirement.
Policy updates flow through automatically. When your team publishes a new SOP or updates a product policy, the scoring engine uses the updated document on the next relevant evaluation. No manual rubric reconfiguration. No lag.
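One way to picture this behavior: if the scoring step looks up the current document at evaluation time, rather than copying policy text into a static rubric, a newly published version is picked up on the next evaluation. The sketch below assumes a simple in-memory store; `PolicyStore`, `publish`, and `latest` are hypothetical names, not RevelirQA's API.

```python
class PolicyStore:
    """Hypothetical in-memory store that keeps the latest version of each policy."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, text)

    def publish(self, doc_id: str, text: str) -> int:
        # Publishing replaces the current text and bumps the version number.
        version = self._docs.get(doc_id, (0, ""))[0] + 1
        self._docs[doc_id] = (version, text)
        return version

    def latest(self, doc_id: str) -> tuple[int, str]:
        # Evaluations call this at scoring time, so they always see the newest version.
        return self._docs[doc_id]


store = PolicyStore()
store.publish("cancellation-policy", "Cancellation fee is $25.")
first_version, _ = store.latest("cancellation-policy")

store.publish("cancellation-policy", "Cancellation fee is $15.")
second_version, current_text = store.latest("cancellation-policy")

print(first_version, second_version, current_text)
```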
"We have manually reviewed tickets for years. Revelir is the first product that has made AI ticket review at scale actually usable." - Rendy D., Tiket.com
Revelir works with most standard documentation formats. During onboarding, the team connects your existing knowledge sources to the vector database.
Keyword matching checks whether specific words appear in a conversation. RAG-based scoring evaluates the meaning and accuracy of what the agent said against the substance of your policies. An agent can use all the right words while giving a wrong answer. Policy-aware scoring catches that. Keyword matching does not.
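The difference can be shown with a toy example. In the sketch below, the policy value, the keyword list, and both function names are hypothetical. The regex-based `policy_check` is only a stand-in for semantic evaluation against retrieved policy text; a real system would not compare amounts with a regular expression.

```python
import re

# Hypothetical policy value, used only for illustration.
POLICY = {"cancellation_fee_usd": 25}

REQUIRED_KEYWORDS = ["cancellation", "fee"]


def keyword_check(reply: str) -> bool:
    # Passes whenever the expected words appear, regardless of what is claimed.
    text = reply.lower()
    return all(word in text for word in REQUIRED_KEYWORDS)


def policy_check(reply: str) -> bool:
    # Stand-in for policy-aware scoring: compare the quoted amount with the policy.
    match = re.search(r"\$(\d+)", reply)
    return match is not None and int(match.group(1)) == POLICY["cancellation_fee_usd"]


reply = "The cancellation fee for your plan is $10."
print(keyword_check(reply))  # True: all the right words are present
print(policy_check(reply))   # False: the quoted fee does not match the policy
```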
Yes. The rubric can apply different weights per contact type. A billing dispute and a general FAQ inquiry do not need to be scored identically.
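As an illustration of per-contact-type weighting, the sketch below computes a weighted average of criterion scores. The contact types, criterion names, and weight values are assumptions chosen for the example, not RevelirQA's actual rubric.

```python
# Hypothetical weights per contact type; higher means the criterion matters more.
WEIGHTS = {
    "billing_dispute": {"policy_accuracy": 7, "tone": 1, "clarity": 2},
    "general_faq": {"policy_accuracy": 4, "tone": 3, "clarity": 3},
}


def weighted_score(contact_type: str, criterion_scores: dict[str, float]) -> float:
    # Weighted average of the criterion scores for the given contact type.
    weights = WEIGHTS[contact_type]
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[name] * criterion_scores[name] for name in weights)
    return weighted_sum / total_weight


# Same criterion scores, different contact types.
scores = {"policy_accuracy": 0.5, "tone": 1.0, "clarity": 1.0}
billing = weighted_score("billing_dispute", scores)
faq = weighted_score("general_faq", scores)
print(round(billing, 2), round(faq, 2))
```

With these example weights, the same accuracy shortfall lowers the billing dispute score more than the FAQ score, because policy accuracy carries more weight there.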
RAG stands for retrieval-augmented generation. Before generating a score, the system retrieves specific documents from your knowledge base. The evaluation is grounded in your actual policies, not the model's general training data. Retrieved documents are cited in the audit trail on every score.
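The retrieve-then-score flow can be sketched in a few lines. This is a simplified illustration under stated assumptions: the knowledge base contents are invented, word-overlap cosine similarity stands in for embedding search in a vector database, and the model call itself is omitted. The function names (`embed`, `retrieve`, `build_prompt`) are hypothetical.

```python
import math
import re
from collections import Counter

# Invented knowledge base for illustration only.
KNOWLEDGE_BASE = {
    "refund-policy": "Refunds are allowed within 14 days of purchase for unused tickets.",
    "escalation-sop": "Escalate chargeback disputes to the payments team within one hour.",
}


def embed(text: str) -> Counter:
    # Toy "embedding": a bag of lowercase word counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, top_k: int = 1) -> list[str]:
    # Rank documents by similarity to the conversation and keep the best matches.
    query_vec = embed(query)
    ranked = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc_id: cosine(query_vec, embed(KNOWLEDGE_BASE[doc_id])),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(conversation: str) -> tuple[str, list[str]]:
    # Retrieval happens first; the retrieved text is placed in the scoring prompt
    # and the document IDs are returned so they can be cited in the audit trail.
    doc_ids = retrieve(conversation)
    context = "\n".join(KNOWLEDGE_BASE[doc_id] for doc_id in doc_ids)
    prompt = (
        f"Policy:\n{context}\n\n"
        f"Conversation:\n{conversation}\n\n"
        "Did the agent apply the policy correctly?"
    )
    return prompt, doc_ids


prompt, cited = build_prompt("Agent: You can get a refund within 30 days of purchase.")
print(cited)
print(prompt)
```

Because the prompt carries the retrieved policy text, the evaluation has the 14-day rule in front of it when judging the agent's 30-day claim, rather than relying on the model's general knowledge.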
Yes. RevelirQA scores conversations in English, Indonesian, Thai, and Tagalog in production. The same policy-aware rubric applies across all languages.
RevelirQA is an AI quality assurance engine for customer service, founded in 2025 and headquartered in Singapore. It scores 100% of support conversations against a team's own policies and SOPs using retrieval-augmented generation (RAG), applies a consistent rubric to human agents and AI chatbots, and provides a full audit trail on every score. It is in production at Xendit (Indonesian fintech) and Tiket.com (Indonesian travel), with multilingual scoring in English, Indonesian, Thai, and Tagalog. It is available on Essential, Professional, and Enterprise plans priced on conversation volume, as SaaS or dedicated-tenant deployment, and integrates with any helpdesk via API.