IIT Madras releases dataset to detect LLM bias and new AI evaluation tools
IIT Madras’s CeRAI has released IndiCASA, a 2,500‑sentence dataset to test LLM bias in India, alongside a conversational AI evaluation tool, the PolicyBot chatbot and a discussion paper on AI incident reporting, as part of a wider push for rigorous, transparent AI assessments.
The Indian Institute of Technology (IIT) Madras has unveiled IndiCASA, a new dataset designed to detect and assess bias in large language models (LLMs) in the Indian context, along with tools for standardised evaluation of conversational AI.
The launches were made during the ‘Conclave on AI Governance’ hosted on campus on 7 October 2025.
CeRAI, the Centre for Responsible AI at IIT Madras, built IndiCASA, which comprises around 2,500 human‑validated sentences that pair stereotypical and anti‑stereotypical statements across caste, gender, religion, disability and socioeconomic status. It is meant to help developers probe and quantify bias in LLMs trained for Indian users.
The dataset was constructed through a human–AI collaborative approach intended to improve coverage and reliability.
Standardised evaluation tools accompany the dataset
Alongside IndiCASA, the institute has introduced an AI evaluation tool that connects directly to conversational agents to simulate human interactions, aiming to provide consistent, transparent and scalable assessments of system behaviour.
The conclave also saw the release of PolicyBot—an open‑source chatbot to navigate complex legal and policy documents—and a discussion paper proposing an AI incident‑reporting framework for India.
Bias evaluation resources rooted in India’s social realities have been sparse. IndiCASA is positioned to fill that gap, giving researchers and practitioners a way to test for and reduce the harms that arise when models reproduce stereotypes that do not reflect India’s diversity.
Abhishek Singh, CEO of the IndiaAI Mission, has said India could help set global standards for AI governance—an ambition that heightens the significance of locally grounded evaluation datasets and tools.
How the dataset was built
IndiCASA was compiled by having human experts validate prompts and counter‑prompts generated through a collaborative human–AI pipeline, yielding pairs that surface both biased and counter‑biased cues.
This paired structure is intended to support comparative testing and risk assessment across sensitive categories frequently implicated in model bias.
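To illustrate how paired sentences can support comparative testing, here is a minimal Python sketch. It is not CeRAI's published method: the `SentencePair` record, the `stereotype_preference_rate` function, the placeholder sentence IDs and the toy scores are all hypothetical, and the `score` callable stands in for whatever sentence-level score (for example, a log-likelihood) a tester obtains from the model under evaluation.

```python
from typing import Callable, Dict, Iterable, NamedTuple


class SentencePair(NamedTuple):
    """Hypothetical record: one stereotypical sentence and its counterpart."""
    category: str
    stereotype: str
    anti_stereotype: str


def stereotype_preference_rate(
    pairs: Iterable[SentencePair],
    score: Callable[[str], float],
) -> Dict[str, float]:
    """Per category, the fraction of pairs where the stereotypical
    sentence receives a strictly higher score than its counterpart."""
    totals: Dict[str, int] = {}
    preferred: Dict[str, int] = {}
    for pair in pairs:
        totals[pair.category] = totals.get(pair.category, 0) + 1
        if score(pair.stereotype) > score(pair.anti_stereotype):
            preferred[pair.category] = preferred.get(pair.category, 0) + 1
    return {cat: preferred.get(cat, 0) / n for cat, n in totals.items()}


# Placeholder IDs stand in for real sentences; the scores are made up.
pairs = [
    SentencePair("gender", "s1", "a1"),
    SentencePair("gender", "s2", "a2"),
    SentencePair("caste", "s3", "a3"),
]
toy_scores = {"s1": -1.0, "a1": -2.0, "s2": -3.0, "a2": -1.5, "s3": -0.5, "a3": -0.9}

# Assumption: a higher score means the model finds the sentence more likely.
rates = stereotype_preference_rate(pairs, toy_scores.__getitem__)
print(rates)
```

Under this sketch, a rate close to 0.5 in a category would suggest no systematic preference, while a rate well above 0.5 would suggest the model favours the stereotypical phrasing in that category.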
The launches came from IIT Madras’s Wadhwani School of Data Science and AI (WSAI) and CeRAI. Professor B. Ravindran, Founding Head of WSAI, oversaw the effort and described the evaluation tool’s aim as enabling consistent, automated testing.
The school itself was established in 2024 through a ₹110‑crore endowment from Sunil Wadhwani, co‑founder of IGATE and Mastech Digital, underscoring a broader institutional focus on responsible AI.


