What is BharatGen? Inside India’s multilingual AI model for 22 languages
BharatGen is India’s first government-backed multimodal AI model. Here's how it works.
Over the past few months, India has been developing BharatGen, its first government-backed multimodal large language model (LLM) initiative.
Designed to advance AI in Indian languages, BharatGen aims to provide indigenous tools, platforms, and models trained on India-centric datasets to support a wide range of applications.
BharatGen is being developed under the National Mission on Interdisciplinary Cyber-Physical Systems (NM-ICPS) of the Department of Science and Technology (DST).
The project is led by the Technology Innovation Hub Foundation for IoT and IoE at IIT Bombay, in collaboration with several premier institutes, including IIIT Hyderabad, IIT Madras, IIT Kanpur, IIT Hyderabad, IIT Mandi, and IIM Indore.
The initiative seeks to create open-source, ethical, and inclusive AI resources for government, research, and private sector use. By building AI models with Indian data, BharatGen aims to preserve linguistic and cultural diversity while ensuring accessibility across the country.
Multimodal and multilingual
BharatGen is not limited to text-based capabilities. It is being developed as a multimodal model, integrating text, speech, and vision. This enables it to support a wide range of use cases, from natural language processing to image recognition and voice-based applications.
BharatGen currently supports nine Indian languages: Hindi, Marathi, Tamil, Malayalam, Bengali, Punjabi, Gujarati, Telugu, and Kannada. Assamese is slated to be the tenth.
Pilot projects based on BharatGen are being explored in key sectors such as agriculture, governance, and defence. The model’s capabilities are expected to enable farmers to access information in local languages, improve administrative communication tools, and support applications in national security.
In addition to practical deployment, BharatGen aims to contribute to the broader research ecosystem by offering open-source platforms and datasets. This will allow universities, startups, and developers to build upon the foundation provided by the project.
By creating a homegrown large language model, India is seeking technological sovereignty in AI. BharatGen positions the country to compete in the global generative AI landscape, while also prioritising inclusivity for its diverse linguistic population.


