How Are Indian Firms Training LLMs? The Challenges, Subsidies, and Breakthroughs

At the AI Impact Summit, the Bengaluru-based startup Sarvam AI released two Large Language Models (LLMs), the kind of foundation models that power services like Google’s Gemini and OpenAI’s ChatGPT. The two models have 35 billion and 105 billion parameters respectively, and are less powerful and less compute-intensive than comparable models, while demonstrating improvements over other models in Indian languages, according to Pratyush Kumar, a Sarvam co-founder.

This announcement represents a significant milestone in India’s quest to develop indigenous AI capabilities. But it also raises important questions: Why is training an LLM on Indian soil with Indian capital a challenge? How has the IndiaAI Mission subsidised these efforts? And what technical innovations make models like Sarvam’s possible?

The Technical Challenge of Training LLMs

LLMs are trained and operated on clusters of Graphics Processing Units. The combined cost of the GPUs and the electricity needed to run them long enough to train a model runs into millions of dollars. This is not a one-time expense; training requires running these clusters continuously for weeks or months.
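The “millions of dollars” claim can be made concrete with a back-of-envelope sketch. Every input below (cluster size, training duration, hourly rental rate) is an illustrative assumption, not a figure from this article:

```python
# Back-of-envelope training-cost estimate: GPUs x hours x rental rate.
# All numbers are illustrative assumptions chosen for the sketch.
def training_cost_usd(num_gpus: int, weeks: int, usd_per_gpu_hour: float) -> float:
    """Cost of renting a GPU cluster continuously for the given duration."""
    hours = weeks * 7 * 24
    return num_gpus * hours * usd_per_gpu_hour

# A hypothetical 1,024-GPU cluster run for 8 weeks at $2 per GPU-hour:
cost = training_cost_usd(num_gpus=1024, weeks=8, usd_per_gpu_hour=2.0)
print(f"${cost:,.0f}")  # $2,752,512
```

Even under these modest assumptions, compute alone runs to roughly $2.75 million, before accounting for electricity overheads, staffing, data preparation, and failed training runs.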

The first step is data collection, where English, European languages and East Asian languages like Korean and Japanese are more richly represented than Indian languages. This creates a twofold challenge for training an LLM on Indian soil with Indian capital.

For one thing, with scarce data sources, many LLMs either perform worse when operating on Indian languages, or burn more “tokens” on inference by translating user input into English (and translating responses back) to perform better. Because machine translation for Indian languages has improved dramatically, this workaround remains the standard approach for many LLMs.
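One reason Indian-language text can “burn more tokens”: when a tokenizer’s learned vocabulary covers a script poorly, it falls back towards byte-level pieces, and Devanagari characters occupy three UTF-8 bytes each versus one for ASCII. Counting raw bytes gives a rough worst-case proxy (real tokenizers with good Indic coverage do much better than this):

```python
# Byte counts as a worst-case proxy for token counts under a byte-level
# fallback tokenizer. Devanagari code points encode to 3 bytes in UTF-8.
english = "hello"
hindi = "नमस्ते"  # "namaste", 6 Unicode code points in Devanagari

print(len(english.encode("utf-8")))  # 5 bytes for 5 characters
print(len(hindi.encode("utf-8")))    # 18 bytes for 6 code points
```

A comparably short greeting can thus consume several times as many byte-level pieces, which inflates both inference cost and latency for under-represented scripts.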

Secondly, since capital is also scarce, efforts to train an LLM by Indian firms targeting Indian users can be challenging, especially if there is no immediate business use case for doing so.

The Translation Workaround

Relying on translation as a fulcrum can be a problem for developers who want to leverage local LLMs, like Sarvam’s 35 billion parameter model, which was shown working on a feature phone in a demo during the summit’s research symposium. In such settings, suboptimal performance in Indian languages can hurt adoption and the quality of responses.

The translation approach has its limits. When a user speaks in Tamil, the system must translate to English, process, and translate back. Each step introduces potential errors and latency. For a feature phone with limited processing power, this can be particularly challenging.
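The round trip described above can be sketched as a pipeline. The functions here are hypothetical stand-ins for a machine-translation service and an English-centric LLM, not any real API; the point is simply that every extra hop adds latency and a chance of error:

```python
# Sketch of the translate -> process -> translate-back workaround,
# using stand-in functions (no real MT or LLM API is called).
def translate(text: str, src: str, dst: str) -> str:
    """Stand-in for a machine-translation call."""
    return f"[{src}->{dst}] {text}"

def llm_process(prompt: str) -> str:
    """Stand-in for an English-only LLM inference call."""
    return f"answer({prompt})"

def answer_via_translation(query_ta: str) -> tuple[str, int]:
    """Tamil query answered via the English round trip; returns (reply, hops)."""
    english = translate(query_ta, "ta", "en")   # hop 1: Tamil -> English
    reply_en = llm_process(english)             # hop 2: model inference
    reply_ta = translate(reply_en, "en", "ta")  # hop 3: English -> Tamil
    return reply_ta, 3

def answer_natively(query_ta: str) -> tuple[str, int]:
    """A natively multilingual model answers in a single hop."""
    return llm_process(query_ta), 1

reply, hops = answer_via_translation("வணக்கம்")
print(hops)  # 3 hops, versus 1 for a native model
```

Each hop is a place where latency accumulates and translation errors can compound, which is why a model that handles Indian languages natively is attractive for constrained devices.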

Government Support: The IndiaAI Mission

The IndiaAI Mission has subsidised efforts to conduct training in India, by commissioning over 36,000 GPUs in data centers operated by Indian firms like Yotta, and allowing researchers and startups to run training and inference workloads at a nominal fee.

The government gave Sarvam access to 4,096 GPUs from its common compute cluster, and the subsidy so far is estimated at almost ₹100 crore. The “bill of materials” for this cluster is ₹246 crore, though these GPUs can likely continue to be used by others afterwards.
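The figures above imply a simple share calculation. The ₹ amounts come from this article; the interpretation that the ₹246 crore bill of materials covers the 4,096-GPU allocation is an assumption made for the sketch:

```python
# Subsidy share and implied per-GPU hardware cost, using the article's
# figures (1 crore = 10^7 rupees). Treating the Rs 246 crore bill of
# materials as covering the 4,096-GPU allocation is an assumption.
subsidy_crore = 100
bill_of_materials_crore = 246
gpus = 4096

subsidy_share = subsidy_crore / bill_of_materials_crore
rupees_per_gpu = bill_of_materials_crore * 1e7 / gpus

print(round(subsidy_share * 100, 1))  # ~40.7% of the hardware cost
print(round(rupees_per_gpu))          # ~Rs 6 lakh per GPU
```

On this reading, the estimated subsidy covers roughly two-fifths of the cluster’s hardware cost, with the hardware itself remaining available to other users.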

This is not a grant; it is a strategic investment in building domestic AI capability. The government is effectively underwriting the compute costs that would otherwise be prohibitive for Indian startups.

The Strategic Rationale

The Ministry of Electronics and Information Technology has encouraged domestic LLM development for many reasons. The main one is the belief that foreign LLM developers are unlikely to find the business case, and therefore build the capability, to make their models work well with Indian languages.

This is a matter of both functionality and sovereignty. If Indian users must rely on foreign models for AI services, they are subject to foreign priorities, foreign content policies, and foreign business models. Domestic LLMs can be tailored to Indian needs and governed by Indian law.

Additionally, encouraging talent that can train LLMs has been seen as important to foster the Indian AI ecosystem. Training an LLM requires a rare combination of skills—deep learning expertise, systems engineering, data curation—that can only be developed through hands-on experience.

Sarvam’s Achievement

Against this backdrop, Sarvam’s announcement of its two models is a significant development in India’s quest for a powerful and relatively inexpensive LLM. When China’s DeepSeek developed its R1 model, it demonstrated that efficient architectures could compete with much larger models. Sarvam is following a similar path.

The 35 billion and 105 billion parameter models are not the largest in the world—some models exceed a trillion parameters—but they are optimised for efficiency and for Indian languages. This is not about competing with OpenAI on general benchmarks; it is about serving Indian users effectively.

The Broader Ecosystem

Sarvam is not alone. Other Indian startups and research groups are also working on LLMs. The IndiaAI Mission’s GPU cluster is a shared resource that can support multiple efforts. Over time, this could create a virtuous cycle: more models, more talent, more applications, more users.

The challenge will be sustainability. Training a model is expensive, but operating it at scale is also expensive. Inference costs—the cost of running the model to serve users—can add up quickly. Sarvam and others will need to find business models that make this work.
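To see why inference costs “add up quickly,” consider a sketch of monthly serving costs as usage scales. Every number here (user count, queries, tokens, price) is an illustrative assumption, not a figure from this article:

```python
# Sketch of how inference costs scale with users. All inputs are
# illustrative assumptions chosen for the example.
def monthly_inference_cost(users: int, queries_per_user: int,
                           tokens_per_query: int,
                           usd_per_million_tokens: float) -> float:
    """Monthly serving cost given per-user usage and a per-token price."""
    total_tokens = users * queries_per_user * tokens_per_query
    return total_tokens / 1_000_000 * usd_per_million_tokens

# Hypothetical: 1M users, 30 queries each per month, 500 tokens per query,
# at $0.50 per million tokens served.
cost = monthly_inference_cost(1_000_000, 30, 500, 0.5)
print(f"${cost:,.0f} per month")  # $7,500 per month
```

The figure itself is modest under these assumptions, but it grows linearly with users, queries, and tokens per query; and, as the article notes, translation workarounds inflate the token count per query, multiplying this bill.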

Conclusion: A Promising Start

India’s journey to develop indigenous LLMs is still in its early stages. Sarvam’s models are a promising start, but they are not the finish line. The real test will come when these models are deployed at scale, serving millions of users across India’s diverse linguistic landscape.

The IndiaAI Mission’s support is crucial, but it is not a substitute for a viable commercial ecosystem. Ultimately, Indian LLMs must prove their value in the market. If they can, India will have taken a significant step toward AI sovereignty.

Q&A: Unpacking Indian LLM Development

Q1: What are the main challenges in training LLMs in India?

Twofold challenge: data scarcity and capital scarcity. Indian languages are less richly represented in training data than English, European, and East Asian languages. This forces reliance on translation workarounds that can impact performance. Additionally, training requires millions of dollars in GPU compute and electricity, which is prohibitive without subsidies or clear business cases.

Q2: How has the IndiaAI Mission supported domestic LLM development?

The Mission has commissioned over 36,000 GPUs in data centers operated by Indian firms, allowing researchers and startups to run training workloads at nominal fees. Sarvam received access to 4,096 GPUs from this cluster, with an estimated subsidy of nearly ₹100 crore out of the ₹246 crore total cluster cost. The government’s strategic rationale is that foreign models won’t adequately serve Indian languages.

Q3: What technical approach do Indian models like Sarvam’s take?

Sarvam’s models have 35 billion and 105 billion parameters—smaller than frontier models but optimised for efficiency and Indian languages. Rather than competing on raw scale, they focus on cost-effectiveness and performance in Indian language tasks. The models are designed to run on less powerful hardware, including a demo on a feature phone.

Q4: Why can’t India simply rely on foreign LLMs like ChatGPT?

Foreign models are not optimised for Indian languages, often relying on translation workarounds that introduce errors and latency. There are also sovereignty concerns: reliance on foreign models means subjection to foreign priorities, content policies, and business models. Domestic LLMs can be tailored to Indian needs and governed by Indian law.

Q5: What are the next steps for India’s LLM ecosystem?

The immediate challenge is sustainability—training is expensive, but operating at scale is also expensive. Sarvam and others must find viable business models. Over time, the GPU cluster can support multiple efforts, creating a virtuous cycle of more models, more talent, more applications, and more users. The real test will come with deployment at scale across India’s diverse linguistic landscape.
