The Elephant in the Data Room: Why India’s Fragmented Information System Is Costing Billions

As another session of Parliament ended, a familiar pattern was visible on the floor of the House. Members of Parliament who rose to ask questions were performing one of Parliament’s most important accountability functions. Yet a large share of these questions followed a predictable format: how many schools have functional toilets, how many pensions were disbursed in a given year, or how many beneficiaries a particular scheme reached. While these questions address important public concerns, the information they seek should ideally already exist in the public domain in a clear, standardised, and easily accessible format.

An analysis of the parliamentary questions asked during the 17th Lok Sabha (2019-24) on youth employment found that a large share sought such basic facts. This reflects a far deeper reality: India’s data system is fragmented and lacks interoperability. The elephant in the room, rarely acknowledged in such debates, is data standardisation, without which even the most ambitious policy visions risk being built on shifting sands. This article examines the anatomy of India’s data fragmentation problem, its fiscal and economic costs, its impact on policy-making and global rankings, and the reforms needed to make India’s data fit for purpose.


Part I: The Anatomy of the Problem – Incoherence and Inconsistency

NITI Aayog’s vision document for the National Data and Analytics Platform observed that India’s data ecosystem remains incoherent: ministries and government departments do not use shared standards for common indicators and even define basic attributes such as time period and region inconsistently. India today generates more data than ever before, yet abundance does not equate to usability. Data collected by individual ministries for their own programmes often cannot be integrated seamlessly, making consolidation a laborious and error-prone task.

The problem is not that data does not exist. The problem is that data exists in silos, in different formats, using different definitions, covering different time periods, and lacking the common identifiers that would allow records to be linked. A beneficiary’s name in one database may be spelled differently in another. A district’s boundary may be defined differently by different ministries. A time period labelled “2024-25” may mean April-March for one ministry and January-December for another.
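The time-period ambiguity can be made concrete with a short sketch. It assumes, purely for illustration, that a label such as “2024-25” denotes a 12-month period beginning in the first year shown, and that each ministry’s convention is captured by a start month. The function names and the reading of the January-December case as calendar 2024 are assumptions, not an official convention.

```python
from datetime import date, timedelta


def period_bounds(label: str, start_month: int) -> tuple[date, date]:
    """Return the first and last day of the 12-month period behind a label like '2024-25'.

    start_month is the (assumed) reporting convention:
    4 for April-March, 1 for January-December.
    """
    start_year = int(label.split("-")[0])
    start = date(start_year, start_month, 1)
    # The period ends the day before the same month begins one year later.
    end = date(start_year + 1, start_month, 1) - timedelta(days=1)
    return start, end


def overlap_months(a: tuple[date, date], b: tuple[date, date]) -> int:
    """Count the whole months that two month-aligned periods have in common."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    if start > end:
        return 0
    return (end.year - start.year) * 12 + (end.month - start.month) + 1


fiscal = period_bounds("2024-25", start_month=4)    # April 2024 to March 2025
calendar = period_bounds("2024-25", start_month=1)  # January 2024 to December 2024
shared = overlap_months(fiscal, calendar)
print(fiscal, calendar, shared)
```

Under these assumptions the two datasets carry the same label but share only nine of their twelve months, which is why publishing explicit start and end dates is safer than publishing a label alone.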

This fragmentation is not a technical nuisance; it is a governance failure with real consequences.


Part II: The Fiscal Cost – Billions Lost to Duplication and Leakage

According to a NITI Aayog report released in June 2025, welfare programme databases often list the same beneficiary multiple times, leading to fiscal leakages that inflate spending by 4 per cent to 7 per cent annually. This is not a small inefficiency. It is a drain on the exchequer that could otherwise fund schools, hospitals, roads, or tax cuts.

Recent government data clean-ups highlight the potential savings from addressing such inefficiencies:

  • Pradhan Mantri Kisan Samman Nidhi (PM-KISAN) scheme: Deleting 17.1 million ineligible names from the scheme was expected to save ₹90 billion in FY2024. This is money that was being sent to people who should not have been on the rolls.

  • LPG connections: Removing 35 million bogus LPG connections could save ₹210 billion over two years. These connections were either duplicate, inactive, or held by ineligible households.

  • Ration cards: Eliminating 16 million fake ration cards may save around ₹100 billion annually. These cards were being used to divert subsidised food grains to the black market or to ineligible beneficiaries.

The cumulative potential savings run into hundreds of billions of rupees annually. This is not a rounding error. It is a significant share of the government’s welfare budget. The money being wasted on duplicate and ineligible beneficiaries could, if properly directed, fund nutrition programmes for millions of children, infrastructure upgrades in rural schools, or health insurance for uninsured families.
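As a rough check on the “hundreds of billions” figure, the three clean-ups can be put on a common annual basis. The sketch below uses only the numbers quoted above. It assumes the LPG saving accrues evenly over its two years and that the three figures can simply be added, neither of which the source states.

```python
# Figures in billions of rupees, as quoted in the article.
pm_kisan_fy2024 = 90        # expected saving in one financial year
lpg_two_years = 210         # potential saving spread over two years
ration_cards_annual = 100   # potential saving per year

# Assumption: the LPG saving accrues evenly across the two years.
lpg_annual = lpg_two_years / 2

annualised_total = pm_kisan_fy2024 + lpg_annual + ration_cards_annual
print(f"Indicative annualised savings: Rs {annualised_total:.0f} billion")
```

On these assumptions the three exercises alone come to roughly ₹295 billion a year, which is consistent with the order of magnitude claimed.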


Part III: The Policy Cost – Conflicting Estimates and Paralysed Decision-Making

Beyond the direct fiscal cost, data fragmentation has a subtler but equally damaging effect on policy-making. Consider the health sector. Studies show that childhood tuberculosis cases are recorded separately in the Health Management Information System (HMIS), the disease surveillance network, and immunisation registries. The same patient is often counted multiple times, creating conflicting estimates of disease burden.
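A toy example shows how the over-count arises and how a common identifier removes it. The records and the shared patient identifier below are hypothetical. In practice the three systems lack such a shared key, which is precisely the gap that standardisation would close.

```python
# Hypothetical case lists: each system records patients under a shared ID.
hmis = ["P001", "P002", "P003"]
surveillance = ["P002", "P003", "P004"]
immunisation = ["P003", "P005"]

# Adding up each system's count treats every record as a separate child.
naive_total = len(hmis) + len(surveillance) + len(immunisation)

# A set union keeps each patient once, however many systems recorded them.
unique_patients = set(hmis) | set(surveillance) | set(immunisation)

print(naive_total, len(unique_patients))
```

Here the naive sum reports eight cases while only five distinct children exist, so any estimate of disease burden depends on which of the two numbers a decision-maker happens to see.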

When decision-makers receive three different numbers for the same indicator, they face a choice. They can try to reconcile the estimates, a laborious and uncertain process. They can choose one source and ignore the others, risking bias. Or they can disregard data altogether and rely on anecdote, intuition, or political expediency. Too often, the last option prevails.

Data that cannot be trusted is data that cannot be used. And data that cannot be used leads to policies that are not evidence-based. India’s ambitious policy agenda—from health reform to education to infrastructure—requires reliable, timely, and consistent data to track progress, identify gaps, and allocate resources. Fragmented data undermines all of these functions.


Part IV: The Perception Cost – Missing Data in Global Indices

Beyond domestic policy, India’s data fragmentation also carries perception and economic costs on the global stage. In the Global Innovation Index 2024, India had missing data for two indicators and outdated data for eight, with several relying on figures more than a year old. India’s performance on these indices matters for foreign investment, benchmarking, and national pride. When data is missing or outdated, India is penalised regardless of its actual performance.

Without coordinated methodologies, such indices both mask real performance and expose gaps in inter-agency coordination. A country that cannot produce timely, standardised data for its own use cannot expect to be assessed fairly by global indices.

In economic terms, the Organisation for Economic Co-operation and Development (OECD) estimates that improving public-sector data availability and sharing could add up to 1.5 per cent of GDP, rising to 2.5 per cent if private-sector data is included. In other words, the cost of poor data governance lies not only in misinformed decisions and fiscal leakages but also in squandered economic potential. Better data means better policy, better policy means faster growth, and faster growth means higher living standards.


Part V: The Solution – The India Data Management Office (IDMO)

A solution to these inefficiencies lies in the National Data Governance Framework Policy (NDGFP), under which the proposed India Data Management Office (IDMO) has the potential to be the keystone of reform. The IDMO would develop and enforce common rules, standards, guidelines, and protocols for data across all ministries and states.

However, for the IDMO to succeed, it needs to be empowered with real authority—to set binding standards, audit compliance, and resolve disputes over definitions and methodologies across ministries. Without enforcement power, the IDMO will be another coordination body with good intentions and no teeth. Ministries will continue to use their own definitions, their own formats, and their own timelines. The fragmentation will persist.

In addition, aligning with global statistical frameworks, such as the UN’s System of National Accounts (SNA) for economic indicators, and harmonising them within a National Statistical Standards Manual could unify definitions and practices nationwide. This manual would be the authoritative source for how indicators are defined, how time periods are specified, how geographic boundaries are set, and how data should be formatted for interoperability.

Most of all, India’s open data platform, data.gov.in, should be scaled up into a centralised, schema-consistent repository that serves both public availability of information and internal government needs. Ministries must upload datasets in standardised formats regularly, enabling parliamentarians to access real-time, district-level figures. A parliamentarian should not need to file a question to know how many schools in their constituency have functional toilets. That information should be available on a dashboard, updated in real time, and verified by independent audits.
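What “schema-consistent” could mean in practice is sketched below: every upload is checked against one agreed set of fields and types before it enters the repository. The schema, field names, and function are hypothetical illustrations and do not describe how data.gov.in works today.

```python
from datetime import date

# Hypothetical common schema: field name -> required Python type.
SCHOOL_FACILITIES_SCHEMA = {
    "district_code": str,
    "period_start": date,
    "period_end": date,
    "schools_total": int,
    "schools_with_functional_toilets": int,
}


def schema_errors(row: dict, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the row conforms."""
    errors = []
    for field, expected in schema.items():
        if field not in row:
            errors.append(f"missing field: {field}")
        elif not isinstance(row[field], expected):
            got = type(row[field]).__name__
            errors.append(f"{field}: expected {expected.__name__}, got {got}")
    for field in row:
        if field not in schema:
            errors.append(f"unexpected field: {field}")
    return errors


conforming = {
    "district_code": "D-101",  # made-up code for illustration
    "period_start": date(2024, 4, 1),
    "period_end": date(2025, 3, 31),
    "schools_total": 120,
    "schools_with_functional_toilets": 97,
}
# A typical non-conforming upload: a count stored as text, a missing end
# date, and an ad hoc extra column.
non_conforming = {
    "district_code": "D-101",
    "period_start": date(2024, 4, 1),
    "schools_total": "120",
    "schools_with_functional_toilets": 97,
    "remarks": "provisional",
}

print(schema_errors(conforming, SCHOOL_FACILITIES_SCHEMA))
print(schema_errors(non_conforming, SCHOOL_FACILITIES_SCHEMA))
```

Rejecting or flagging non-conforming rows at upload time is one way to keep the repository comparable across ministries without asking each one to change how it collects data internally.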


Part VI: Institutionalising Accountability – The Data Governance Quality Index

Finally, institutionalising accountability will be key to sustaining progress. NITI Aayog’s Data Governance Quality Index should be an annual benchmark, tied to performance reviews and incentives for ministries and states. Healthy competition on data quality can drive change as powerfully as economic competition.

Ministries that score well on data quality should be recognised and rewarded. Ministries that score poorly should face consequences—in budget allocations, in performance ratings, in public scrutiny. The index should be publicly available, allowing civil society, researchers, and journalists to hold data producers accountable.

Data standardisation is often minimised as a technical exercise, a matter for IT professionals and statisticians. But it is in fact the grammar of governance. Without a common grammar, sentences cannot be understood. Without standardised data, policy cannot be evidence-based.


Conclusion: Fit for Purpose, Fit for the Future

India has made remarkable progress in data collection and digital infrastructure. The India Stack, Aadhaar, the Unified Payments Interface (UPI), and other digital public goods are world-leading. But data collection is not the same as data usability. Abundance without standardisation is chaos.

A nation aspiring to become a $5 trillion economy—or larger—needs data that is fit for purpose. It needs data that can be integrated across ministries, linked across schemes, and analysed in real time. It needs data that can be trusted by policy-makers, parliamentarians, and the public.

Addressing the elephant in the data room means committing to the standards, systems, and stewardship that will make India’s data fit for purpose and fit for the future. The savings are in the billions. The economic gains are in the percentage points of GDP. And the policy gains—better decisions, better outcomes, better lives—are priceless.

5 Questions & Answers Based on the Article

Q1. What is the “elephant in the room” that the article identifies as the root cause of India’s data governance problems?

A1. The “elephant in the room” is data standardisation—the lack of common rules, definitions, formats, and protocols for data across different ministries and government departments. India generates more data than ever before, but this data is fragmented, lacks interoperability, and cannot be integrated seamlessly. Different ministries define basic attributes such as time periods and geographic boundaries inconsistently, making consolidation a laborious and error-prone task. The article argues that without data standardisation, even the most ambitious policy visions risk being built on “shifting sands.”

Q2. How much does data fragmentation and duplication cost the government annually, and what specific examples are cited?

A2. According to a NITI Aayog report from June 2025, welfare programme databases that list the same beneficiary multiple times lead to fiscal leakages that inflate spending by 4 to 7 per cent annually. Specific examples include: deleting 17.1 million ineligible names from the PM-KISAN scheme, which was expected to save ₹90 billion in FY2024; removing 35 million bogus LPG connections, which could save ₹210 billion over two years; and eliminating 16 million fake ration cards, which may save around ₹100 billion annually. The cumulative potential savings run into hundreds of billions of rupees annually.

Q3. How does data fragmentation affect policy-making in the health sector according to the article?

A3. The article cites the example of childhood tuberculosis cases, which are recorded separately in the Health Management Information System (HMIS), the disease surveillance network, and immunisation registries. The same patient is often counted multiple times, creating conflicting estimates of disease burden. When decision-makers receive different numbers for the same indicator, they face a choice: reconcile the estimates (laborious and uncertain), choose one source arbitrarily (risking bias), or disregard data altogether and rely on anecdote or political expediency. Too often, the last option prevails. Data that cannot be trusted is data that cannot be used, leading to policies that are not evidence-based.

Q4. What is the proposed India Data Management Office (IDMO), and what conditions are necessary for its success?

A4. The IDMO is proposed under the National Data Governance Framework Policy (NDGFP). It has the potential to be the keystone of reform by developing and enforcing common rules, standards, guidelines, and protocols for data across all ministries and states. However, for it to succeed, the article argues that it needs to be empowered with real authority: to set binding standards, audit compliance, and resolve disputes over definitions and methodologies across ministries. Without enforcement power, the IDMO would be another coordination body with good intentions and no teeth, and fragmentation would persist.

Q5. What role does NITI Aayog’s Data Governance Quality Index play in the proposed reform framework?

A5. The Data Governance Quality Index should be an annual benchmark, tied to performance reviews and incentives for ministries and states. Healthy competition on data quality can drive change as powerfully as economic competition. Ministries that score well should be recognised and rewarded, while those that score poorly should face consequences (in budget allocations, performance ratings, and public scrutiny). The index should be publicly available, allowing civil society, researchers, and journalists to hold data producers accountable. The article emphasises that institutionalising accountability is key to sustaining progress.
