Standardisation and collaboration are vital to transform India’s open data landscape
The conversation offers a deep dive into open data in India, with a special focus on new emerging solutions like NDAP that are unlocking the potential for improved policy making and innovation based on interoperable open datasets.
As part of the decODE Conversations series, hosted by Omidyar Network India and The Quantum Hub, Dr Sam Asher, Co-founder, Development Data Lab (DDL), and Anna Roy, Senior Adviser, NITI Aayog, talk to Varad Pande about the crucial role of open and interoperable public data in solving today’s population-scale challenges.
“We live in an era where data plays a significant role in shaping our economy as well as society. For years now, we have seen the private sector using data to innovate new products and services,” said Varad Pande.
He was initiating the eighth round of decODE Conversations on ‘Unlocking the power of 'Open Data'’ with Dr Sam Asher, Co-founder, Development Data Lab, and Anna Roy, Senior Adviser, NITI Aayog.
decODE Conversations, which is a series of in-depth discussions with experts working with and on Open Digital Ecosystems (ODEs), aims to spotlight innovations in the ODE space, while learning more about the opportunities and challenges that lie ahead.
Talking about India’s public data sector, Pande said admirable efforts have been made over the last decade to bolster the accessibility of open public data through both platforms and policies. “The good news is that over the past few years, there has been a movement, steering our statistical infrastructure towards embracing openness, where anyone can access, use, reuse, and redistribute data, to help governments, the private sector, researchers, media, and people at large tap into its potential to generate new insights, improve products, service delivery, and drive greater accountability,” he said.
Open public data in India: The current scenario
Asher said when he started his PhD in 2007 it was “a nightmare to get actual raw data”. “But now, the government of India has made huge progress on these fronts and made large amounts of data publicly available. The Census Digital Library is a great example of that,” he said.
A lot more progress can be made, he said, especially in terms of improving the granularity, timeliness, accuracy, and dissemination of public data.
A platform like NDAP that can reach every single citizen in India can make public data more accessible. People can easily use the data for various purposes, be it policy research and decision making, entrepreneurship, accountability, or just general information. This step can benefit people across different domains of society.
Solving India’s data-related challenges
Giving a bird’s eye view of India’s open data ecosystem, Roy said, “Firstly, data is not a homogenous thing. Data comes in different forms and shapes, and there is a user journey involved in each data segment.”
Each of these data segments has its challenges and needs targeted approaches.
Roy said India has a very robust statistical infrastructure and it is inspiring to see how we collect data through the census or surveys, given the country’s diversity, size, and scale. “Over time, technology has provided us with additional tools that can be leveraged to ease the drudgery of data collection, storage, and analysis.”
She added that technologies emerging in IR 4.0 (the Fourth Industrial Revolution) are presenting immense opportunities. For instance, technology now offers several options for efficiently managing each phase of the user journey, from collection and processing to hosting, sharing, and analysis, which were not available earlier.
Standardisation helps make better sense of data
Roy believes that standardisation is vital, adding that we “cannot have a mother of all platforms because of differences in datasets”.
“But I think standardisation is the key, which makes data more intelligent. Standardisation can help us make more sense out of data, keeping last-mile access issues in view,” she said.
Talking about India’s efforts in terms of standardisation, Roy said, “For instance, we at NITI Aayog ideated a unified platform for the logistics sector of India that can capture, store, and provide data from ports, highways, and individual cargos among other stakeholders…we incubated the idea of Unified Logistics Interface Platform (ULIP), a replica of India Stack for the logistics sector.”
She said the team behind this platform was at present working on standardisation and building a layer over it for data exchange. The outcome of these efforts is “a published data set that gets aggregated at different levels, adhering to all set protocols”.
Adding to the point of collaboration and standardisation, Asher said, “In some ways, India should aspire to be at the level of open data we see in the US and the UK, but in one particular domain India’s progress is unprecedented: stitching all of the data together [through NDAP]. And we, at Development Data Lab, are really proud to be a part of this process.”
As part of the process to stitch data points together, the team ended up making all datasets “speak to each other - at the village, town, sub-district, and district level”. “[This] doesn't exist anywhere else in the world. No one else has gone through that final-mile effort to make everything speak to each other,” Asher said.
Asher then addressed how NDAP enables the discovery of data, and how interoperability makes it easier for researchers and for public servants at various stages to explore new data points and make data-backed decisions. He said many decision-makers want to do developmental work in rural India but don’t know where to find the data they need to make an impactful decision.
“For instance, if a well-intentioned bureaucrat wants to build a primary health centre in one of the villages in India, they don’t know where it is needed the most. They need data points to answer questions like which villages have existing health centres, where is the private sector filling in for the public sector, where there are proper roads, whether the village has access to electricity or not. This is where NDAP can help with multiple data points,” added Asher.
NDAP: India’s landmark solution for public data challenges
As a part of India’s goals under the G20 Presidency, the country has highlighted its clearer focus on digital public goods and digital public infrastructure, indicating the importance of using public data for both societal and economic development.
While talking about India’s step towards becoming the voice of the Global South and its technological solutions to solve population-scale challenges, Roy said, “One cell or department cannot achieve this. That's why we have a multi-layered, institutional structure in place. So, we had a multifaceted group of technologists, and data researchers, who signed off on different developments on the National Data and Analytics Platform (NDAP), which facilitates and improves access to Indian government data.”
She said the important thing about NDAP and digital public goods is “interoperability, collaboration, and user centricity – and that is the message which we take back to the rest of the world”.
“At NDAP, we have an NLP-based search engine. We want it to work as ‘Google’ does it for general search. On NDAP, users can conduct keyword searches. It allows discovery of data in a user-friendly way,” Roy said. She added that decision makers across all leadership levels can use the platform to gather the data-backed insights they need for crucial decisions. Collaboration is key, as are breaking down silos and focusing on user-centricity.
“Last but not least, is procurement. NDAP is based on an outcome-oriented procurement mechanism, built on things that were already there,” Roy said.
With NDAP, previously collected datasets have been made compliant with a common data schema, so that datasets from different departments can be merged for analysis faster and more efficiently. Roy said NDAP is demand-driven: it is modifying and adding features based on user surveys and feedback, to make the platform more user-friendly as it evolves.
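To illustrate why a common schema matters, here is a minimal Python sketch of merging two departmental datasets on a shared identifier. It is not NDAP's actual schema or API: the `district_code` column, the sample records, and the `merge_on_key` helper are all hypothetical, chosen only to show that once datasets share a key, combining them becomes a simple lookup.

```python
# Hypothetical records from two departments. Both are assumed to already
# follow a common schema with a shared "district_code" column.
health = [
    {"district_code": "D001", "health_centres": 12},
    {"district_code": "D002", "health_centres": 5},
]
roads = [
    {"district_code": "D001", "paved_road_km": 340.0},
    {"district_code": "D003", "paved_road_km": 120.5},
]


def merge_on_key(left, right, key):
    """Inner join of two lists of dicts on a shared key column."""
    # Index the right-hand dataset by key for constant-time lookups.
    right_by_key = {row[key]: row for row in right}
    merged = []
    for row in left:
        match = right_by_key.get(row[key])
        if match is not None:
            # Combine the columns of both records into one row.
            merged.append({**row, **match})
    return merged


merged = merge_on_key(health, roads, "district_code")
```

Only districts present in both datasets survive the join. Without a shared identifier, each pair of datasets would first need its own manual matching of district names and spellings, which is the effort a common schema removes.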
Pande highlighted that human capacity continues to be essential, especially at the last mile, to ensure that better-quality data is collected. He added that building this capacity would be the important next step to further improve India’s open data landscape.
Note: Varad Pande was Partner, Omidyar Network India, at the time of hosting this decODE conversation. He is now Partner & Director, Boston Consulting Group.