Dataset Search - Google's new data engine
Laxmi Sharma
Google launches a new search engine called Dataset Search to help researchers access freely available online data.
To facilitate access to data, Google has launched a new search engine named Dataset Search. It is meant to help users navigate the massive amounts of data that already exist on the web. The main problem with data at this scale is the way it is made available: it is unorganized, and it is not always served in a format that search engines can easily analyse.
Metatags to Facilitate Research
The main idea is to make all of this information easily accessible to scientists, journalists, and other data-hungry users. Whether the goal is to satisfy one's intellectual curiosity or to use the data for work, Dataset Search is intended to become the reference tool in this field.
Google also provides guidance to dataset providers on how to describe their data systematically so that Dataset Search can identify it easily. Providers are advised to include metadata tags in the web pages that describe the data, stating who created it, when it was published, and how it was collected. This information is then indexed by Google's search engine and combined with information from the Knowledge Graph.
Use of the schema.org Standard
Google’s approach relies on the open standard schema.org to describe this information. Google encourages providers to use this system to describe their datasets, as it says in a blog post:
“Anyone who publishes data can describe their dataset in this way. We encourage data set providers, large and small, to adopt this common standard so that all datasets are part of this robust ecosystem.”
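To illustrate what such a description might look like, here is a minimal sketch in Python that builds a schema.org `Dataset` record as JSON-LD and wraps it in a script tag for embedding in a web page. The property names (`name`, `description`, `creator`, `datePublished`, `measurementTechnique`) come from the schema.org vocabulary; the helper functions and all of the sample values are hypothetical and chosen only for illustration, and Google's own guidelines may ask for more or different fields.

```python
import json


def build_dataset_jsonld(name, description, creator, date_published, measurement_technique):
    """Return a schema.org Dataset description serialized as a JSON-LD string.

    Covers the three points mentioned in the article: who created the data,
    when it was published, and how it was collected.
    """
    record = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        # Who created the data (modelled here as an organization; an assumption).
        "creator": {"@type": "Organization", "name": creator},
        # When the data was published (ISO 8601 date).
        "datePublished": date_published,
        # How the data was collected.
        "measurementTechnique": measurement_technique,
    }
    return json.dumps(record, indent=2)


def wrap_in_script_tag(jsonld):
    """Wrap a JSON-LD string in the script tag used to embed it in an HTML page."""
    return '<script type="application/ld+json">\n' + jsonld + "\n</script>"


if __name__ == "__main__":
    # All values below are made up for the example.
    jsonld = build_dataset_jsonld(
        name="Example coastal temperature readings",
        description="Hourly sea-surface temperatures from a hypothetical buoy network.",
        creator="Example Oceanography Lab",
        date_published="2018-01-15",
        measurement_technique="Moored buoy thermistor readings",
    )
    print(wrap_in_script_tag(jsonld))
```

The printed snippet would be placed in the HTML of the page that describes the dataset, so that a crawler can read the description alongside the page content.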
Google is initially launching Dataset Search with content from NASA, the National Oceanic and Atmospheric Administration (NOAA), Harvard Dataverse, and the Inter-University Consortium for Political and Social Research (ICPSR), among other academic collections.
Dataset Search becomes Google’s latest domain-specific search engine, joining Google Scholar, Google Books and Google Patents, among others.