To enable easy access to this data, Google this week launched Dataset Search, a tool for scientists, data journalists, data geeks, or anyone else who wants to find the data required for their work and their stories. Natasha Noy, Research Scientist at Google AI, wrote on the company’s official blog that Dataset Search works similarly to Google Scholar.
“Dataset Search lets you find datasets wherever they’re hosted, whether it’s a publisher’s site, a digital library, or an author’s personal web page. To create Dataset search, we developed guidelines for dataset providers to describe their data in a way that Google (and other search engines) can better understand the content of their pages,” Noy wrote.
Google’s guidelines ask providers to include salient information about each dataset, such as:
- Who created the dataset
- When it was published
- How the data was collected
- What the terms are for using the data
Google then collects and links this information, analyses where different versions of the same dataset might be, and finds publications that may be describing or discussing the dataset.
“Our approach is based on an open standard for describing this information and anybody who publishes data can describe their dataset this way. We encourage dataset providers, large and small, to adopt this common standard so that all datasets are part of this robust ecosystem,” added Noy.
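The open standard in question is the schema.org vocabulary, typically embedded in a dataset's web page as JSON-LD. As a rough illustration of how a provider might describe the four items listed above, here is a minimal Python sketch that builds such a description. The helper name `build_dataset_markup` and all example values (dataset name, organisation, dates) are hypothetical and chosen purely for illustration; the property names (`creator`, `datePublished`, `measurementTechnique`, `license`) come from schema.org's `Dataset` type, but this is not a complete or authoritative rendering of Google's guidelines.

```python
import json


def build_dataset_markup(name, description, creator, date_published,
                         collection_method, license_url):
    """Return a schema.org Dataset description as a JSON-LD string.

    Hypothetical helper for illustration; it is not part of any Google tool.
    """
    markup = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        # Who created the dataset
        "creator": {"@type": "Organization", "name": creator},
        # When it was published (ISO 8601 date)
        "datePublished": date_published,
        # How the data was collected
        "measurementTechnique": collection_method,
        # What the terms are for using the data
        "license": license_url,
    }
    return json.dumps(markup, indent=2)


# Example values below are invented placeholders, not a real dataset.
example = build_dataset_markup(
    name="Example daily temperature readings",
    description="Daily surface temperature readings from example stations.",
    creator="Example Weather Institute",
    date_published="2018-09-05",
    collection_method="Automated ground station sensors",
    license_url="https://creativecommons.org/licenses/by/4.0/",
)
print(example)
```

A provider would place the resulting JSON inside a `<script type="application/ld+json">` element on the page that hosts the dataset, so that search engines crawling the page can read the description.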
Ed Kearns, chief data officer at the US National Oceanic and Atmospheric Administration (NOAA), is a strong supporter of the Google Dataset Search project and helped NOAA make many of its datasets searchable in the tool. “This type of search has long been the dream for many researchers in the open data and science communities… And for NOAA, whose mission includes the sharing of our data with others, this tool is key to making our data more accessible to an even wider community of users,” he said.