When DJ Patil, former Chief Data Scientist (CDS) of the U.S. federal government, landed in India last month for roadshows in three major cities, the local data science industry turned out with great enthusiasm. Understandably so: Patil was the first person appointed to the coveted post, in 2015, and the position has been vacant since he left office early this year.
In a bid to democratise its data, the U.S. government made more than 138,000 data sets available to the public. It was a bold move, meant not just to expand the government's capability to process huge datasets but also to use the insights they hold to build innovative data products.
That’s where the U.S. CDS comes into the picture. In 2015, Barack Obama appointed Patil to the newly created role of Deputy Chief Technology Officer for Data Policy at the White House Office of Science and Technology Policy.
“My role as the U.S. CDS will be to responsibly source, process, and leverage data in a timely fashion to enable transparency, provide security, and foster innovation, in order to maximize the return on the investment in data,” Patil had said early last year after assuming office.
Given the magnitude of leading the data science initiative for the U.S. government, one would wonder what exactly the job requirements of this role were and how it could remain vacant for so long.
“Given the importance this Administration has placed on data, along with the momentum that has been created, now is a unique time to establish a legacy of data supporting the public good. That is why, after a long time in the private sector, I am returning to the federal government as the Deputy Chief Technology Officer for Data Policy and Chief Data Scientist,” Patil said.
So Why A New Job Title?
Using statistics for economic and monetary policy and insights is nothing new for the U.S. government. For that matter, most governments today apply some level of data mining to their census and monetary data. The U.S. government already has positions such as chief economist and chief statistician. However, the role of the first-ever U.S. CDS is more varied and much more detailed, simply because the way data is handled has changed over the years, increasing the demand for data scientists.
The vast majority of existing data has been generated in the past few years, and today’s explosive pace of data growth is set to continue. In this setting, data science — the ability to extract knowledge and insights from large and complex data sets — is fundamentally important. If we think of governments as organisations, they generate more data than most large companies do today.
Speaking at the 2016 Recode conference, DJ Patil mentioned that President Obama got particularly excited about data when he saw technology and data play a transformative role in reaching out to the electorate and interacting with it.
Traditionally, government organisations operate in silos (just like any large organisation), with little interplay of data among them. Obama probably recognised this, and so the new role was created.
What The Role Really Comprises:
The Obama administration embraced the use of data to improve the operation of the U.S. government and people's interactions with it. This created the need for a CDS, first of all, to unlock the data: in other words, to make the data from the various departments of the U.S. government accessible.
Second, the CDS had to make that data usable and discoverable so that more people could use it. Easy as that may sound, not all data within organisations is in its most usable form. Some may sit in well-formatted relational databases, while other data may just be sitting on someone’s hard drive as a PDF.
Third, the CDS had to create value out of that data. One such effort was the Obama administration’s Precision Medicine Initiative.
Lastly, the CDS had to work carefully and thoughtfully to ensure that data science policy protects privacy and considers societal, ethical, and moral consequences.
Why Hasn’t The New Government Filled The Position?
As of June 6, President Donald Trump had announced a nominee for just seven, or 15%, of the top 46 science posts in the federal government that require Senate confirmation. The Trump administration's failure to fill top science posts came to light when he said the Paris accord was a “bad deal”.
It is hard to guess whom Trump consults for advice on major issues like climate change, environmental quality or even managing all the data of the federal government. It would be surprising if the Trump administration did not fill the CDS position soon enough to unlock the treasure trove of data the government holds.
One needs to understand that while there is a rich history of companies using data to their competitive advantage, the disproportionate beneficiaries of big data and data science have been internet technologies like social media, search, and e-commerce. While innovative data science applications can transform the future, their absence might stall growth and even leave people behind in a rapidly evolving internet world.