When I was in school, the word “scientist” reminded me of people doing hi-tech science projects or people associated with space technology. There was a time when you would rarely find scientists in the job marketplace actively looking for jobs. Well, it was not because there were no lucrative jobs; it’s just that they believed in doing something out of the box in whatever they did. Scientists, in general, perform investigative activities that are revolutionary in terms of technology, innovative ideation, research and development, and so on. The dictionary definition of a scientist is someone who is trained in a science and whose job involves doing scientific research or solving scientific problems.
Believe me, “Everything is science”, be it technology, philosophy, politics, medicine, art, the abstract, and even Data! So why has it become so important to talk about Data Science and, above all, about the Data Scientist?
Well, having worked in the Data Science field for more than 3 years now (and not claiming that I am an expert!), I have some observations and learnings to share that I feel will help aspiring Data Science professionals build the right skills. Firstly, Analytics is not about tools. Professionals who talk about their know-how of tools actually don’t know anything about Analytics.
Secondly, analytics is all about addressing business issues, not data-related issues. Thirdly, people who have a knack for numbers and the ability to relate any business issue to numbers (business acumen) make good data science professionals. Yet a majority of people call themselves Data Scientists, those with 5 years of experience and those with 15 alike. So who is really the Scientist? Rather, what is the definition of a scientist in the Data realm? What should you expect from a Data Scientist?
As a Business Development Manager, I face challenges while prospecting for BIG Data or Data Analytics projects. As a salesperson, I expect my analytics delivery team to answer the questions that arise from different aspects of a Data Science project, be it understanding the business problem, the project execution strategy, envisaging the larger picture in the business context, and much more. Based on my experience, there are some skills that I reckon every Data Scientist must exhibit.
‘A for Analytics’
Data Analytics or BIG Data is not only about Statistics, Tools, Techniques and Visualization. It is more than that. As a Data Scientist, the organization (the team that you are working with) expects you to set the right context for the Analytics problem at hand. You need to provide an inference about the current situation as well as the probable outlook. Apart from studying the data and identifying the right tools and statistical techniques, you need to list the key questions that will, or need to, be answered with the available data.
‘B for Business’
Of all the Data Scientists that I have come across, the majority lack a basic understanding of business, as in “What is the business outcome of the project?” or “How will it benefit the customer?” Even if the project they are working on is a small slice of a larger project, it requires a good understanding of the customer’s business to go and tell them a story of “How can you make it BIG”, either by creating a Data Analytics roadmap for the customer or by showing them analytics benefits beyond the scope of the project. Well, this doesn’t come overnight. It comes with experience, especially from having worked with customers (in client-facing roles), at customer locations, or as part of their Data Analytics team (because that is where the roadmaps are defined).
‘C for Coding’
One may argue that, being a Data Scientist, I am not supposed to sit and write lines of code. Well, I am not saying that Data Scientists are expected to code all the time. What I mean is that they should be hands-on with scripting in at least one tool. Whenever the analytics team is stuck on a technical issue, the Data Scientist is expected to jump in and assist the team. Not all employers will ask you in the interview whether or not you know coding, but it is a tacit expectation.
‘D for Domain’
Having knowledge of all domains is next to impossible and, don’t worry, as a Data Scientist you are not expected to know them all. However, what you are expected to have is a working knowledge, or a very high-level overview, of how a particular industry / process / operation works. For example, a data scientist in a CPG company is not expected to know the chemical composition of the ingredients or the detailed process knowledge, but is expected to know the overview and high-level working, and to be able to identify the Key Performance Indicators (KPIs) for that particular process / industry. The same applies to an Automotive manufacturing plant or process, or to Transportation, Banking, Insurance, Retail, etc. Domain information can be gathered by talking with the technical teams responsible for a particular activity (processes, plants, manufacturing units, etc.) and also by doing a lot of reading.
‘E for Explain or Express’
The great scientist Albert Einstein once said, ‘Imagination is more important than knowledge’. Consider this situation: as a Data Scientist, you are unable to put your views across to your customer about what data you would need in order to address a particular problem. If you are not able to articulate your case effectively to the client, you are heading into a deeper mess! Data Scientists must be able to communicate effectively in order to explain to the stakeholders what their Analytics means, what they need, and how they can make a difference.
It is often said that Data speaks, or that a good analytics outcome speaks for itself, but that’s for another day, when you have successfully achieved the outcome. In order to arrive at a successful outcome and create a business impact with your analytics, you as a Data Science professional will need to address the areas discussed above. Doing so will not only make you more acceptable and employable, but will truly help you brand yourself as a “Data Scientist”.