Data scientist jobs have been topping the employment popularity charts for a long time. With many well-known reports calling it the sexiest and one of the best-paid jobs of the 21st century, it has become one of the most sought-after profiles among candidates. However, the most deserving candidate does not always get the spot. Given the newness of the role and the lack of a clear understanding of the job description, some people apply purely because they deal with data. Working with data, however, does not by itself make someone a data scientist!
As we all know by now, a data scientist is not just someone who knows data well; many other skills and a particular mindset belong on the must-have list of a data scientist’s resume. Since there is no regulatory or licensing body overseeing the hiring of data scientists, it is difficult to screen for the right kind of candidate, and there may be many fake ones applying for the job too.
The challenge is how to spot a real data scientist among the resumes flowing in from both genuine and not-so-genuine candidates. In this post, let’s try to figure out how a candidate’s profile adds up, in all aspects, to that of a real data scientist.
Signs of a real data scientist
Strong quantitative skills, such as statistical algorithms and other tools, backed by an advanced degree- It goes without saying that most real data scientists who make their way into data science roles come with an advanced technical degree. Most of them have at least a master’s degree, if not a PhD. Their areas of study vary, but most include math, statistics, computer science, economics and similar fields. They have sound technical knowledge of languages like R and of the algorithms used to collect and analyse millions of data points. Machine learning is a core part of their understanding of the subject, as these algorithms can become more accurate over time as they are trained on more data.
Has specific use cases for the tools and models used- Apart from having a relevant degree, actual data scientists have inside-out knowledge of tools like SAS, Python, or Hadoop. Unlike fake candidates who just list these tools, real ones can back up their use of them with project evidence. It is important that a data scientist can describe their experience of working with unstructured data through concrete, genuine examples. If their involvement in a project sounds vague, they probably aren’t a good data scientist. If a person only has experience in organising and structuring large data sets, and none with analytical or statistical concepts, they may be a data engineer or a data analyst but not a data scientist.
High in business acumen- Those coming from a purely academic or research background may not be the ones who fit a corporate environment perfectly. To be an effective data scientist, the candidate must possess the business acumen needed to understand the findings and use them in a way that impacts business goals. A true data scientist also knows how to deliver actionable insights to business leaders in an easy-to-understand manner. Business acumen is thus a key component of being a data scientist in a corporate setting.
They have a lot of other data scientists in their network- Lastly, although this is not a must-have criterion, it is often handy to check a candidate’s LinkedIn profile. A person’s professional network is usually filled with others from the same field, and the same applies to data scientists. If a candidate is genuine, he/she is likely to be interested in knowing others in the same industry, and hence may have many genuine data scientists in their network.
A few questions to spot a fake data scientist
As we have seen so far, data science is not just expertise in one particular field, but knowledge of how to apply a full stack of scientific tools and techniques: mathematical, computational, analytical, visual and statistical, along with business acumen. It therefore helps to identify the right candidate by asking questions that draw out the requisite skills. Those posing as data scientists may find it difficult to cover all these aspects, as they tend to insist that their own discipline is the one that defines true data science, which may not be the case. Some of the questions that can help in spotting a real data scientist are:
- Technical questions on topics such as regularization and its use, statistical power, the difference between ‘long’ and ‘wide’ format data, resampling methods, Edward Tufte’s concept of “chartjunk” etc.
- Data scientists and companies/startups in this space that they admire the most
- How they would validate a model they had created to predict a quantitative outcome variable using multiple regression
- Their understanding of terms such as pricing optimization, price elasticity, inventory management, etc.
- How to identify whether a statistic published in an article is wrong or presented selectively to support the author’s point of view
- What a recommendation engine is and how it works
- The difference between a false positive and a false negative, and why it is important to understand the difference
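To make the last question concrete, here is a minimal Python sketch of the kind of answer a genuine candidate could walk through. The labels are invented purely for illustration (1 = positive, 0 = negative), and the helper name confusion_counts is hypothetical, not taken from any library.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["tp"] += 1  # correctly flagged
        elif a == 0 and p == 1:
            counts["fp"] += 1  # false positive: flagged, but actually negative
        elif a == 0 and p == 0:
            counts["tn"] += 1  # correctly left alone
        elif a == 1 and p == 0:
            counts["fn"] += 1  # false negative: missed, but actually positive
        else:
            raise ValueError("labels must be 0 or 1")
    return counts


# Hypothetical labels, invented for this example only.
actual = [1, 0, 0, 1, 1, 0, 0, 0]
predicted = [1, 1, 0, 0, 1, 0, 0, 1]

counts = confusion_counts(actual, predicted)

# Precision: of everything flagged positive, how much was really positive?
precision = counts["tp"] / (counts["tp"] + counts["fp"])
# Recall: of everything really positive, how much was flagged?
recall = counts["tp"] / (counts["tp"] + counts["fn"])

print(counts, precision, recall)
```

The two error types usually carry different costs, which is why the distinction matters: in a fraud check, for instance, a false positive inconveniences a legitimate customer, while a false negative lets a fraudulent transaction through.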
On a concluding note
Spotting a real data scientist may not be the easiest of tasks, but given the investment made in these professionals through high salaries and training effort, it is important to get a thorough understanding of the candidate. He/she should be well versed in applying mathematics and statistics, and in validating models using proper experimental design. It is important to dig deeper to spot the genuine data scientist that you are looking for.