We often hear what to expect from a data scientist. But we rarely hear from this cadre of “hottest job” holders about their biggest challenges at the workplace. Inspired by a LinkedIn post by Imaad Mohammed Khan, Data Analyst at Indegene, Analytics India Magazine lists some of the biggest issues faced by data scientists today. Data scientists are expected to have an ever broader set of skills, which is not realistic. We believe a time will come when we expect more specialisation and collaboration from data scientists, rather than expecting one person to know everything.
Part of the problem lies in the hype around data science. Ever since data science became a buzzword, we have been inundated with books, MOOCs and articles billed as the most comprehensive guides on how to build a career in this field. Not only that, the deeper one dives into the “datasphere”, the more pressure data analysts face to keep up with the latest big data tools and shiny deep learning techniques.
On the enterprise end, a slew of sales-driven organisations are turning to data analytics to help them improve profit. Business leaders today recognise the need for data analytics and that their companies lack the front-line skills to build advanced models. Organisations are also looking for the right path to deploy a data science strategy over time, not just to calibrate models. And with deep learning becoming the buzzword du jour, business leaders should recognise when a simple linear regression will do a better job than a neural network.
What Organisations And Mid-Sized Companies Are Grappling With
- Businesses are facing increased pressure to do “data science” without a real understanding of its suitability or value, which results in everything being labelled data science.
- There is pressure from the data science community to be very precise about what counts as data science and what doesn’t, which pushes aspiring data scientists towards the newest, most cutting-edge models regardless of applicability.
Challenges Faced By Data Scientists
Ideally, data scientists are expected to have a broad set of skills: In practice, they can’t be proficient in everything. In fact, the field is getting increasingly fuzzy with no concrete definition of the job. Roles can range from doing advanced machine learning to data munging without requiring any model building.
Data Scientists cannot develop domain expertise overnight: As one LinkedIn user pointed out, data scientists today face unrealistic expectations to create impact overnight even without the right resources. For example, a deep understanding of the business context is something that is acquired incrementally over long periods of time and can only be achieved by engaging in a constant feedback loop with stakeholders from all levels.
Data Scientists cannot improve an entire lifecycle of a problem: Someone who just does data science can take a well-defined problem statement, create a model, get a solution/output, and pass that along to the next person. Data scientists can only identify problems in the business and proactively suggest ways in which they can be addressed with data.
Data Scientists cannot produce weekly data insights: One of the biggest challenges data scientists face at work is drumming up daily/weekly “important data insights” while working on a predictive model, one LinkedIn member said. He emphasised that insights should come as recommendations from data scientists, rather than as a weekly requirement from project management. Besides, deriving actionable, operational and profit-increasing recommendations from an analysis is a complex, hectic, time-consuming job and should not be mixed with the daily task list.
Data Scientists cannot work on low data: In real-life AI, the challenge is not to build a model and feed data into it. The challenge comes after feeding data into it: being critical of your own model and discovering data problems from the results. In real life, no data is clean. Without a detective’s instinct, your model will produce garbage no matter how hard you try. Domain experts are also needed to get the correct data.
ML/AI algorithms are not one-trick ponies: You need many years of intuition building and deep understanding to be really good at this. No real problem can be solved by a single off-the-shelf ML/AI algorithm. It always requires constant reworking, deep expertise and understanding. Without the experience behind deep learning, you’ll be a one-trick pony, and that one trick wears out very quickly.
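As a rough illustration of the kind of detective work described above, the sketch below counts missing values and exact duplicate rows in a handful of records. The `audit_rows` helper, the field names and the sample records are all hypothetical and invented for this example. Real data problems are usually subtler and need domain knowledge to spot.

```python
from collections import Counter


def audit_rows(rows, required_fields):
    """Count missing required values and exact duplicate rows (hypothetical helper)."""
    report = {"missing": Counter(), "duplicates": 0}
    seen = set()
    for row in rows:
        for field in required_fields:
            value = row.get(field)
            # Treat None and empty strings as missing.
            if value is None or value == "":
                report["missing"][field] += 1
        # A sorted tuple of items gives a hashable fingerprint of the row.
        fingerprint = tuple(sorted(row.items()))
        if fingerprint in seen:
            report["duplicates"] += 1
        seen.add(fingerprint)
    return report


# Made-up records, for illustration only.
records = [
    {"customer_id": 1, "age": 34, "spend": 120.0},
    {"customer_id": 2, "age": None, "spend": 80.5},
    {"customer_id": 2, "age": None, "spend": 80.5},
    {"customer_id": 3, "age": 41, "spend": ""},
]

report = audit_rows(records, ["customer_id", "age", "spend"])
print(report["duplicates"], dict(report["missing"]))
```

Checks like these only catch the obvious issues. Whether a value is plausible for the business is still a question for a domain expert.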
Not every problem requires a complex deep learning model: A lot of businesses/startups need to get off the deep learning hype train and understand what tools are best for the business. For example, not every business problem requires complex deep learning models. Analysts usually believe in trying simple models like Naive Bayes, Logistic Regression or SVM first since simpler models are easier to evolve.
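To make the "simple models first" idea concrete, here is a minimal sketch that compares a majority-class baseline with a small Gaussian Naive Bayes classifier written in plain Python. The toy data, the function names and the train/test split are assumptions made up for this illustration and are not taken from any of the LinkedIn comments. In practice one would normally reach for an established library rather than hand-rolling the model.

```python
import math
import random
from collections import Counter, defaultdict


def make_toy_data(n=400, seed=0):
    """Hypothetical two-class data: one numeric feature whose mean depends on the class."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.choice([0, 1])
        # Class 1 is centred at 3.0, class 0 at 0.0, both with unit spread.
        x = rng.gauss(3.0 if label else 0.0, 1.0)
        rows.append(([x], label))
    return rows


def majority_baseline(train):
    """Always predict the most common label seen in training."""
    most_common = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda features: most_common


def gaussian_naive_bayes(train):
    """Fit a per-class mean and variance for each feature and return a predict function."""
    by_class = defaultdict(list)
    for features, label in train:
        by_class[label].append(features)
    stats = {}
    for label, rows in by_class.items():
        prior = len(rows) / len(train)
        params = []
        for column in zip(*rows):
            mean = sum(column) / len(column)
            # A small constant keeps the variance strictly positive.
            var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-9
            params.append((mean, var))
        stats[label] = (prior, params)

    def predict(features):
        def log_posterior(label):
            prior, params = stats[label]
            score = math.log(prior)
            for value, (mean, var) in zip(features, params):
                score += -0.5 * math.log(2 * math.pi * var)
                score -= (value - mean) ** 2 / (2 * var)
            return score
        return max(stats, key=log_posterior)

    return predict


def accuracy(model, rows):
    """Fraction of rows where the model's prediction matches the label."""
    return sum(model(features) == label for features, label in rows) / len(rows)


data = make_toy_data()
train, test = data[:300], data[300:]
baseline_acc = accuracy(majority_baseline(train), test)
nb_acc = accuracy(gaussian_naive_bayes(train), test)
print(f"majority baseline: {baseline_acc:.2f}, naive Bayes: {nb_acc:.2f}")
```

If a simple model like this already clears the bar the business needs, a deep learning model may add cost and complexity without much benefit. If it does not, the baseline still gives a reference point for judging anything more elaborate.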
Evolution Of The Data Scientist’s Role
- Create a business case that allows more senior/non-technical people at the company to understand the potential value, risk and cost of implementing this model
- Create a project plan, including resource count, IT requirements
- Meet domain experts to understand the nuances of the problem and what needs to be accounted for when formulating the model
- Increasingly, data scientists are expected to create executive-level briefs/presentations for the C-suite so that they can understand what’s being done. During the presentation, the analytics team is expected to convey the trade-offs in a way that non-technical people can grasp
- Once the model is created and the output is ready, the team has to develop a validation/pilot/rollout plan to ensure it gets executed appropriately in order to deliver full impact for the business problem