How to Prepare for a Career as a Data Scientist
Today, we are in the midst of an unprecedented explosion in data volume. As per an IBM report from December 2016 titled “10 Key Marketing Trends For 2017,” 90 percent of the world’s data was generated in the preceding two years at the rate of 2.5 quintillion bytes of data a day. That number has only been increasing since then.
Organisations are fast realising that this mass of data, if used prudently, can bring immense benefits. It can improve decision-making, drive greater efficiency and inform strategies for future growth.
One consequence of this has been the steadily growing demand for data scientists and analysts who can analyse the data and generate actionable insights from it. The McKinsey Global Institute estimates that the US could have as many as 250,000 open data science jobs by 2024. Closer to home, in India, a study by Edvancer and Analytics India Mag found that the number of new analytics jobs advertised per month increased by almost 76 percent from April 2017 to April 2018. Our own research confirms this. When we harvested and processed data from millions of job descriptions across the market, we found that demand for data science skills is increasing at a rapid rate. Looking at the location-wise trend, Bangalore stands on top as the hub of machine learning and AI, accounting for 34% of total jobs created.
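A location-wise breakdown of this kind can be sketched in a few lines of Python. This is only a minimal illustration, not the pipeline used for the research mentioned above: the function name location_share, the record fields and the sample postings are all hypothetical and invented for the example.

```python
from collections import Counter


def location_share(postings):
    """Return each location's percentage share of postings, largest first."""
    # Count how many postings mention each location.
    counts = Counter(p["location"] for p in postings)
    total = sum(counts.values())
    # Convert raw counts to percentages, rounded to one decimal place.
    return {loc: round(100 * n / total, 1) for loc, n in counts.most_common()}


# Hypothetical sample records, made up purely for illustration.
sample_postings = [
    {"title": "ML Engineer", "location": "Bangalore"},
    {"title": "Data Scientist", "location": "Bangalore"},
    {"title": "Data Analyst", "location": "Mumbai"},
    {"title": "NLP Engineer", "location": "Hyderabad"},
    {"title": "Data Scientist", "location": "Bangalore"},
]

shares = location_share(sample_postings)
for location, percent in shares.items():
    print(f"{location}: {percent}%")
```

On real job data, most of the effort would likely go into cleaning the location field (spelling variants, multiple cities per posting) before a count like this means anything.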
By all accounts, data is poised to play a massive role in shaping the future of industry, and possibly the future of humanity itself. Therefore, a career in data science certainly makes for an attractive proposition for any aspiring engineer. The glut of courses in data analytics, AI, machine learning etc. points to the growing popularity of data as a career option.
While the interest in data science is certainly welcome, I have found that there isn’t enough clarity among students and young IT professionals on what it takes to prepare for a career in data science. A common mistake people make is to enrol in online courses that are either too basic or too vague to be useful in practice. The danger is that these courses are likely to get you stuck in dead-end data analytics or AI jobs that are repetitive and provide no scope for learning.
In this article, I’d like to share some tips and advice on what constitutes the right preparation for a fulfilling career in data science.
Before we jump into the professional qualifications/skills required, we need to look at some of the basic traits or qualities that a person needs to possess to thrive in a competitive industry such as data science. Here are some traits that come to mind:
1. Curiosity
In the industry, it’s quite rare to be presented with a fleshed-out problem statement that needs to be solved using data. Rather, you will likely get your best insights by studying patterns and trends in reams of data. The ability to find these patterns or anomalies takes a certain basic curiosity and a knack for looking beyond the obvious.
2. Logical Reasoning
When dealing with multiple parameters and data sources, trying to separate correlation from causation for any parameter can be confusing and fairly challenging. The ability to use logical reasoning to draw inferences and arrive at a conclusion is invaluable for a data scientist.
3. Big Picture Thinking
When you are analysing large amounts of data, it is easy to get overwhelmed by the findings and insights. But the real value of those insights lies in rolling them up to the business level so that they have a real impact on decision-making. It takes imagination and the ability to look at the big picture to draw insights that truly matter.
The potential impact of data science is limited only by imagination. Whether it is bringing down crime rates or reducing traffic woes or saving the planet from climate change, no problem is too big to be solved through data, provided you have the right vision.
4. Patience
Despite all the great things data and AI can do, the ground truth remains that the success rate isn’t always high for data projects. Patience is a key virtue while working with data. You can safely assume that 90 percent of the things that you do will fail to reach their logical conclusion. It’s difficult to accept failure, especially when you put your heart and soul into something, but patience is what will get you through it.
5. Love for Mathematics
Needless to say, a lot of data science boils down to mathematical and statistical analysis. While you don’t need to be an expert in advanced mathematics, a basic love for mathematics and a math-oriented approach are crucial. If you love tinkering with data sets, it will serve you well in the field of data.
If you think you possess these traits and would like to explore a career in data science, we can move to the next step, which is to understand the skills and knowledge that you need.
Courses and Training
Data scientists need well-rounded knowledge of mathematics, statistics, deep learning, machine learning and NLP. Here are a few recommended courses and training institutes that you can explore.
Linear Algebra, Probability, Statistics:
Natural Language Processing:
Deep learning and AI:
Books are always a great way to shore up knowledge on any subject. Here’s a list of some of the best books on data sciences that I highly recommend reading.
Competitions are a great way to brush up your skills and develop a problem-solving approach. Remember that participation and learning should be the focus. It isn’t about winning and losing; it’s the learning that matters.
As they say, we need to stand on the shoulders of giants to be able to look further. Following some of the greatest minds in data science and understanding their thought process is a great way to keep your knowledge up to date. Here’s an indicative list of some data science experts that every professional must follow.
In the age of data analytics and AI, becoming a data scientist undoubtedly seems like the dream career option for any aspiring technology professional. With the right preparation, the sky is the limit.
About the author
Sandeep is currently working as a Senior Data Scientist at EdGE Networks. He works on NLP and information retrieval. Learning from small datasets, training interpretable models and representation learning for text are his current interests.