
A Data Scientist uses data and AI to solve business problems. They are skilled at working with data, extracting meaningful insights, building ML applications that make predictions and recommendations, and deploying and monitoring those solutions.
Data Scientist is a relatively new profession.
By Data Scientists, I refer not just to the individual job title, but to the whole job family. This includes ML Engineers, ML Scientists, Research Scientists etc., who are primarily concerned with solving business problems using ML.
The job of Data Scientists is very interesting for various reasons. First, you get to work on intellectually stimulating problems that matter.
Data Scientists execute high-value projects. They do this by building models that either predict something (e.g. which product to recommend to a given user) or provide valuable insights that drive business decisions scientifically (e.g. should I introduce this feature in the product design?).
Such decisions bring value for companies.
Data Scientists who work for product companies and have proven their skills will also have:
Of course this also means fast career growth. You get to work on things that matter for the company, often having a strong impact on the revenue generation, which easily justifies the large salaries Data Scientists are paid.
Companies have realized the benefits of using data and AI to drive business decisions, optimize operations and integrate AI into existing products or build new ones.
Storing large amounts of data was a big deal a few years back, but now, with big data frameworks and cloud computing services, processing petabytes of data is not uncommon.
What follows covers, at an overall level, both senior and junior Data Scientists.
Some of the main responsibilities of Data Scientists are as follows:
Initially, Data Scientists start out in a single stream, but past mid-career you may choose one of two routes:
1. Technical / Research Path
2. Managerial Path
Which one is better is usually a matter of personal preference. As far as Data Science is concerned, both are about equally rewarding financially, because of the value each brings to the table.
Variation 1:
Data Analyst ($90k) –> Junior Data Scientist ($110k) –> Data Scientist ($125k) –> Senior Data Scientist ($150k) –> Principal / Staff Data Scientist ($180k) –> Chief Data Scientist ($230k) –> CTO / CIO / CDO / CEO
Variation 2:
Data Scientist 1 –> Data Scientist 2 –> Data Scientist 3 –> Data Scientist 4 –> CTO or CIO or CDO –> CEO
Note:
Companies give various titles as per their organizational policies.
Sometimes junior positions can start directly as Junior Data Scientist, ML Engineer, Quantitative Analyst etc. That is, Data Analyst may be skipped. Also, mid/senior positions can have other titles such as ‘Research Scientist’, ‘Applied ML Scientist’ etc.
Junior Data Scientist –> Data Scientist –> Senior Data Scientist –> Data Science Manager ($175k) –> Senior Data Science Manager –> Director of Data Science ($200k) –> VP, Data Science ($230k)
From a company's standpoint, the skill set generally expected of a Data Scientist is as follows.
Data Scientists in general are also expected to be savvy with:
Typically a Bachelor's / Master's / PhD in a quantitative discipline (Engineering, Statistics, Operations Research, Computer Science, Mathematics, Physics, Economics etc.).
A quantitative discipline is preferred to make sure candidates have the exposure and aptitude. However, there are several cases where science and commerce graduates have taken up Data Science. Education is usually not a barrier for able candidates.
Knowledge of ML algorithms, Probability and Statistics and Deep learning are the main areas for Data Scientists.
However, you should also know that not all Data Scientists in industry need to be proficient in all of these areas. This is highly team- and project-specific.
There are teams that don't use deep learning at all to solve problems, and there are teams that use deep learning for everything. Likewise, not all projects get deployed on AWS or an equivalent cloud infrastructure.
What are non-negotiable skills for Data Scientists?
Being able to collaborate with stakeholders and convey your results and findings effectively will help you grow faster in your career.
This again can vary with company culture, teams, projects and seniority levels. But there are common themes that Data Scientists are involved in.
Have weekly calls with your key client stakeholder to catch up and review the progress. Make sure everything is on track, share insights from the modeling and data analysis, discuss roadblocks and potential risks that you foresee, and discuss what other problems they face and how Data Science can help.
Write SQL / PySpark / Python code to gather data from one or several data sources. Map the data together in a logical fashion, and create data pipelines with tools like Airflow or Dagster if needed.
Process, cleanse and transform the data, and store it in an appropriate place (relational DB, AWS S3, on-premise computer, local machine etc.) for reuse.
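To make the gathering and cleansing steps a little more concrete, here is a minimal sketch using only Python's built-in `sqlite3` module. The `customers` and `orders` tables, their columns and the sample rows are all made up for illustration; in a real project the sources would be your company's databases or files, and the cleaning rules would depend on the data.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two hypothetical source tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT);
        CREATE TABLE orders (
            order_id INTEGER PRIMARY KEY,
            customer_id INTEGER,
            amount REAL
        );
        -- Deliberately messy rows: stray whitespace, mixed case, NULLs.
        INSERT INTO customers VALUES (1, ' north '), (2, 'South'), (3, NULL);
        INSERT INTO orders VALUES
            (10, 1, 120.0), (11, 1, 80.0), (12, 2, NULL),
            (13, 2, 50.0), (14, 3, 30.0);
        """
    )
    return conn


def spend_by_region(conn):
    """Join the two sources, clean the region label, drop incomplete rows."""
    query = """
        SELECT LOWER(TRIM(c.region)) AS region,
               COUNT(*)              AS n_orders,
               SUM(o.amount)         AS total_amount
        FROM orders AS o
        JOIN customers AS c ON c.customer_id = o.customer_id
        WHERE o.amount IS NOT NULL AND c.region IS NOT NULL
        GROUP BY LOWER(TRIM(c.region))
        ORDER BY region
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in spend_by_region(build_demo_db()):
        print(row)
```

The cleaned result could then be written back to a table or a file so that later analysis does not have to repeat the join.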
Deep dive and analyze the data to find patterns and extract insights.
Build ML / Stats / Deep Learning models to make predictions, validate the models, and prepare the results in a presentable format.
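As a rough illustration of the "build and validate" loop, the sketch below fits a one-variable least-squares line on a training split and compares its error on a held-out split against a naive baseline that always predicts the training mean. The function names, the 25% holdout and the choice of mean absolute error are assumptions for this example; real projects would normally use a library such as scikit-learn and a validation scheme suited to the problem.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def mae(y_true, y_pred):
    """Mean absolute error between two equally long sequences."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def holdout_report(xs, ys, test_fraction=0.25, seed=0):
    """Fit on a training split; score model and baseline on the holdout."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)  # reproducible shuffle
    n_test = max(1, int(len(idx) * test_fraction))
    test_idx, train_idx = idx[:n_test], idx[n_test:]

    train_x = [xs[i] for i in train_idx]
    train_y = [ys[i] for i in train_idx]
    test_x = [xs[i] for i in test_idx]
    test_y = [ys[i] for i in test_idx]

    slope, intercept = fit_line(train_x, train_y)
    baseline = mean(train_y)  # naive model: always predict the mean

    return {
        "slope": slope,
        "intercept": intercept,
        "n_test": n_test,
        "model_mae": mae(test_y, [slope * x + intercept for x in test_x]),
        "baseline_mae": mae(test_y, [baseline] * len(test_y)),
    }


if __name__ == "__main__":
    # Synthetic data: y is roughly 3x + 2 with a small deterministic wobble.
    xs = list(range(20))
    ys = [3 * x + 2 + ((i * 7) % 5 - 2) * 0.2 for i, x in enumerate(xs)]
    print(holdout_report(xs, ys))
```

Comparing against a baseline is the useful habit here: a model is only worth presenting if it clearly beats the simplest thing you could have done.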
Attend to ad-hoc requests such as quick-win projects, A/B test design and analysis, building baseline models, PoCs and feasibility studies, checking data availability, model re-training, enhancements, and production bug fixes in data / model pipelines.
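For the A/B test analysis mentioned above, one common approach (among several) is a two-proportion z-test on conversion rates. The sketch below is a minimal version using only the standard library; the function name and the visitor and conversion counts are hypothetical, and it assumes samples large enough for the normal approximation to hold.

```python
from math import erfc, sqrt


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rate (B minus A).

    conv_a, conv_b: number of conversions in each variant
    n_a, n_b:       number of visitors in each variant
    Returns (z statistic, two-sided p-value).
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: 100/1000 conversions in A, 130/1000 in B.
    z, p = two_proportion_z_test(100, 1000, 130, 1000)
    print(f"z = {z:.3f}, p-value = {p:.4f}")
```

In practice the significance threshold and the required sample size should be fixed during test design, before looking at the results.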
Interact with your team members to share project knowledge, gather ideas and brainstorm.
Interact with and run sessions for non-Data Science colleagues on use cases that can be implemented in their functions.
Have monthly / quarterly calls with senior leaders to catch up, discuss overall progress and goals, strategies for growth and new initiatives you / your team can take up.
Data Scientists work in all sorts of organizations from Startups to Consulting firms to Product companies.
They are present in all industries as well: manufacturing, healthcare, e-commerce, retail, automotive, FMCG, finance and banking, and even political parties, the United Nations and government organizations.
This may be true since I’ve seen this complaint too often, but working that way is highly inefficient.
There can be various reasons: you are probably understaffed (for example, there is no dedicated Data Engineer on the team), your company has poor data infrastructure, or the data is highly unorganized, scattered across various systems, or missing altogether, so that you have to scrape or acquire it from other sources.
Or you simply need to introspect and learn how to run a DS project efficiently.
Do you want to become a Data Scientist? ML+ offers a comprehensive ML Mastery learning path. I've designed this for optimal learning, and you can complete it in about 6 months. Start taking the courses in sequence as per the learning path; my team and I are there to support you and clear all your doubts.