Data Science RoadMap Min

Uploaded by

harikadaing123

Data Science RoadMap

Masoud Mazloom
20-07-2022
Data is the currency of the digital economy
Get the analytical skills you need to cash in

Although data is the lifeblood of the digital economy, many companies are
blind to the value of the data they create. It’s time for that to change.
• Data science is an interdisciplinary field that focuses on
analyzing massive amounts of data to
automatically identify inherent patterns,
extract underlying models, and make
relevant predictions.

• It impacts virtually all areas of the economy,
including science, engineering, medicine,
banking, finance, sports and the arts.

• Exciting real-world applications include credit
card fraud detection, speech recognition,
predictive medical diagnosis, and self-driving
cars.
We will tell you how it really works under the hood!
What is the data science learning roadmap?
• Charts out a multi-level skills map with details on:
• What skills you want to hone
• How you will measure the outcome at each level
• Techniques to further master each skill
Complexity and common usage
• Each level is weighted by the complexity of its skills and how commonly
they are applied in the real world
Programming or software engineering
• Every data science job description would ask for programming expertise in at least one of the languages (Python/R)
• Common data structures (data types, lists, dictionaries, sets, tuples), writing functions, logic, control flow, searching and sorting algorithms, object-oriented programming, and working with external libraries
• SQL scripting: Querying databases using joins, aggregations, and subqueries
• Comfortable with using the Terminal, version control in Git, and using GitHub
• Resources for Python:
• learnpython.org
• Kaggle
• freeCodeCamp on YouTube
• SQL:
• Intro to SQL and Advanced SQL on Kaggle
• Datacamp also offers many courses on SQL
• Git:
• Guide for Git and GitHub
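To make the SQL skills listed in this section (joins, aggregations, and subqueries) more concrete, here is a minimal sketch using Python's built-in sqlite3 module. The customers and orders tables and their contents are hypothetical, invented only for illustration.

```python
import sqlite3

# Hypothetical tables, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO orders VALUES (1, 1, 30.0), (2, 1, 70.0), (3, 2, 20.0);
""")

# Join + aggregation: total spend per customer.
# LEFT JOIN keeps customers with no orders; COALESCE turns their NULL sum into 0.
totals = conn.execute("""
    SELECT c.name, COALESCE(SUM(o.amount), 0) AS total
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY total DESC
""").fetchall()

# Subquery: customers whose total spend exceeds the average order amount.
above_avg = conn.execute("""
    SELECT c.name
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    HAVING SUM(o.amount) > (SELECT AVG(amount) FROM orders)
""").fetchall()

print(totals)
print(above_avg)
```

An in-memory database keeps the example self-contained; the same queries should carry over to other SQL engines with little or no change.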
Data collection, extraction and wrangling
• A significant part of the data science work is centered around finding apt data
that can help you solve your problem
• You can collect data from different legitimate sources:
• Scraping (if the website allows)
• APIs
• Databases
• Publicly available repositories
• In the “real world”, data is rarely clean or formatted for use. Pandas and NumPy
are two libraries at your disposal for going from dirty data to ready-to-analyze data.
• Resources:
• Data Manipulation using pandas
• Data Cleaning course by Kaggle
• freeCodeCamp course on learning NumPy and pandas
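As a rough illustration of going from dirty data to ready-to-analyze data with pandas, the sketch below cleans a small, made-up table. The column names, the values, and the choice to fill missing prices with the median are all assumptions for the example, not a recommended recipe for every dataset.

```python
import pandas as pd

# Hypothetical raw records with typical problems: stray whitespace,
# inconsistent casing, a non-numeric value, a missing value, a duplicate row.
raw = pd.DataFrame({
    "city": [" Amsterdam", "amsterdam ", "Utrecht", "Utrecht", None],
    "price": ["10.5", "n/a", "7", "7", "3"],
})

clean = raw.copy()
clean["city"] = clean["city"].str.strip().str.title()            # normalize text
clean["price"] = pd.to_numeric(clean["price"], errors="coerce")  # "n/a" becomes NaN
clean = clean.dropna(subset=["city"])                            # drop rows without a city
clean = clean.drop_duplicates()                                  # drop exact duplicate rows
clean["price"] = clean["price"].fillna(clean["price"].median())  # impute missing prices
clean = clean.reset_index(drop=True)

print(clean)
```

Whether to drop, impute, or keep missing values depends on the problem; the median fill here is only one option.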
Exploratory Data Analysis
• Drawing insights from data and communicating them to management in simple terms
• Exploratory data analysis:
• Defining questions, handling missing values, outliers, formatting, filtering, univariate and multivariate analysis
• Data visualization:
• Plotting data using libraries like matplotlib
• Knowledge to choose the right chart to communicate the findings from the data
• Developing dashboards:
• Use Excel or a specialized tool like Power BI or Tableau to build dashboards that summarize/aggregate data to help
management make decisions
• Business acumen:
• Work on asking the right questions to answer, ones that actually target the business metrics
• Practice writing clear and concise reports, blogs, and presentations
• Resources:
• Career track on Data Analysis
• Data Analysis with Python
• Data Visualization in Spreadsheets, Excel, Tableau, Power BI
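Two of the exploratory steps named in this section, handling missing values and spotting outliers, can be sketched with Python's standard library alone. The numbers are made up, and the 1.5 × IQR cutoff (Tukey's rule) is one common convention rather than the only reasonable choice.

```python
import statistics

# Hypothetical daily order counts; None marks a missing value.
values = [12, 15, None, 14, 13, 90, 16, None, 11, 14]

# Missing values: measure how much is missing before deciding what to do.
observed = [v for v in values if v is not None]
missing_share = 1 - len(observed) / len(values)

# Univariate summary: quartiles and the interquartile range (IQR).
q1, q2, q3 = statistics.quantiles(observed, n=4, method="inclusive")
iqr = q3 - q1

# Tukey's rule: flag points more than 1.5 * IQR outside the quartiles.
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [v for v in observed if v < low or v > high]

print(f"missing: {missing_share:.0%}, median: {q2}, outliers: {outliers}")
```

A flagged point is a prompt to investigate, not an automatic deletion; it may be a data error or a genuinely unusual day.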
Statistics and Mathematics
• Statistical methods are a central part of data science
• Focus more on descriptive and inferential statistics
• Descriptive statistics: being able to summarise the data is powerful, though a summary alone is not always enough. Learn about
estimates of location (mean, median, mode, weighted statistics, trimmed statistics) and of variability to
describe the data.
• Inferential statistics: designing hypothesis tests and A/B tests, defining business metrics, and analyzing the
collected data and experiment results using confidence intervals, p-values, and alpha (significance) levels.
• Linear algebra and single- and multi-variable calculus, to understand loss functions, gradients, and optimizers
in machine learning.
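The estimates of location and variability mentioned above can be computed with Python's statistics module. This is a minimal sketch: the salary figures are invented, and trimmed_mean and weighted_mean are small hypothetical helpers written for this example, not library functions.

```python
import statistics

def trimmed_mean(data, trim=0.1):
    """Mean after dropping the lowest and highest `trim` fraction of values."""
    ordered = sorted(data)
    k = int(len(ordered) * trim)
    return statistics.fmean(ordered[k:len(ordered) - k])

def weighted_mean(data, weights):
    """Mean in which each value counts in proportion to its weight."""
    return sum(x * w for x, w in zip(data, weights)) / sum(weights)

# Hypothetical salaries (in thousands) with one extreme value.
salaries = [40, 42, 45, 45, 47, 50, 52, 55, 58, 400]

mean = statistics.fmean(salaries)       # pulled upward by the extreme value
median = statistics.median(salaries)    # robust to the extreme value
mode = statistics.mode(salaries)        # most frequent value
trimmed = trimmed_mean(salaries, 0.1)   # drops one value from each end
spread = statistics.stdev(salaries)     # sample standard deviation

print(mean, median, mode, trimmed, round(spread, 1))
```

Comparing the mean with the median and the trimmed mean is a quick way to see how strongly a few extreme values influence a summary.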

• Resources:
• [Book] Practical Statistics for Data Scientists (highly recommended)
• Statistical thinking in Python
• Intro to Descriptive Statistics
• Inferential Statistics
• Probability and Statistics for Data Science (Series) on Medium
• 3Blue1Brown lecture series
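The inferential statistics topics in this section (A/B tests, p-values, confidence intervals, and alpha) come together in a two-proportion z-test. The sketch below is one simple way to analyze an A/B test, assuming large samples and independent users; the function name and the conversion counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_ztest(conv_a, n_a, conv_b, n_b, alpha=0.05):
    """Two-sided z-test for a difference in conversion rates (B minus A)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    diff = p_b - p_a

    # Pooled standard error under the null hypothesis of equal rates.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se_null = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = diff / se_null
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Unpooled standard error for the confidence interval on the difference.
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (diff - z_crit * se, diff + z_crit * se)

    return diff, p_value, ci, p_value < alpha

# Hypothetical experiment: 200/2000 conversions for A, 260/2000 for B.
diff, p_value, ci, significant = two_proportion_ztest(200, 2000, 260, 2000)
print(f"lift={diff:.3f}, p={p_value:.4f}, "
      f"95% CI=({ci[0]:.3f}, {ci[1]:.3f}), significant={significant}")
```

Alpha should be fixed before the experiment runs; choosing it after seeing the p-value defeats the purpose of the test.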
Machine learning
• There are three major types of learning:
1. Supervised learning: includes regression and classification problems. Study simple linear
regression, multiple regression, polynomial regression, naive Bayes, logistic regression, KNNs,
tree models, and ensemble models. Learn about evaluation metrics.
2. Unsupervised learning: clustering and dimensionality reduction are the two most widely used
applications of unsupervised learning. Dive deep into PCA, K-means clustering, hierarchical
clustering, and Gaussian mixtures.
3. Reinforcement learning (can skip*): helps you build self-rewarding systems. Learn to
optimize rewards, use the TF-Agents library, create Deep Q-networks, etc.
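To connect supervised learning with the loss functions, gradients, and optimizers mentioned under Statistics and Mathematics, here is a minimal sketch of simple linear regression fitted by batch gradient descent on mean squared error. The data, learning rate, and epoch count are arbitrary choices for illustration; in practice a library such as scikit-learn would normally do this for you.

```python
def fit_line(xs, ys, lr=0.05, epochs=2000):
    """Fit y = w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction errors under the current parameters.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of the loss (1/n) * sum(error^2) with respect to w and b.
        grad_w = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2 / n * sum(errors)
        # Optimizer step: move against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy data generated from y = 2x + 1, so the fit should recover w ≈ 2, b ≈ 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(round(w, 4), round(b, 4))
```

If the learning rate is too large the updates diverge, and if it is too small convergence is slow, which is why optimizers and their settings get so much attention.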

• Resources:
• [Book] Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition
• [Book] Pattern Recognition and Machine Learning
• Machine Learning Course by Andrew Ng
• Introduction to Machine Learning
• Supervised learning with Python
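For the unsupervised side, K-means clustering is simple enough to write from scratch. The sketch below is plain Lloyd's algorithm under simplifying assumptions: the points are made up, and the initial centroids are just the first k points, whereas real implementations typically use a smarter initialization.

```python
from math import dist

def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm; initial centroids are simply the first k points."""
    centroids = [tuple(p) for p in points[:k]]
    clusters = []
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:  # converged: assignments are stable
            break
        centroids = new_centroids
    return centroids, clusters

# Two visually obvious groups of hypothetical 2-D points.
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = kmeans(points, k=2)
print(centroids)
```

K-means needs the number of clusters up front and is sensitive to initialization and feature scale, which is part of why the roadmap also lists hierarchical clustering and Gaussian mixtures.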
Soft skills (people behavior skills)
• Commonly used to refer to the “emotional side” of human beings, as opposed to IQ
• Character traits and interpersonal skills that characterize a person's relationships with others
• Help employees interact with others and succeed in the workplace
• Describe a person's emotional quotient (EQ) as opposed to intelligence quotient (IQ)
• Soft skills include:
• Communication skills
• Mentor your coworkers
• Leadership skills
• Follow instructions, and get a job done on time
• Team building and Teamworking skills
• Problem-solving skills
• Analytical skills
• Collaboration

97% of employers say that soft skills are either as important or more important than hard skills
80% of companies' success is due to soft skills
Resources
• https://towardsdatascience.com/data-science-learning-roadmap-for-2021-84f2ba09a44f
• https://skaf.medium.com/data-scientist-roadmap-2022-3e247fe6fe87
• https://www.mltut.com/data-science-with-python-roadmap/
• https://towardsdatascience.com/become-a-data-scientist-in-2022-a-practical-52-week-course-8244cc18284e
• https://omdena.com/blog/data-science-road-map/
