Introduction To Data Science
Data Science
Mentor: Pararawendy Indarjo
Hey, I’m Pararawendy Indarjo.
I am a:
● CURRENTLY | Lead DS at AlloFresh
● 20 - Jul 22 | Senior DS at Bukalapak
● 19 - 20 | Data Analyst at Eureka.ai
BSc Mathematics, MSc Mathematics
Linkedin : https://www.linkedin.com/in/pararawendy-indarjo/
Blog : medium.com/@pararawendy19
Outline
● Introduction to data science
● Data science methodology
● Data science tools
● Googling tips
● Advice for aspiring data scientists
What is Data Science?
“Hi, I’m a data scientist!”
Data science is the field of study that combines
domain expertise, programming skills, and
knowledge of mathematics and statistics to
extract meaningful insights from data (Source:
DataRobot)
Two main topics in data science
Machine learning: a family of statistical models with the ability to modify themselves (learn) when exposed to more data (ref: SkyMind)
[Diagram: Train Data and Model (Raw) go into the Training algorithm, which produces Model (Trained); the trained model outputs a Prediction]
Concepts in Machine Learning
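Consider a model that predicts sales revenue from different advertising channels
● Features: the advertising channels (socmed, onmedia, DOOH)
● Parameters: the values the model learns during training
● Target: the quantity to predict (GMV)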
● To find the model’s parameters, we train the model on an empirical data set
○ Essentially many pairs of values (features, target)
■ socmed = 2, onmedia = 3, DOOH = 1, GMV = 10,
■ socmed = 1, onmedia = 2, DOOH = 1, GMV = 7, etc
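A minimal sketch of what "training" means here. It assumes the model is linear with no intercept (GMV ≈ w1·socmed + w2·onmedia + w3·DOOH) and that the parameters are found with plain gradient descent on the squared error; the slide does not state either choice. The first two data rows are the pairs listed above, while the last three rows are hypothetical and only added so that the parameters are uniquely determined.

```python
# Each row: (socmed, onmedia, DOOH) features, paired with a GMV target.
# The first two rows come from the slide; the last three are hypothetical.
features = [
    (2, 3, 1),
    (1, 2, 1),
    (3, 1, 2),
    (0, 4, 1),
    (2, 2, 0),
]
targets = [10, 7, 13, 7, 6]


def predict(params, row):
    """Linear model without intercept: weighted sum of the features."""
    return sum(w * x for w, x in zip(params, row))


def train(features, targets, learning_rate=0.05, steps=5000):
    """Fit the parameters with batch gradient descent on mean squared error."""
    n = len(features)
    params = [0.0, 0.0, 0.0]  # the "raw" model: parameters not yet learned
    for _ in range(steps):
        errors = [predict(params, row) - y for row, y in zip(features, targets)]
        # Gradient of (1 / 2n) * sum(error^2) with respect to each parameter.
        gradient = [
            sum(err * row[j] for err, row in zip(errors, features)) / n
            for j in range(len(params))
        ]
        params = [w - learning_rate * g for w, g in zip(params, gradient)]
    return params


params = train(features, targets)
print([round(w, 3) for w in params])
```

With this particular data the learned parameters come out close to 2, 1 and 3, which reproduce both pairs from the slide (2·2 + 1·3 + 3·1 = 10 and 2·1 + 1·2 + 3·1 = 7).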
Machine Learning Logic
1. Use past data so that our model learns the feature-target relationship
2. Use the learned model to predict the target variable of new data points
[Diagram: Past Data has known features and a known target; New Data has known features, and its target (?) is what the model predicts]
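The two steps can be sketched as a `fit` step followed by a `predict` step. This is only an illustration: the class name `SimpleLinearModel`, the one-feature linear form and the past data values are all assumptions made for the example, not something taken from the slides.

```python
class SimpleLinearModel:
    """One-feature linear model: target = slope * feature + intercept."""

    def fit(self, features, targets):
        # Step 1: learn the feature-target relationship from past data
        # using the closed-form least-squares solution.
        n = len(features)
        mean_x = sum(features) / n
        mean_y = sum(targets) / n
        covariance = sum(
            (x - mean_x) * (y - mean_y) for x, y in zip(features, targets)
        )
        variance = sum((x - mean_x) ** 2 for x in features)
        self.slope = covariance / variance
        self.intercept = mean_y - self.slope * mean_x
        return self

    def predict(self, feature):
        # Step 2: apply the learned relationship to a new data point.
        return self.slope * feature + self.intercept


# Hypothetical past data: feature and target are both known.
past_features = [1, 2, 3, 4]
past_targets = [3, 5, 7, 9]

model = SimpleLinearModel().fit(past_features, past_targets)

# New data: only the feature is known; the target is the "?" to fill in.
new_feature = 5
predicted_target = model.predict(new_feature)
print(predicted_target)  # prints 11.0
```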
Sample of Data Science Implementations
1 Churn Modelling
● Churn: a user leaving the company’s products
● We can build a model that predicts whether or not a user will churn
○ So that we can take preventive actions
2 Demand Forecasting
● We train a time series model that forecasts future demand
● Benefit: maximize potential profit while minimizing production/maintenance cost
● E.g.: UHT milk demand forecasting
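As a rough sketch of demand forecasting, here is simple exponential smoothing, one of the most basic time series methods. The choice of method, the function name `exponential_smoothing_forecast`, the `alpha` value and the weekly demand numbers are all assumptions for illustration; a real forecast would usually also handle trend and seasonality.

```python
def exponential_smoothing_forecast(demand_history, alpha=0.5):
    """Forecast the next period with simple exponential smoothing.

    alpha close to 1 reacts quickly to recent demand; alpha close to 0
    averages over a longer history.
    """
    if not demand_history:
        raise ValueError("demand_history must not be empty")
    level = demand_history[0]
    for observed in demand_history[1:]:
        # Blend the newest observation with the running level.
        level = alpha * observed + (1 - alpha) * level
    return level


# Hypothetical weekly demand for a product, in cartons.
weekly_demand = [100, 120, 110, 130]
next_week = exponential_smoothing_forecast(weekly_demand)
print(next_week)  # prints 120.0
```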
3 Recommendation Systems
● We train a large matrix that crosses users’ tastes with the available products
● E.g.
○ YouTube video recommendation
○ Spotify song recommendation
○ etc.
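To make the user-by-product matrix concrete, here is a small sketch. Note that it swaps in a simpler technique, user-based collaborative filtering with cosine similarity, instead of actually training a matrix (for example by matrix factorization). The users, songs and ratings are hypothetical.

```python
import math

# Hypothetical user-item matrix: ratings[user][item] = rating from 1 to 5.
ratings = {
    "ana": {"song_a": 5, "song_b": 4},
    "budi": {"song_a": 5, "song_b": 4, "song_c": 5},
    "citra": {"song_c": 1, "song_d": 5},
}


def cosine_similarity(a, b):
    """Cosine similarity between two users' rating dictionaries."""
    shared = set(a) & set(b)
    dot = sum(a[item] * b[item] for item in shared)
    norm_a = math.sqrt(sum(r * r for r in a.values()))
    norm_b = math.sqrt(sum(r * r for r in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(user, ratings, top_n=3):
    """Rank items the user has not rated by similarity-weighted ratings."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        similarity = cosine_similarity(ratings[user], other_ratings)
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + similarity * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


print(recommend("ana", ratings))
```

Here "ana" has the same taste as "budi" on the songs they both rated, so the song that "budi" liked and "ana" has not heard yet (song_c) is ranked first.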
Data Science Methodology
● Programming tool
○ Python
○ R
● After the data is ready on our machine, we’re ready to analyze/build models from the data
● See next slide for R vs Python comparison
● If your work is around deep-dive data analysis, insight creation and visualization
○ Then both Python and R are equally capable
Learn to Google!
Tips:
- If you face an error, copy the last line of the error message and search for it on Google. Usually, a website called Stack Overflow has a forum thread on that error.
- Learn to read the official documentation of a package.
- If you want to search for Python code for a particular problem, try adding “python” at the end of your Google search.
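To show what "the last line of the error message" looks like in Python, here is a small sketch that triggers an error on purpose and builds a search query from it. The function `risky_division` is a made-up example; `traceback.format_exc()` is from the standard library.

```python
import traceback


def risky_division(numerator, denominator):
    return numerator / denominator


try:
    risky_division(1, 0)
except ZeroDivisionError:
    # format_exc() returns the full traceback as text; its last line
    # names the exception type and message, which is the part to search for.
    last_line = traceback.format_exc().strip().splitlines()[-1]
    # Following the tip on Python searches, add "python" at the end.
    search_query = f"{last_line} python"
    print(search_query)
```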
Tips:
- To learn modelling case studies and techniques, search for your problem and read articles on these websites:
- Towards Data Science
- Machine Learning Mastery
- Analytics Vidhya
Example:
Working as a DS in the healthcare industry? Learn about healthcare.
Build Your Portfolio