Data Science Methodology

The document outlines a structured methodology for data science, emphasizing the importance of clearly defining questions and objectives before data analysis. It presents a real-world case study from ISRO, illustrating the stages of data collection, preparation, modeling, and evaluation. The methodology follows the CRISP-DM framework, highlighting the need for a systematic approach to achieve actionable insights and improve results.

Uploaded by

thenaivesamosa
Data Science Methodology
From Questions to Insights: A Complete Data Science Journey
Introduction
• Data Science ≠ Just Coding
• It follows a structured methodology to turn data into actionable insights.
• A well-defined process ensures success, repeatability, and impact.
• How to Think Like a Data Scientist?
  • It’s like cooking: you need the right ingredients (data), preparation (cleaning), and the right recipe (modeling).
• Real-World Case Study: During my internship at ISRO, I analyzed rocket telemetry data for PSLV launches.
  • It followed the same methodology: collecting, cleaning, modeling, and evaluating data for mission safety.
• Agenda: We'll explore the 10 key stages of the Data Science Methodology.
Asking Questions
• The foundation of any data science project.
• Clearly define:
  • What problem are we solving?
  • Who benefits from the solution?
  • What decisions need to be made?
• Case Study: ISRO’s PSLV Missions
  • Before analyzing PSLV telemetry, we asked:
    • Can we predict signal strength dropouts in real time?
    • How does the rocket’s exhaust affect radar tracking?
  • These questions shaped our entire analysis.
• Cooking Example:
  • Before making a dish, you ask: What am I cooking? Who is eating? What’s the best recipe?
Business Approach
• You don’t start cooking without understanding who’s eating and what they need (spicy, mild, veg, non-veg).
• Spend as much time as needed to understand the problem and gain clarity.
• This stage requires domain knowledge.
• Understand the goal and define the objectives that lead to it.
• Objective: Ensure uninterrupted telemetry and flight safety
• Success Metric: Accurate signal strength predictions
Analytical Approach
• Different types of analytics:
  1. Descriptive – "What happened?" (e.g., PSLV signal dropouts)
  2. Diagnostic – "Why did it happen?" (e.g., exhaust interference)
  3. Predictive – "What will happen next?" (e.g., signal strength forecasting)
  4. Prescriptive – "What should we do?" (e.g., optimize radar positioning)
• Case Study: ISRO Rocket Telemetry
  • We used predictive analytics to estimate when and where signal strength would drop.
Data Requirements and Collection
• What data do we need?
• Sources:
  • Databases
  • APIs
  • Web scraping
  • Surveys
• Challenges: Missing, unstructured, or biased data.
• Case Study: PSLV Data Sources
  • Data collected from radar stations, telemetry, and on-board sensors.
  • Challenges: Gaps in data due to obstructions and signal noise.
• Cooking Example:
  • Ingredients matter! If you don’t have tomatoes for a curry, you have to find an alternative.
Understanding the Data
• Exploratory Data Analysis (EDA)
  • Summary statistics
  • Data distributions
  • Identifying outliers and missing values
• Case Study: PSLV Signal Data
  • Checked for missing timestamps and abnormal signal spikes.
  • Used visualization tools to detect patterns.
• Cooking Example:
  • Before cooking, you inspect the ingredients, checking for spoilage, ripeness, or unwanted elements.
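The EDA checks listed above (summary statistics, missing values, outliers, missing timestamps) can be sketched in a few lines of Python. This is only an illustration using the standard library: the readings are invented for the example and are not ISRO data, the function names are hypothetical, and the interquartile-range rule is just one common way to flag abnormal spikes.

```python
import statistics

# Hypothetical (time_s, signal_dbm) samples; None marks a missing reading.
# Note the spike at t=4 and the skipped timestamp between t=5 and t=7.
readings = [
    (0, -70.0), (1, -71.0), (2, None), (3, -72.0),
    (4, -30.0), (5, -73.0), (7, -74.0), (8, -75.0),
]

def summarize(values):
    """Return basic summary statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }

def find_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

def find_timestamp_gaps(times, step=1):
    """Return (previous, next) pairs where the expected sampling step is exceeded."""
    return [(a, b) for a, b in zip(times, times[1:]) if b - a > step]

present = [s for _, s in readings if s is not None]
missing_count = sum(1 for _, s in readings if s is None)
summary = summarize(present)
outliers = find_outliers(present)
gaps = find_timestamp_gaps([t for t, _ in readings])

print(summary)
print("missing readings:", missing_count)
print("outliers:", outliers)
print("timestamp gaps:", gaps)
```

On this toy data the spike at t=4 is flagged as an outlier and the jump from t=5 to t=7 is reported as a gap. In practice the threshold `k` and the expected sampling step would depend on the sensor.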
Data Preparation
• Key Steps:
  • Handle missing values (imputation, removal)
  • Feature engineering (creating new meaningful variables)
  • Scaling and encoding categorical variables
• Case Study: PSLV Data Cleaning
  • Removed faulty data points (sudden signal spikes due to noise).
  • Created new features like rate of signal loss to improve predictions.
• Cooking Example:
  • You can’t cook with unwashed vegetables or uncut meat. Cleaning is essential!
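The key steps above can be sketched as small standard-library functions. Everything here is an assumption made for illustration: the signal values and station labels are invented, linear interpolation is only one possible imputation strategy, and the "rate of signal loss" feature is approximated as a first difference per time step.

```python
def interpolate_missing(values):
    """Fill None entries by linear interpolation between known neighbours.

    Leading or trailing gaps are filled with the nearest known value.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no known values to interpolate from")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            frac = (i - left) / (right - left)
            filled[i] = values[left] + frac * (values[right] - values[left])
    return filled

def rate_of_change(values, dt=1.0):
    """First difference per unit time; negative means the signal is weakening."""
    return [(b - a) / dt for a, b in zip(values, values[1:])]

def min_max_scale(values):
    """Rescale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def one_hot(labels):
    """Encode categorical labels as 0/1 vectors, one column per category."""
    categories = sorted(set(labels))
    return [[1 if label == c else 0 for c in categories] for label in labels]

# Hypothetical signal strength series (dBm) with missing readings.
signal = [-70.0, None, -74.0, -76.0, None, None, -82.0]
filled = interpolate_missing(signal)
loss_rate = rate_of_change(filled)          # engineered feature
scaled = min_max_scale(filled)              # scaling
stations = one_hot(["radar_a", "radar_b", "radar_a"])  # categorical encoding

print(filled)
print(loss_rate)
print(scaled)
print(stations)
```

Whether to impute or remove a missing value depends on how much data is missing and why. Interpolation suits short gaps in a smooth signal but can hide real dropouts.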
Data Modelling
• Selecting Machine Learning Models:
  • Classification (Logistic Regression, Decision Trees)
  • Regression (Linear Regression)
  • Clustering (K-Means)
• Case Study: Machine Learning at ISRO
  • Used regression models to predict signal strength over time.
  • Compared models using Mean Absolute Error (MAE).
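As a minimal sketch of the case-study idea, the code below fits a straight line to signal strength over time with ordinary least squares and compares it against a constant-mean baseline using MAE. The data, the choice of baseline, and all names are assumptions for illustration; the slides do not say which regression models were actually compared.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical training and held-out data: time (s) vs. signal strength (dBm).
train_t = [0, 1, 2, 3, 4, 5]
train_s = [-70.0, -72.1, -73.9, -76.2, -77.8, -80.0]
test_t = [6, 7]
test_s = [-82.1, -83.9]

# Model A: linear regression on time.
slope, intercept = fit_line(train_t, train_s)
line_pred = [slope * t + intercept for t in test_t]

# Model B: baseline that always predicts the training mean.
baseline_value = sum(train_s) / len(train_s)
baseline_pred = [baseline_value for _ in test_t]

mae_line = mean_absolute_error(test_s, line_pred)
mae_baseline = mean_absolute_error(test_s, baseline_pred)

print(f"linear model MAE: {mae_line:.3f} dBm")
print(f"baseline MAE:     {mae_baseline:.3f} dBm")
```

Comparing against a trivial baseline is a useful sanity check: a model is only worth keeping if its error on held-out data is clearly lower.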
Model Evaluation
• How do we measure success?
  • Accuracy, Precision, Recall (Classification)
  • RMSE, MAE (Regression)
  • Avoiding Overfitting & Bias
• Case Study: ISRO’s Model Evaluation
  • Compared actual vs. predicted signal strength to refine the model.
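The metrics named above can be computed directly from actual and predicted values. The sketch below uses invented labels and readings (1 is assumed to mean "signal dropout") purely to show the definitions, and the function names are hypothetical.

```python
import math

def classification_metrics(actual, predicted, positive=1):
    """Accuracy, precision, and recall for binary labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    correct = sum(1 for a, p in pairs if a == p)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Squared Error; penalizes large errors more than MAE."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )

# Hypothetical classification task: 1 = dropout, 0 = no dropout.
actual_dropout = [0, 0, 1, 1, 0, 1, 0, 0]
predicted_dropout = [0, 1, 1, 0, 0, 1, 0, 0]
cls = classification_metrics(actual_dropout, predicted_dropout)

# Hypothetical regression task: actual vs. predicted signal strength (dBm).
actual_signal = [-70.0, -72.0, -74.0, -76.0]
predicted_signal = [-71.0, -72.0, -73.0, -79.0]
reg_mae = mae(actual_signal, predicted_signal)
reg_rmse = rmse(actual_signal, predicted_signal)

print(cls)
print(f"MAE={reg_mae:.3f}  RMSE={reg_rmse:.3f}")
```

To check for overfitting, compute the same metric on the training data and on held-out data. A much lower training error suggests the model has memorized noise rather than learned the trend.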
Deployment And Feedback
• Deploying models via:
  • APIs
  • Cloud platforms (AWS, GCP, Azure)
  • On-premise solutions
• Case Study: PSLV Model Deployment
  • Integrated our model into a real-time telemetry dashboard.
  • Scientists could see predicted signal strength before launch.
• Cooking Example:
  • Serving the dish is like deploying a model. You need feedback from diners (users) to improve it next time!
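To make "deploying via an API" and "collecting feedback" concrete, here is a rough sketch of the core logic an API endpoint might wrap: parse a JSON request, return a prediction, and log the error once the real value is observed. The coefficients, field names (`time_s`, `predicted_dbm`), and functions are all hypothetical. A real deployment would sit behind a web framework or cloud service and is not described in the slides.

```python
import json

# Hypothetical coefficients standing in for a trained regression model.
MODEL = {"slope": -2.0, "intercept": -70.0}

# Feedback collected after deployment, used to decide when to retrain.
feedback_log = []

def predict_signal(time_s):
    """Predict signal strength (dBm) at a given time since launch."""
    return MODEL["slope"] * time_s + MODEL["intercept"]

def handle_request(body):
    """Take a JSON request body and return (status_code, JSON response body)."""
    try:
        payload = json.loads(body)
        time_s = float(payload["time_s"])
    except (ValueError, KeyError, TypeError):
        return 400, json.dumps({"error": "expected a JSON object with numeric 'time_s'"})
    result = {"time_s": time_s, "predicted_dbm": predict_signal(time_s)}
    return 200, json.dumps(result)

def record_feedback(time_s, observed_dbm):
    """Store the prediction error once the real value is known."""
    error = observed_dbm - predict_signal(time_s)
    feedback_log.append({"time_s": time_s, "error": error})
    return error

status, response = handle_request('{"time_s": 5}')
bad_status, bad_response = handle_request("not json")
err = record_feedback(5, -81.5)

print(status, response)
print(bad_status, bad_response)
print("feedback error:", err)
```

Keeping the prediction logic separate from the transport layer means the same function can serve a dashboard, a batch job, or a test. The feedback log is what closes the loop back to the earlier stages.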
CRISP-DM
• Cross-Industry Standard Process for Data Mining
• A structured 6-phase process:
  • Business Understanding
  • Data Understanding
  • Data Preparation
  • Modeling
  • Evaluation
  • Deployment
• Case Study: ISRO followed CRISP-DM
  • Iterated through data preparation, modeling, and evaluation to refine predictions.
To Conclude
• Data Science Methodology is crucial for structured problem-solving.
• A well-defined approach reduces risk and improves results.
• Takeaways:
  • Think like a chef: plan, clean, prepare, test, and serve!
  • Follow the 10-stage process for better results.
