Data Science Process

Uploaded by

krishnan

The data science process consists of six stages:

1. Discovery or Setting the research goal
2. Retrieving data
3. Data preparation
4. Data exploration
5. Data modeling
6. Presentation and automation
• Step 1: Discovery or defining the research goal
• This step involves understanding the business question the project must
answer and identifying the internal and external data sources that can
help answer it.
• Step 2: Retrieving data
• This is the collection of the data required for the project, acquired
from the identified internal and external sources. It also involves
gaining a business understanding of the data we have and deciphering
what each piece of data means.
• This could entail determining exactly what data is required and the
best methods for obtaining it.
• If we are given a data set by a client, for example, we need to
know what each column and row represents.
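As a minimal sketch of this first look at a client data set, the following Python snippet lists each column with its number of filled values and one example value. The data in `RAW_CSV` and the helper `describe_columns` are hypothetical, made up for illustration; a real project would read from a file or database instead.

```python
import csv
import io

# Hypothetical client data; in practice this would come from a file or database.
RAW_CSV = """customer_id,age,city,monthly_spend
1,34,Chennai,120.5
2,,Mumbai,80.0
3,45,Delhi,
"""

def describe_columns(text):
    """Return (row_count, {column: (non_empty_count, example_value)})."""
    rows = list(csv.DictReader(io.StringIO(text)))
    summary = {}
    for column in rows[0].keys():
        # Keep only the cells that actually contain something.
        values = [row[column] for row in rows if row[column] != ""]
        example = values[0] if values else None
        summary[column] = (len(values), example)
    return len(rows), summary

row_count, column_summary = describe_columns(RAW_CSV)
print("rows:", row_count)
for name, (filled, example) in column_summary.items():
    print(f"{name}: {filled} non-empty values, e.g. {example!r}")
```

A summary like this shows at a glance which columns have gaps and what kind of values each one holds.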
• Step 3: Data preparation
• Data can have many inconsistencies, such as missing values, blank
columns, or an incorrect data format, which need to be cleaned.
• We need to process, explore and condition the data before modeling.
Clean data generally gives better predictions.
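The cleaning tasks named above (missing values, blank columns, incorrect formats) could be sketched as follows. The records and the helpers `to_float` and `clean` are hypothetical. Filling gaps with the column median is only one simple choice assumed here; the source does not prescribe a method.

```python
from statistics import median

# Hypothetical raw records showing the problems named above:
# a missing value, a malformed value, and a column that is always blank.
raw_records = [
    {"age": "34", "spend": "120.5", "notes": ""},
    {"age": "", "spend": "80.0", "notes": ""},
    {"age": "45", "spend": "n/a", "notes": ""},
    {"age": "29", "spend": "60.0", "notes": ""},
]

def to_float(value):
    """Convert a string to float; return None if it is missing or malformed."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None

def clean(records):
    # Drop columns that are blank in every record.
    columns = [c for c in records[0] if any(r[c].strip() for r in records)]
    parsed = [{c: to_float(r[c]) for c in columns} for r in records]
    # Fill remaining gaps with the column median (an assumed, simple strategy).
    for c in columns:
        known = [row[c] for row in parsed if row[c] is not None]
        if not known:
            continue  # nothing usable in this column; leave the gaps as None
        fill = median(known)
        for row in parsed:
            if row[c] is None:
                row[c] = fill
    return parsed

cleaned = clean(raw_records)
for row in cleaned:
    print(row)
```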
• Step 4: Data exploration
• Data exploration aims at a deeper understanding of the data.
• Try to understand how variables interact with each other, the
distribution of the data, and whether there are outliers.
• To achieve this, use descriptive statistics, visual techniques and simple
modeling.
• This step is also called Exploratory Data Analysis (EDA).
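A small illustration of descriptive statistics and an outlier check, using Python's standard `statistics` module. The `spend` values are invented. The 1.5 x IQR rule used in `iqr_outliers` is a common convention assumed here, not something the source specifies.

```python
from statistics import mean, median, quantiles, stdev

# Hypothetical monthly spend values; one is suspiciously large.
spend = [60.0, 72.5, 80.0, 85.0, 90.0, 95.5, 120.5, 410.0]

def describe(values):
    """Basic descriptive statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": mean(values),
        "median": median(values),
        "stdev": stdev(values),
        "min": min(values),
        "max": max(values),
    }

def iqr_outliers(values):
    """Flag values outside 1.5 * IQR of the quartiles (a common rule of thumb)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

summary = describe(spend)
outliers = iqr_outliers(spend)
print(summary)
print("outliers:", outliers)
```

A mean far above the median, as in this example, is itself a hint that a few large values are skewing the distribution.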
• Step 5: Data modeling
• In this step, the actual model building process starts. Here, the data
scientist splits the dataset into training and testing sets.
• Techniques like association, classification and clustering are applied to
the training data set. The model, once prepared, is tested against the
testing data set.
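The train/test workflow described above might look roughly like this. The data is synthetic. A 1-nearest-neighbour classifier is used only because it is short to write; the source mentions classification in general and does not name this technique. All function names here are hypothetical.

```python
import random

def make_dataset(seed=0):
    """Synthetic labelled data: two well-separated clusters of 2-D points."""
    rng = random.Random(seed)
    data = []
    for _ in range(50):
        data.append(((rng.gauss(0, 1), rng.gauss(0, 1)), "low"))
        data.append(((rng.gauss(5, 1), rng.gauss(5, 1)), "high"))
    return data

def train_test_split(data, test_fraction=0.2, seed=0):
    """Shuffle a copy of the data and hold out a fraction for testing."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]

def predict(train, point):
    """1-nearest-neighbour: return the label of the closest training point."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = min(train, key=lambda item: sq_dist(item[0], point))
    return nearest[1]

def accuracy(train, test):
    """Fraction of held-out points whose predicted label matches the true one."""
    correct = sum(predict(train, features) == label for features, label in test)
    return correct / len(test)

dataset = make_dataset()
train_set, test_set = train_test_split(dataset)
test_accuracy = accuracy(train_set, test_set)
print(f"train={len(train_set)} test={len(test_set)} accuracy={test_accuracy:.2f}")
```

Scoring the model only on the held-out set gives an estimate of how it would behave on data it has not seen.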
• Step 6: Presentation and automation
• Deliver the final baselined model with reports, code and technical
documents in this stage.
• The model is deployed into a real-time production environment after
thorough testing.
• In this stage, the key findings are communicated to all stakeholders.
• This helps to decide whether the project results are a success or a
failure, based on the inputs from the model.
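One way to make the success or failure decision explicit is to compare the model's measured result against a target agreed with stakeholders, and to export that as a small report. The sketch below assumes a hypothetical model name, accuracy figure and target. None of these values come from the source.

```python
import json

def build_report(model_name, test_accuracy, target_accuracy):
    """Summarise a model result against an agreed target."""
    return {
        "model": model_name,
        "test_accuracy": round(test_accuracy, 3),
        "target_accuracy": target_accuracy,
        "meets_target": test_accuracy >= target_accuracy,
    }

# Hypothetical values for illustration only.
report = build_report("nearest_neighbour_v1", 0.95, 0.90)
report_json = json.dumps(report, indent=2)
print(report_json)
```

Because the report is plain JSON, the same function could be run on a schedule after each retraining, which is one simple form of automation.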
