Xii Analytical Approach

The document outlines a five-part analytical approach in data science methodology, including stages from problem definition to feedback. Each part emphasizes the importance of business understanding, data collection, preparation, modeling, evaluation, deployment, and iterative feedback. The process is designed to ensure that data scientists effectively address business problems using appropriate statistical and machine learning techniques.

Uploaded by

priyansh23fe
Copyright
© © All Rights Reserved

ANALYTICAL APPROACH (DATA SCIENCE METHODOLOGY)

There are five parts, each of which contains more steps:


1. From Problem to Approach
2. From Requirements to Collection
3. From Understanding to Preparation
4. From Modeling to Evaluation
5. From Deployment to Feedback

1. From Problem to Approach

Business Understanding: Every project, regardless of its size, starts with business
understanding, which lays the foundation for successful resolution of the business problem. The
business sponsors who need the analytic solution play the critical role in this stage by defining
the problem, project objectives, and solution requirements from a business perspective. And,
believe it or not, even with nine stages still to go, this first stage is the hardest.

After the business problem is clearly stated, the data scientist can define the analytic approach
to solving it. Doing so involves expressing the problem in the context of statistical and machine
learning techniques so that the data scientist can identify techniques suitable for achieving the
desired outcome. The right analytic approach depends on the question being asked, and it is
selected in the context of the business requirements.
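
As a rough illustration of how the question drives the choice of approach, the sketch below maps a few kinds of question to families of techniques. The question categories, the dictionary `APPROACH_BY_QUESTION`, and the function `suggest_approach` are assumptions made for this example; they are not part of the methodology itself, and the list is not exhaustive.

```python
# Hypothetical mapping from the kind of question asked to a family of techniques.
# The categories are illustrative assumptions, not an exhaustive list.
APPROACH_BY_QUESTION = {
    "yes_no": "classification (predictive)",
    "how_much": "regression (predictive)",
    "what_groups": "clustering (descriptive)",
    "what_goes_together": "association rules (descriptive)",
}

FALLBACK = "clarify the question with the business sponsor first"


def suggest_approach(question_type: str) -> str:
    """Return a candidate analytic approach for a question type."""
    return APPROACH_BY_QUESTION.get(question_type, FALLBACK)


# "Is this email spam or not?" is a yes/no question.
print(suggest_approach("yes_no"))
```

An unknown question type falls back to going back to the business sponsor, which mirrors the point that the approach cannot be chosen before the problem is clearly stated.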

2. From Requirements to Collection

• Data Requirements is the stage where we identify the necessary data content, formats, and
sources for initial data collection. This includes the 5W1H approach (who, what, when, where,
why, and how).

• In the Data Collection stage, data scientists identify and gather the available data resources
relevant to the problem domain.

3. From Understanding to Preparation

• In the Data Understanding stage, which follows data collection, data scientists use descriptive
statistics and visualization techniques to understand the data better. They explore the dataset
to understand its content and quality and to gain initial insights. Gaps in the data are
identified, and plans are made to either fill them or make substitutions. Data scientists also
determine whether additional data is necessary, both to fill any gaps and to verify the quality
of the data.

• In the Data Preparation stage, data scientists prepare the data for modeling by cleaning it and
making it error-free.
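
The two stages above can be sketched in a few lines: descriptive statistics reveal the content and gaps of a column, and a simple substitution fills the gaps before modeling. The `ages` column, the helper names `summarize` and `fill_gaps`, and the choice of the median as the substitute are all assumptions made for illustration; a real project would pick a substitution strategy that suits the data.

```python
import statistics

# Hypothetical column of collected values; None marks a gap in the data.
ages = [34, 29, None, 41, 38, None, 25]


def summarize(values):
    """Descriptive statistics for the observed values, plus a count of gaps."""
    observed = [v for v in values if v is not None]
    return {
        "count": len(observed),
        "missing": len(values) - len(observed),
        "mean": statistics.mean(observed),
        "median": statistics.median(observed),
        "min": min(observed),
        "max": max(observed),
    }


def fill_gaps(values):
    """Substitute each gap with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


summary = summarize(ages)   # data understanding: content, quality, gaps
cleaned = fill_gaps(ages)   # data preparation: gaps replaced before modeling
print(summary)
print(cleaned)
```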

4. From Modeling to Evaluation

Once data are prepared for the chosen machine learning algorithm, we are ready for modeling.

• Modeling focuses on developing models that are either descriptive or predictive, based on the
analytic approach chosen earlier, whether statistical or machine learning. Descriptive modeling
is a mathematical process that describes real-world events and the relationships between the
factors responsible for them; for example, a descriptive model might examine things like: if a
person did this, then they're likely to prefer that. Predictive modeling is a process that uses
data mining and probability to forecast outcomes; for example, a predictive model might be used
to determine whether an email is spam or not. For predictive modeling, data scientists use a
training set, which is a set of historical data in which the outcomes are already known. This
step can be repeated several times until the model answers the question adequately.
• In the Model Evaluation stage, data scientists can evaluate the model in two ways: Hold-Out
and Cross-Validation. In the Hold-Out method, the dataset is divided into three subsets:
a training set, as described in the modeling stage; a validation set, which is used to assess
the performance of the model built in the training phase; and a test set, which is used to
evaluate the likely future performance of the model.
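
The two evaluation methods named above can be sketched as follows. The hold-out function cuts the data into the three subsets described; the 60/20/20 proportions are an assumption. The text names Cross-Validation without describing it, so the k-fold variant shown here, in which each record is held out exactly once, is an assumption about what is meant. The function names `hold_out_split` and `k_fold_splits` are hypothetical.

```python
import random


def hold_out_split(records, train=0.6, validation=0.2, seed=0):
    """Shuffle the records and cut them into training, validation, and test subsets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * validation)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )


def k_fold_splits(records, k=5):
    """Yield (training, held_out) pairs; each record is held out exactly once."""
    folds = [records[i::k] for i in range(k)]
    for i, held_out in enumerate(folds):
        training = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield training, held_out


data = list(range(20))  # stand-in for 20 historical records with known outcomes

train_set, validation_set, test_set = hold_out_split(data)
print(len(train_set), len(validation_set), len(test_set))

for training, held_out in k_fold_splits(data, k=5):
    print(len(training), len(held_out))
```

In the hold-out case the test set is touched only once, at the end, so that it gives an honest estimate of future performance; in the k-fold case the model would be trained and scored k times and the scores averaged.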

5. From Deployment to Feedback

• The Deployment stage depends on the purpose of the model; the model may first be rolled out to
a limited group of users or in a test environment.

• The Feedback stage draws mostly on the customer. After deployment, customers can say whether
the model works for their purposes. Data scientists take this feedback and decide whether they
should improve the model, because the process from modeling to feedback is highly iterative.
