Data Science: Lesson 5

DATA ANALYTICS LIFE CYCLE

PART 2
Module Objectives
At the end of this module, students must be able to:
1. describe the processes involved in model planning, such as data exploration and variable and model selection;
2. enumerate the key decisions needed to finalize the model as well as the tools available for model building;
3. discuss the importance of communicating the results obtained to key stakeholders;
4. describe the steps in operationalizing the results.
Recap:
From the previous discussion, we learned about:
1. an overview of the data analytics life cycle;
2. the seven key roles in an analytics project;
3. the discovery phase (Phase 1), in which the data science team learns about the business domain, assesses the
resources available, and formulates initial hypotheses to test in learning about the data;
4. the data preparation phase (Phase 2), which covers preparing the analytic sandbox, performing ETLT, data
conditioning, and related tasks.
Phase 3 – Model Planning

✔ In Phase 3, the data science team identifies candidate models to apply to the data for clustering,
classification, or finding relationships, depending on the goal of the project.

✔ It is during this phase that the team refers to the hypotheses developed in Phase 1, when the team first
became acquainted with the data and began to understand the business problem and domain area.

✔ These hypotheses help the team frame the analytics to execute in Phase 4 and select the right methods to
achieve its objectives.
Some of the activities to consider in this phase include the following:

✔ Assess the structure of the datasets. The structure of the datasets is one factor that dictates the tools and
analytical techniques for the next phase. Depending on whether the team plans to analyze textual data or
transactional data, for example, different tools and approaches are required.

✔ Ensure that the analytical techniques enable the team to meet the business objectives and accept or reject
the working hypotheses.

✔ Determine if the situation warrants a single model or a series of techniques as part of a larger analytic
workflow.

✔ In addition to the considerations just listed, it is useful to research and understand how other analysts
generally approach a specific kind of problem.

✔ Given the kind of data and resources that are available, evaluate whether similar, existing approaches will
work or if the team will need to create something new. Many times teams can get ideas from analogous
problems that other people have solved in different industry verticals or domain areas.

✔ One exercise of this type involves researching churn models in multiple industry verticals and summarizing,
for each domain area, the types of models previously used for that kind of classification problem.
✔ Performing this sort of diligence gives the team ideas of how others have solved similar problems and
presents the team with a list of candidate models to try as part of the model planning phase.

Model Planning – Data Exploration and Variable Selection

✔ In Phase 3, the objective of the data exploration is to understand the relationships among the variables to
inform selection of the variables and methods and to understand the problem domain. As with earlier
phases of the Data Analytics Lifecycle, it is important to spend time and focus attention on this preparatory
work to make the subsequent phases of model selection and execution easier and more efficient.

✔ A common way to conduct this step involves using tools to perform data visualizations. Approaching the
data exploration in this way aids the team in previewing the data and assessing relationships between
variables at a high level.
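
As an illustration only (not part of the original lesson), the following minimal Python sketch shows one way to perform this kind of high-level exploratory pass; the file name "sandbox_extract.csv" and the use of pandas/seaborn are assumptions rather than prescribed tools:

```python
# Minimal exploratory sketch (file name and columns are illustrative assumptions).
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.read_csv("sandbox_extract.csv")        # hypothetical extract from the analytic sandbox
numeric = df.select_dtypes("number")

# Pairwise scatter plots preview relationships between variables at a high level.
sns.pairplot(numeric)
plt.show()

# A correlation heatmap highlights strongly related (and possibly redundant) inputs.
sns.heatmap(numeric.corr(), annot=True, fmt=".2f", cmap="coolwarm")
plt.show()
```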

✔ As the team begins to question assumptions and test initial ideas of the project sponsors and
stakeholders, it needs to consider the inputs and data that will be needed, and then it must examine
whether these inputs are actually correlated with the outcomes that the team plans to predict or analyze.

✔ Some methods and types of models will handle correlated variables better than others. Depending on what
the team is attempting to solve, it may need to consider an alternate method, reduce the number of data
inputs, or transform the inputs to allow the team to use the best method for a given business problem.

✔ The key to this approach is to aim for capturing the most essential predictors and variables rather than
considering every possible variable that people think may influence the outcome.

✔ Approaching the problem in this manner requires iterations and testing to identify the most essential
variables for the intended analyses. The team should plan to test a range of variables to include in the
model and then focus on the most important and influential variables.
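
One hedged way to begin this narrowing exercise, assuming a numeric 0/1 outcome column hypothetically named "churned", is to rank candidate predictors by the strength of their correlation with the outcome; this is a sketch, not the lesson's prescribed procedure:

```python
# Illustrative only: rank candidate numeric predictors by the strength of their
# relationship to a hypothetical 0/1 outcome column named "churned".
import pandas as pd

df = pd.read_csv("sandbox_extract.csv")        # hypothetical extract
numeric = df.select_dtypes("number")

ranking = (
    numeric.corr()["churned"]                  # correlation of each variable with the outcome
    .drop("churned")
    .abs()
    .sort_values(ascending=False)
)
print(ranking.head(10))                        # shortlist of the most influential candidates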

✔ If the team plans to run regression analyses, identify the candidate predictors and outcome variables of the
model. Aim for predictors that demonstrate a strong relationship to the outcome rather than to the other input
variables. This includes remaining vigilant for problems such as serial correlation, multicollinearity, and other
typical data modeling challenges that interfere with the validity of these models; one common check is
sketched below.
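
A common multicollinearity check, offered here as an illustration rather than as the lesson's required method, is the variance inflation factor (VIF) available in statsmodels; the predictor names below are hypothetical:

```python
# Sketch of a multicollinearity check using variance inflation factors (VIF).
# Predictor names are hypothetical; a VIF well above roughly 5-10 suggests a
# variable is largely explained by the other inputs and may be dropped.
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

df = pd.read_csv("sandbox_extract.csv")        # hypothetical extract
X = sm.add_constant(df[["tenure", "monthly_spend", "support_calls"]])

vif = pd.Series(
    [variance_inflation_factor(X.values, i) for i in range(X.shape[1])],
    index=X.columns,
)
print(vif)
```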

Model Planning – Model Selection

✔ In the model selection subphase, the team’s main goal is to choose an analytical technique, or a short list
of candidate techniques, based on the end goal of the project.
✔ For the context of this module, a model is discussed in general terms. In this case, a model simply refers to
an abstraction from reality. One observes events happening in a real-world situation or with live data
and attempts to construct models that emulate this behavior with a set of rules and conditions.

✔ In the case of machine learning and data mining, these rules and conditions are grouped into several
general sets of techniques, such as classification, association rules, and clustering. When reviewing this
list of types of potential models, the team can winnow down the list to several viable models to try to
address a given problem.
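
As a hedged sketch of this winnowing step, a short list of candidate classification techniques could be compared with simple cross-validation before committing to one; the estimators, file name, and column names below are illustrative assumptions, not the lesson's mandated choices:

```python
# Illustrative sketch: score a short list of candidate classification techniques
# with cross-validation before committing to one. Column names are hypothetical.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

df = pd.read_csv("sandbox_extract.csv")              # hypothetical extract
X, y = df[["tenure", "monthly_spend", "support_calls"]], df["churned"]

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(max_depth=5),
    "naive_bayes": GaussianNB(),
}
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5)      # 5-fold accuracy by default
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```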

✔ An additional consideration in this area for dealing with Big Data involves determining if the team will be
using techniques that are best suited for structured data, unstructured data, or a hybrid approach.

✔ Lastly, the team should take care to identify and document the modeling assumptions it is making as it
chooses and constructs preliminary models.

✔ Typically, teams create the initial models using a statistical software package such as R, SAS, or Matlab.
Although these tools are designed for data mining and machine learning algorithms, they may have
limitations when applying the models to very large datasets, as is common with Big Data.

Phase 4 – Model Building

✔ In Phase 4, the data science team needs to develop datasets for training, testing, and production
purposes. These datasets enable the data scientist to develop the analytical model and train it (“training
data”), while holding aside some of the data (“hold-out data” or “test data”) for testing the model.

✔ During this process, it is critical to ensure that the training and test datasets are sufficiently robust for the
model and analytical techniques. A simple way to think of these datasets is to view the training dataset as the
data used for conducting the initial experiments and the test set as the data used for validating an approach
once the initial experiments and models have been run.

✔ In the model building phase, an analytical model is developed, fit on the training data, and evaluated
(scored) against the test data, as sketched below.
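
A minimal sketch of that split/fit/score loop, assuming the hypothetical churn-style columns used earlier and scikit-learn as the modeling package (one option among the tools listed later in this module):

```python
# Sketch of the training/hold-out split and fit/score loop described above.
# Column names and the chosen estimator are illustrative assumptions.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

df = pd.read_csv("sandbox_extract.csv")                          # hypothetical extract
X, y = df[["tenure", "monthly_spend", "support_calls"]], df["churned"]

# Hold out a portion of the data so the model is evaluated on records it never saw.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)  # fit on training data
print(classification_report(y_test, model.predict(X_test)))      # score against test data
```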

✔ The phases of model planning and model building can overlap quite a bit, and in practice one can iterate
back and forth between the two phases for a while before settling on a final model.

✔ Although the modeling techniques and logic required to develop models can be highly complex, the actual
duration of this phase can be short compared to the time spent preparing the data and defining the
approaches.

✔ In general, plan to spend more time preparing and learning the data (Phases 1–2) and crafting a
presentation of the findings (Phase 5). Phases 3 and 4 tend to move more quickly, although they are more
complex from a conceptual standpoint. As part of this phase, the data science team needs to execute the
models defined in Phase 3.

✔ During this phase, the team runs models from analytical software packages, such as R or SAS, on file extracts
and small datasets for testing purposes. In addition, the team assesses the validity of the model and its results
and determines whether the model accounts for most of the data and has robust predictive power.

✔ Also, at this point, the team refines the models to optimize the results, such as by modifying variable inputs or
reducing correlated variables where appropriate. In Phase 3, the team may have had some knowledge of
correlated variables or problematic data attributes, which will be confirmed or refuted once the models are
actually executed.

✔ When immersed in the details of constructing models and transforming data, many small decisions are
often made about the data and the approach for the modeling. These details can be easily forgotten once
the project is completed. Therefore, it is vital to record the results and logic of the model during this phase.
In addition, one must take care to record any operating assumptions that were made in the modeling
process regarding the data or the context.
Creating robust models that are suitable to a specific situation requires thoughtful consideration to ensure the
models being developed ultimately meet the objectives outlined in Phase 1. Questions to consider include these:

✔ Does the model appear valid and accurate on the test data?

✔ Does the model output/behavior make sense to the domain experts? That is, does it appear as if the
model is giving answers that make sense in this context?

✔ Do the parameter values of the fitted model make sense in the context of the domain?

✔ Is the model sufficiently accurate to meet the goal?

✔ Does the model avoid intolerable mistakes?

✔ Are more data or more inputs needed? Do any of the inputs need to be transformed or eliminated?

✔ Will the kind of model chosen support the runtime requirements?

✔ Is a different form of the model required to address the business problem? If so, go back to the model
planning phase and revise the modeling approach.

✔ Once the data science team can determine either that the model is sufficiently robust to solve the problem or
that the approach has failed, it can move to the next phase in the Data Analytics Lifecycle.

✔ There are many tools available to assist in this phase, focused primarily on statistical analysis or data
mining software. Common tools in this space include, but are not limited to, the following:

Commercial Tools:
1. SAS Enterprise Miner
2. SPSS Modeler
3. Matlab
4. Alpine Miner
5. STATISTICA
6. Mathematica

Free or Open-Source Tools:
1. R and PL/R
2. Octave
3. WEKA
4. Python
5. SQL

Phase 5 – Communicate the Results

✔ After executing the model, the team needs to compare the outcomes of the modeling to the criteria
established for success and failure.

✔ In Phase 5, the team considers how best to articulate the findings and outcomes to the various
team members and stakeholders, taking into account caveats, assumptions, and any limitations of the
results.

✔ Because the presentation is often circulated within an organization, it is critical to articulate the results
properly and position the findings in a way that is appropriate for the audience.

✔ As part of Phase 5, the team needs to determine if it succeeded or failed in its objectives. Many times
people do not want to admit to failing, but in this instance failure should not be considered a true failure;
rather, it is a failure of the data to accept or reject a given hypothesis adequately.
✔ This concept can be counterintuitive for those who have been told their whole careers not to fail. However,
the key is to remember that the team must be rigorous enough with the data to determine whether it will
prove or disprove the hypotheses outlined in Phase 1 (discovery).

✔ Sometimes teams have only done a superficial analysis, which is not robust enough to accept or reject a
hypothesis. Other times, teams perform very robust analysis and are searching for ways to show results,
even when results may not be there. It is important to strike a balance between these two extremes when
it comes to analyzing data and being pragmatic in terms of showing real-world results.

✔ When conducting this assessment, determine if the results are statistically significant and valid. If they are,
identify the aspects of the results that stand out and may provide salient findings when it comes time to
communicate them.
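
One possible (not prescribed) way to sanity-check significance for a classification model is a permutation test, which compares the model's cross-validated score against scores obtained on label-shuffled data; the file name, columns, and estimator below are hypothetical:

```python
# Hedged significance sanity check: compare the model's cross-validated score to
# scores on label-shuffled data. A small p-value suggests the result is unlikely
# to be due to chance. Names and the chosen estimator are illustrative.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import permutation_test_score

df = pd.read_csv("sandbox_extract.csv")              # hypothetical extract
X, y = df[["tenure", "monthly_spend", "support_calls"]], df["churned"]

score, perm_scores, p_value = permutation_test_score(
    LogisticRegression(max_iter=1000), X, y, cv=5, n_permutations=200, random_state=0
)
print(f"model score = {score:.3f}, p-value vs. shuffled labels = {p_value:.3f}")
```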

✔ If the results are not valid, think about adjustments that can be made to refine and iterate on the model to
make it valid. During this step, assess the results and identify which data points may have been surprising
and which were in line with the hypotheses that were developed in Phase 1.

✔ Comparing the actual results to the ideas formulated early on produces additional ideas and insights that
would have been missed if the team had not taken time to formulate initial hypotheses early in the process.

✔ By this time, the team should have determined which model or models address the analytical challenge in
the most appropriate way. In addition, the team should have ideas of some of the findings as a result of the
project. The best practice in this phase is to record all the findings and then select the three most
significant ones that can be shared with the stakeholders.

✔ In addition, the team needs to reflect on the implications of these findings and measure the business
value. Depending on what emerged as a result of the model, the team may need to spend time quantifying
the business impact of the results to help prepare for the presentation and demonstrate the value of the
findings.

✔ Now that the team has run the model, completed a thorough discovery phase, and learned a great deal
about the datasets, reflect on the project, consider what obstacles were encountered, and identify what can
be improved in the future.

✔ Make recommendations for future work or improvements to existing processes, and consider what each of
the team members and stakeholders needs in order to fulfill their responsibilities. For instance, sponsors must
champion the project. Stakeholders must understand how the model affects their processes.

✔ For example, if the team has created a model to predict customer churn, the Marketing team must
understand how to use the churn model predictions in planning their interventions.

✔ Production engineers need to operationalize the work that has been done. In addition, this is the phase to
underscore the business benefits of the work and begin making the case to implement the logic into a live
production environment.

Phase 6 – Operationalize

✔ In the final phase, the team communicates the benefits of the project more broadly and sets up a pilot
project to deploy the work in a controlled way before broadening the work to a full enterprise or ecosystem
of users.

✔ Phase 6 represents the first time that most analytics teams approach deploying the new analytical
methods or models in a production environment. Rather than deploying these models immediately on a
wide-scale basis, the team can manage risk more effectively and learn by undertaking a small-scope pilot
deployment before a wide-scale rollout.
✔ This approach enables the team to learn about the performance and related constraints of the model in a
production environment on a small scale and make adjustments before a full deployment.

✔ Be aware that this phase can bring in a new set of team members—usually the engineers responsible for
the production environment who have a new set of issues and concerns beyond those of the core project
team.

✔ This technical group needs to ensure that running the model fits smoothly into the production environment
and that the model can be integrated into related business processes.

✔ Part of the operationalizing phase includes creating a mechanism for performing ongoing monitoring of
model accuracy and, if accuracy degrades, finding ways to retrain the model.

✔ If feasible, design alerts for when the model is operating “out-of-bounds.” This includes situations when the
inputs are beyond the range that the model was trained on, which may cause the outputs of the model to
be inaccurate or invalid. If this begins to happen regularly, the model needs to be retrained on new data.
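
A minimal sketch of such an out-of-bounds check, assuming numeric inputs and pandas; the file names and the alerting policy are hypothetical, and real deployments would use whatever monitoring stack the production environment provides:

```python
# Minimal monitoring sketch (assumptions: numeric inputs, pandas DataFrames).
# Record the range of each input seen during training, then flag scoring-time
# records that fall outside that range, since the model was not trained on them.
import pandas as pd

def input_bounds(train_df: pd.DataFrame) -> pd.DataFrame:
    """Capture per-column min/max from the training data."""
    return train_df.agg(["min", "max"])

def out_of_bounds(new_df: pd.DataFrame, bounds: pd.DataFrame) -> pd.Series:
    """Return a boolean Series marking rows with any value outside the training range."""
    too_low = new_df.lt(bounds.loc["min"], axis=1)
    too_high = new_df.gt(bounds.loc["max"], axis=1)
    return (too_low | too_high).any(axis=1)

# Usage (hypothetical files): alert or consider retraining if flags become frequent.
bounds = input_bounds(pd.read_csv("training_inputs.csv").select_dtypes("number"))
incoming = pd.read_csv("incoming_inputs.csv").select_dtypes("number")
flags = out_of_bounds(incoming, bounds)
print(f"{flags.mean():.1%} of incoming records are outside the training range")
```
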
Although many roles represent many interests within a project, these interests usually overlap, and most of them
can be met with four main deliverables.

✔ Presentation for project sponsors: This contains high-level takeaways for executive level stakeholders,
with a few key messages to aid their decision-making process. Focus on clean, easy visuals for the
presenter to explain and for the viewer to grasp.

✔ Presentation for analysts: This describes business process changes and reporting changes. Fellow
data scientists will want the details and are comfortable with technical graphs such as Receiver Operating
Characteristic (ROC) curves, density plots, and histograms.

✔ Code for technical people.

✔ Technical specifications for implementing the code.

✔ As a general rule, the more executive the audience, the more succinct the presentation needs to be. Most
executive sponsors attend many briefings in the course of a day or a week. Ensure that the presentation
gets to the point quickly and frames the results in terms of value to the sponsor’s organization.

✔ When presenting to other audiences with more quantitative backgrounds, focus more time on the
methodology and findings. In these instances, the team can be more expansive in describing the
outcomes, methodology, and analytical experiment with a peer group.
