Data Analytics Key Notes
Data Mining
Data mining, also called data discovery or knowledge discovery,
means analyzing data from different perspectives and summarizing
it into useful information that we can use to make important
decisions. It is the technique of exploring, analyzing, and
detecting patterns in large amounts of data. A data mining
project typically moves through the phases below.
1. Business Understanding
When a requirement arises, we first determine the business
objective, assess the situation, define the data mining goals,
and then produce a project plan that meets the requirement.
2. Data Exploration
Next, we gather the initial data, describe and explore it, and
verify its quality to ensure it contains what we require. Data
collected from the various sources is described in terms of its
application and its relevance to the project. This phase is
also known as data understanding.
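
The describe, explore, and verify steps above can be sketched in a
few lines of plain Python. This is only an illustration under
assumptions: the records, the column names (customer_id, age, spend)
and the helpers describe and count_duplicates are hypothetical and
do not come from these notes.

```python
from statistics import mean

# Hypothetical records gathered from several sources; None marks a missing value.
records = [
    {"customer_id": 1, "age": 34, "spend": 120.0},
    {"customer_id": 2, "age": None, "spend": 75.5},
    {"customer_id": 3, "age": 51, "spend": None},
    {"customer_id": 3, "age": 51, "spend": None},  # exact duplicate of the row above
]


def describe(rows, column):
    """Summarize one column: row count, missing values, min, max, and mean."""
    values = [row[column] for row in rows]
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
        "mean": mean(present) if present else None,
    }


def count_duplicates(rows):
    """Count rows that exactly repeat an earlier row."""
    seen = set()
    duplicates = 0
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return duplicates


for column in ("age", "spend"):
    print(column, describe(records, column))
print("duplicate rows:", count_duplicates(records))
```

Missing values and duplicate rows are two common quality problems;
a real project would likely check more, such as value ranges and
inconsistent units.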
3. Data Preparation
From the data collected in the previous phase, we select what
the analysis needs, clean it, construct derived attributes that
carry useful information, and integrate data from the different
sources. Finally, we format the result into the structure
finalized for the analysis.
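
As a rough sketch of the select, clean, construct, integrate, and
format steps, the snippet below prepares a tiny data set in plain
Python. The orders and customers inputs, the field names, and the
prepare function are all hypothetical; dropping rows with a missing
amount is one possible cleaning rule, not the only one.

```python
# Hypothetical raw inputs from two sources.
orders = [
    {"customer_id": 1, "amount": "120.50", "note": "gift"},
    {"customer_id": 2, "amount": None, "note": ""},
    {"customer_id": 1, "amount": "30.00", "note": ""},
    {"customer_id": 3, "amount": "15.25", "note": "promo"},
]
customers = {1: "north", 2: "south", 3: "north"}  # customer_id -> region


def prepare(orders, customers):
    # Select: keep only the fields the analysis needs.
    selected = [
        {"customer_id": o["customer_id"], "amount": o["amount"]} for o in orders
    ]
    # Clean: drop rows whose amount is missing (an assumed rule).
    cleaned = [o for o in selected if o["amount"] is not None]
    # Construct: derive the total spend per customer.
    totals = {}
    for o in cleaned:
        cid = o["customer_id"]
        totals[cid] = totals.get(cid, 0.0) + float(o["amount"])
    # Integrate and format: attach the region from the second source and
    # return uniform rows sorted by customer_id.
    return [
        {
            "customer_id": cid,
            "region": customers.get(cid, "unknown"),
            "total_spend": round(total, 2),
        }
        for cid, total in sorted(totals.items())
    ]


prepared = prepare(orders, customers)
for row in prepared:
    print(row)
```

Customer 2 disappears from the result because its only order has no
amount; whether to drop or impute such rows depends on the project.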
4. Data Modeling
After preparing the data, we perform data modeling on it. We
select a modeling technique, generate a test design, build the
model, and assess it. The model is built to analyze
relationships between selected objects in the data. Test cases
are built to assess the model, which is then tested and applied
to the data in this phase.
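
One common form of test design is a holdout split: part of the data
is set aside and used only to assess the model. The sketch below
assumes that approach; the function name train_test_split, the 25%
test fraction, and the fixed seed are illustrative choices, not
something these notes prescribe.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(20))  # stand-in for 20 prepared records
train, test = train_test_split(rows)
print(len(train), "training rows,", len(test), "test rows")
```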
Descriptive Modeling: It uncovers shared similarities or groupings in
historical data to explain the reasons behind success or failure, for
example by categorizing customers by product preferences or sentiment.
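
To make the idea of grouping concrete, here is a minimal sketch that
uses k-means clustering, one of several techniques that could serve
as a descriptive model. The customer features (orders per month,
average basket value), the data points, and the starting centroids
are hypothetical, and the features are left unscaled for brevity.

```python
from math import dist


def k_means(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster
            else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical customers: (orders per month, average basket value).
customers = [(1, 20.0), (2, 25.0), (1, 22.0), (9, 80.0), (10, 95.0), (8, 85.0)]
centroids, clusters = k_means(customers, centroids=[customers[0], customers[3]])

for centroid, members in zip(centroids, clusters):
    print("centroid", centroid, "members", members)
```

With this data the occasional low-spend buyers and the frequent
high-spend buyers end up in separate groups, which is the kind of
segmentation a descriptive model is meant to surface.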
Predictive Modeling: This modeling goes deeper to classify future
events or estimate unknown outcomes, for example using credit
scoring to determine an individual's likelihood of repaying a loan.
Predictive modeling also helps uncover insights into customer churn,
campaign response, or credit defaults.
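
The credit scoring example can be illustrated with a very small
predictive model. The sketch below uses k-nearest neighbours, which
is a stand-in chosen for brevity and not how real credit scoring is
necessarily done. The features (debt-to-income ratio, share of late
payments), the past loans, and the function repay_likelihood are all
hypothetical.

```python
from math import dist

# Hypothetical past loans: ((debt-to-income ratio, share of late payments), repaid)
history = [
    ((0.10, 0.00), 1),
    ((0.20, 0.05), 1),
    ((0.30, 0.10), 1),
    ((0.60, 0.30), 0),
    ((0.70, 0.40), 0),
    ((0.90, 0.50), 0),
]


def repay_likelihood(applicant, history, k=3):
    """Estimate the likelihood of repayment as the share of the k most
    similar past borrowers who repaid their loan."""
    nearest = sorted(history, key=lambda item: dist(applicant, item[0]))[:k]
    return sum(repaid for _, repaid in nearest) / k


for applicant in [(0.15, 0.02), (0.45, 0.20), (0.80, 0.45)]:
    print(applicant, round(repay_likelihood(applicant, history), 2))
```

The same pattern, learning from labeled past cases and scoring a
new one, applies to churn and campaign response as well.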
5. Data Evaluation
Here, we evaluate the results of the test cases from the previous
phase, review the scope of errors, and determine the next steps
to perform.
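
For a classification model, reviewing the scope of errors often
starts with a confusion matrix and a few rates derived from it. The
sketch below assumes binary labels (1 = positive, 0 = negative); the
evaluate function and the label lists are hypothetical examples.

```python
def evaluate(actual, predicted):
    """Compare predicted labels with actual labels (1 = positive, 0 = negative)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives
    total = len(pairs)
    return {
        "confusion": {"tp": tp, "tn": tn, "fp": fp, "fn": fn},
        "accuracy": (tp + tn) / total if total else None,
        "error_rate": (fp + fn) / total if total else None,
        "precision": tp / (tp + fp) if (tp + fp) else None,
        "recall": tp / (tp + fn) if (tp + fn) else None,
    }


# Hypothetical test-case results.
actual = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 1, 0, 0, 0, 1, 1, 0]
report = evaluate(actual, predicted)
for name, value in report.items():
    print(name, value)
```

Which of these numbers matters most depends on the business
objective: a missed defaulter and a wrongly rejected applicant
rarely cost the same.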