Assignment from Chapter 1

Data mining is the process of discovering patterns in data to make informed decisions, integrating statistics, AI, and machine learning. The book will utilize RapidMiner and OpenOffice Base and Calc software. The CRISP-DM framework consists of six steps: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment, each crucial for successful data mining.

Uploaded by

williamsbraxtonn
1- What is Data Mining?

Data mining allows individuals to locate and interpret patterns in data, helping them make
better-informed decisions and better serve their customers. Data mining is also the fusion
of applied statistics, logic, artificial intelligence, machine learning, and data management
systems.

2- Which software will be used in this book?

RapidMiner, together with OpenOffice Base and Calc.

3- Describe briefly the steps of the CRISP-DM, the CRoss-Industry Standard Process for
Data Mining.

Step 1 - Business Understanding. This is a crucial step in successful data mining: you must
understand what you are looking for before any mining can be conducted.

Step 2 - Data Understanding. This is a preparatory activity. Researching the data in this
phase is vital before any of it can be mined.

Step 3 - Data Preparation. This step involves a multitude of activities. These may include
joining two or more data sets together, reducing data sets to only those variables that are
interesting in each data mining exercise, scrubbing data clean of anomalies such as
outlier observations or missing data, or re-formatting data for consistency purposes.
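
The activities listed in Step 3 can be illustrated with a minimal Python sketch. This is not how the book does it (the book works in RapidMiner and OpenOffice). The data, the field names, the `prepare` function, and the `max_amount` outlier threshold are all hypothetical and chosen only for illustration.

```python
# Hypothetical toy data sets; field names and values are assumptions.
customers = [
    {"id": 1, "name": "Ann", "state": "ut"},
    {"id": 2, "name": "Bob", "state": "UT "},
    {"id": 3, "name": "Cy", "state": " nv"},
    {"id": 4, "name": "Dee", "state": "NV"},
]
purchases = [
    {"id": 1, "amount": 20.0},
    {"id": 2, "amount": None},      # missing data
    {"id": 3, "amount": 25.0},
    {"id": 4, "amount": 99999.0},   # outlier observation
]


def prepare(customers, purchases, max_amount=1000.0):
    """Join, reduce, scrub, and re-format the two toy data sets."""
    # Join: look up each customer's purchase amount by the shared "id" key.
    amounts = {p["id"]: p["amount"] for p in purchases}
    prepared = []
    for c in customers:
        amount = amounts.get(c["id"])
        # Scrub: skip rows with missing data or an outlier amount.
        # The fixed threshold is an assumed, simplistic outlier rule.
        if amount is None or amount > max_amount:
            continue
        prepared.append({
            "id": c["id"],
            # Re-format: normalize state codes for consistency.
            "state": c["state"].strip().upper(),
            # Reduce: "name" is dropped because it is not a variable of interest.
            "amount": amount,
        })
    return prepared


clean_rows = prepare(customers, purchases)
```

Here `clean_rows` keeps only customers 1 and 3, with their state codes normalized to "UT" and "NV".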

Step 4 - Modeling. A model is a computerized representation of real-world observations.
Models are the application of algorithms to seek out, identify, and display any patterns or
messages in your data. There are two basic kinds of models in data mining: those
that classify and those that predict.
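
To make the difference between the two kinds of models concrete, here is a small Python sketch. The two techniques are stand-ins picked for brevity (a one-nearest-neighbor classifier and a least-squares line), not algorithms named in the text, and all data values are made up.

```python
def classify_1nn(train, x):
    """Classify x with the label of the closest training value."""
    nearest = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest[1]


def fit_line(xs, ys):
    """Return the least-squares (slope, intercept) for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# A model that classifies: the output is a category label.
train = [(1.0, "low"), (2.0, "low"), (8.0, "high"), (9.0, "high")]
label = classify_1nn(train, 7.5)

# A model that predicts: the output is a numeric estimate.
slope, intercept = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
estimate = slope * 5.0 + intercept
```

With this toy data, the classifier assigns 7.5 the label "high", and the fitted line (slope 2, intercept 1) estimates 11 for an input of 5.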

Step 5 - Evaluation. Evaluation can be accomplished using several techniques, both
mathematical and logical in nature. The evaluation phase specifically helps you
determine how valuable your model is, and what you might want to do with it.
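
One simple mathematical evaluation technique is accuracy: the share of cases where the model's output matches what was actually observed. The text does not say which techniques the book uses, so the sketch below is only an assumed example, with invented labels.

```python
def accuracy(actual, predicted):
    """Return the fraction of predictions that match the actual labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("cannot evaluate an empty set of observations")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


# Hypothetical held-out observations and the model's outputs for them.
actual = ["high", "low", "high", "low"]
predicted = ["high", "low", "low", "low"]
score = accuracy(actual, predicted)
```

Three of the four outputs match here, so `score` is 0.75. Whether that is good enough is the logical half of evaluation and depends on the business question from Step 1.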

Step 6 - Deployment. Deployment involves automating your model, meeting
with consumers of your model's outputs, integrating with existing management or
operational information systems, feeding new learning from model use back into the
model to improve its accuracy and performance, and monitoring and measuring the
outcomes of model use.
