Data Mining and Predictive Analytics
1. For each of the following meetings, explain which phase of the CRISP-DM process is
represented:
a. Managers want to know by next week whether deployment will take place. Therefore,
analysts meet to discuss how useful and accurate their model is.
Evaluation Phase - Data mining analysts assess whether the model and approach they
have built accomplish the business goals set out in the first phase, Business Understanding.
b. The data mining project manager meets with the data warehousing manager to discuss
how the data will be collected.
Business Understanding Phase - The primary goal of the Business Understanding Phase
is to assess the business situation and set the project objectives. As a result of the
meeting, the data mining specialist appears to have persuaded the vice president of
marketing to allow data mining on the customer relationship management system.
d. The data mining project manager meets with the production line supervisor to discuss
the implementation of changes and improvements.
Modeling Phase - In the Modeling Phase, the analysts meet to discuss, select, and apply
appropriate modeling techniques.
2. Discuss the need for human direction of data mining. Describe the possible
consequences of relying on completely automated data analysis tools.
3. CRISP-DM is not the only standard process for data mining. Research an alternative
methodology (ex: Sample, Explore, Modify, Model, and Assess or SEMMA, from SAS
Institute). Discuss the similarities and differences of the researched methodology with
CRISP-DM.
CRISP-DM (Cross-Industry Standard Process for Data Mining) is a six-phase process
model for the data science life cycle. It acts as a set of guidelines for planning,
organizing, and carrying out a data science (or machine learning) project. Fayyad,
Piatetsky-Shapiro, and Smyth proposed the foundations of structured data mining
approaches, which were originally tied to Knowledge Discovery in Databases (KDD). KDD
is a conceptual process model of computational ideas and methods for extracting knowledge
from data. Data mining is one specific phase in the KDD approach to knowledge discovery.
As a result, KDD has the advantage of taking into account data storage and access,
algorithm scaling, result interpretation and visualization, and human-computer interaction
through its nine primary phases. With the introduction of KDD, the line between data
mining and data analytics became even more obvious.
Part II
1. Discuss briefly the difference between Data Mining and Predictive Analytics.
Data mining is the process of identifying important patterns and trends in large data
collections, whereas predictive analytics is the process of using data extracted from large
databases to make predictions and projections about future events. What the two share is
the use of data models to generate predictions about upcoming events. Although there are
significant distinctions between the two, the terms are frequently used interchangeably to
describe how data is handled.
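The distinction can be illustrated with a minimal Python sketch. It assumes a tiny,
made-up set of shopping baskets and weekly sales figures; the function names
most_common_pair and predict_next are hypothetical and chosen only for this example.
The first function looks for a pattern in data already collected (the data mining side),
while the second uses past values to estimate a future one (the predictive analytics side).

```python
from collections import Counter
from itertools import combinations

# Hypothetical data, invented for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "eggs", "milk"},
]
weekly_sales = [100, 110, 120, 130]


def most_common_pair(baskets):
    """Data mining side: find the item pair bought together most often."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts.most_common(1)[0]


def predict_next(values, window=3):
    """Predictive side: estimate the next value as a moving average."""
    recent = values[-window:]
    return sum(recent) / len(recent)


pair, count = most_common_pair(baskets)
print("most frequent pair:", pair, "seen", count, "times")
print("estimated sales next week:", predict_next(weekly_sales))
```

A moving average is only one very simple way to predict; it stands in here for whatever
model a real project would choose.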
Fallacy 1. There are data mining tools that we can turn loose on our data repositories, and
find answers to our problems.
Contrary to popular assumption, there are no automatic data mining tools that will
solve your problems while you wait. Data mining is, rather, a process. CRISP-DM is one
way to fit that process into a larger business or research plan.
Fallacy 2. The data mining process is autonomous, requiring little or no human oversight.
This fallacy assumes that an embedded control mechanism can make data mining
judgments on its own. Data mining, however, is not a magic formula. Without skilled
human supervision, the blind use of data mining technology will only supply the wrong
answer to the wrong question applied to the wrong sort of data.
Fallacy 3. The investment in data mining pays for itself quite quickly.
According to the instructional material, return rates vary depending on start-up costs,
analytical manpower costs, and data warehouse preparation expenditures, among other
factors.
Fallacy 4. Data mining software packages are intuitive and easy to use.
Fallacy 5. Data mining will identify the causes of our business or research problems.
Because raw data is merely a collection of records in which the system is expected to
recognize patterns, it usually has to be preprocessed first.
Models have long played a crucial role in science, and they are still used today to test
hypotheses and forecast data. Scientists' forecasts are frequently inaccurate because they
do not have all of the information. Businesses can use predictive models to find, keep, and
grow their most profitable customers, which makes the business more efficient. Many
businesses use predictive models to forecast inventory and manage resources, and airlines
use predictive analytics to set ticket prices. Predictive analytics is also very useful for
demand forecasting, labor planning, customer attrition analysis, and in-depth competitor
research. It anticipates external influences that may affect a workflow, helps time fleet
maintenance, and highlights financial risks, for example through credit models.
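As a rough illustration of the inventory forecasting use mentioned above, the sketch
below fits a straight-line trend to past demand by ordinary least squares and extends it
one month ahead. The monthly figures are invented, the names fit_line and forecast_next
are hypothetical, and a real forecasting model would normally account for seasonality
and uncertainty, which this sketch ignores.

```python
# Hypothetical monthly demand in units, invented for illustration only.
monthly_units = [200, 220, 240, 260, 280]


def fit_line(ys):
    """Ordinary least-squares fit of y = intercept + slope * x, with x = 0, 1, 2, ..."""
    n = len(ys)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast_next(ys, steps_ahead=1):
    """Extend the fitted trend line steps_ahead periods past the last observation."""
    slope, intercept = fit_line(ys)
    return intercept + slope * (len(ys) - 1 + steps_ahead)


print("forecast for next month:", forecast_next(monthly_units))
```

Because the sample data rise by exactly 20 units a month, the fitted line passes through
every point; real demand data would scatter around the line.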
4.1. Product
Predictive analytics boosts lead generation by providing the information needed to
optimize advertising campaigns and target the most profitable customers. Better leads
result in more revenue and a greater return on investment. Marketing analytics provides
insight into customer behavior and preferences.
4.2. Price
The product's perceived value determines the pricing strategy. It is logical to
presume that a price was set based on what competitors were charging. Predictive
analytics is used to determine how pricing decisions affect the business as a whole,
examine the profitability of various price points, and improve a company's pricing strategy
to maximize revenue.
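Examining the profitability of various price points can be sketched in a few lines of
Python. Everything numeric here is an assumption made for illustration: the unit cost,
the linear demand curve in expected_demand, and the list of candidate prices. In
practice the demand estimate would come from a model fitted to sales data.

```python
# Hypothetical inputs; none of these numbers come from the text.
UNIT_COST = 4.0
price_points = [6.0, 8.0, 10.0, 12.0, 14.0, 16.0]


def expected_demand(price):
    """Assumed linear demand curve: each extra unit of price loses 50 sales."""
    return max(0.0, 1000.0 - 50.0 * price)


def profit(price):
    """Profit at a price point: margin per unit times expected units sold."""
    return (price - UNIT_COST) * expected_demand(price)


def best_price(candidates):
    """Return the candidate price with the highest expected profit."""
    return max(candidates, key=profit)


for p in price_points:
    print(f"price {p:5.2f} -> expected profit {profit(p):8.2f}")
print("most profitable price point:", best_price(price_points))
```

Under these assumptions the highest price is not the most profitable one, because the
lost sales outweigh the higher margin.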
4.3. Place
Place encompasses the many locations where products are made, viewed in ads,
distributed, and sold. With the use of data, companies can better estimate where to
position their business.
4.4. Promotion
Predictive analytics boosts lead generation by providing the information needed to
optimize advertising campaigns and target the most profitable customers. Better leads
result in more revenue and a greater return on investment. Marketing analytics provides
insight into customer behavior and preferences.
4.6. People
People are not merely those to whom you sell and advertise. Personnel,
salespeople, customer service representatives, and anybody else involved in the
marketing and sales operations are all included. For instance, predictive analytics may
play a role in minimizing costly staff turnover. However, only a few businesses are capable
of developing human resource forecasting models.
4.7. Processes
This P refers to the delivery of a product or service to the consumer.
Functions, activities, tasks, and processes must all be mapped out. The goal of
implementing predictive analytics is to keep processes running smoothly and efficiently.
5. Relate a company’s competitive data analytics culture to the proper implementation of
a Big Data strategy. Explain the possible outcome given this situation. Will this have a
significant impact on business performance?
Big data refers to data volumes that are so large and complex that typical data
processing technologies become ineffective. To handle such massive amounts of data,
new technology is required. Big Data has recently attracted the interest of academics and
practitioners because of its potential to provide useful insights for better decision-making,
and many firms are turning to Big Data analytics to extract valuable insights from massive
amounts of data. According to AIG's chief science officer, the most difficult aspect of
moving from a knowing culture to a learning culture, that is, from a culture that relies
heavily on heuristics in decision making to one that is much more objective and data
driven and embraces the power of data and technology, is not the cost. Initially, the
obstacles are mostly imagination and inertia. Data analysis not only improves productivity
but also helps uncover new business prospects that could otherwise go unnoticed, such as
underserved customer segments. As a result, the potential for profit and growth expands,
and the focus shifts to intelligence. Big data can also be used to minimize operating costs
and manage resources better: the data acquired can be used to improve and change
corporate processes, lowering costs and raising profits. In addition, Big Data analysis
makes it much easier to cut waste and improve efficiency.