Chapter 2 - Chapter 3 - Chia Jing Xian
5. Evaluate the main issues that affect a company's decision to purchase an ETL tool.
The main issues with ETL (data transformation) tools are that they are expensive, they
have a long learning curve, and it is difficult to measure how well they serve their purpose.
6. Review the direct and indirect benefits of implementing a data warehouse.
Direct benefits
A data warehouse allows end users to perform extensive analysis.
It allows a consolidated view of the data for analysis.
It provides better and more timely information.
It can improve system performance.
It simplifies data access.
Indirect benefits
Increased business knowledge
Improved customer service and satisfaction
Easier decision making
Reformed business processes
7. Highlight and explain four best practices for implementing data warehousing in an
organization.
The project must fit with corporate strategy
The project must be managed by both IT and business professionals (a business-
supplier relationship must be developed)
Only cleansed, high-quality data should be loaded into the data warehouse
It is important to manage user expectations
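The practice of loading only cleansed, high-quality data can be illustrated with a small sketch. This is a minimal example under stated assumptions, not a real ETL tool: the record layout (customer_id, name), the cleanse function, and the rejection rules are all hypothetical and chosen only for illustration.

```python
def cleanse(records):
    """Split raw records into (clean, rejected) before a warehouse load.

    Hypothetical rules: a record needs a customer_id and a non-empty
    name, and each customer_id may appear only once.
    """
    clean, rejected, seen_ids = [], [], set()
    for rec in records:
        cid = rec.get("customer_id")
        name = (rec.get("name") or "").strip()
        # Reject rows with a missing key, a missing name, or a duplicate key.
        if cid is None or not name or cid in seen_ids:
            rejected.append(rec)
            continue
        seen_ids.add(cid)
        # Standardize the name format before loading.
        clean.append({"customer_id": cid, "name": name.title()})
    return clean, rejected


# Made-up sample rows, for illustration only.
raw = [
    {"customer_id": 1, "name": "  alice tan "},
    {"customer_id": 1, "name": "Alice Tan"},  # duplicate key
    {"customer_id": 2, "name": ""},           # missing name
    {"customer_id": 3, "name": "bob lee"},
]
clean, rejected = cleanse(raw)
```

Only the rows in clean would be loaded; the rows in rejected would go back to the source system owner for correction.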
Chapter 3: Data Mining Part 1
1. Analyse the reasoning behind the popularity of Data Mining in the current business
environment.
Data mining is popular because quality data on customers is now widely available, and
this data is integrated into a data warehouse. Data processing and storage capabilities
have increased while their cost has fallen. Data mining is also the core of a successful
analytics initiative, and the information it generates can be used in BI and advanced
analytics.
2. Data mining is all about explaining the past and predicting the future for analysis
purposes. Outline and explain the characteristics of Data Mining.
Data mining is the practice of searching for hidden, valid, and potentially useful patterns
and relationships in a large data set, and of discovering previously unknown relationships
between the variables.
3. In data mining, data is related to a collection of facts usually obtained as the results of
experiences, observations, or experiments. Explain the TWO (2) data types involved
in data mining.
a. Categorical data: labels for multiple classes, used to classify a variable into
groups. It can be subdivided into nominal data and ordinal data.
i. Nominal data – simple codes assigned as labels with no order, such as yes and no
ii. Ordinal data – codes assigned as labels that also carry a ranking among them
b. Numerical data: numeric values, subdivided into interval data and ratio data.
i. Interval data – variables that can be measured on an interval scale, such as
temperature.
ii. Ratio data – commonly measured variables with a true zero point, such as length
and time.
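The four data types above differ in which operations are meaningful on them. The following is a minimal sketch with made-up values; the variable names and the rank mapping are assumptions for illustration only.

```python
from statistics import mode

# Nominal: labels with no order, so only counting (e.g. the mode) is meaningful.
loan_approved = ["yes", "no", "yes", "yes"]
nominal_mode = mode(loan_approved)

# Ordinal: labels with a rank order, so sorting and the median are meaningful.
credit_rank = {"low": 0, "medium": 1, "high": 2}  # hypothetical ranking
ratings = ["low", "high", "medium", "medium", "high"]
ordered = sorted(ratings, key=credit_rank.get)
ordinal_median = ordered[len(ordered) // 2]

# Interval: differences are meaningful, but ratios are not,
# because 0 degrees Celsius does not mean "no temperature".
temps_c = [10.0, 20.0, 30.0]
interval_diff = temps_c[1] - temps_c[0]

# Ratio: there is a true zero, so ratios are meaningful
# (a 4 m length is twice a 2 m length).
lengths_m = [2.0, 4.0, 8.0]
ratio = lengths_m[1] / lengths_m[0]
```

The point of the sketch is that the data type decides the analysis: a mean of yes/no codes or a ratio of Celsius temperatures can be computed, but neither result is meaningful.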
4. Evaluate the usage of data mining applications within the following industries /
domain:
a. Customer relationship management – maximize return on marketing
campaigns, improve customer retention, maximize customer value, and
identify and treat the most valued customers.
b. Banking and financial institutions – automate the loan application process,
detect fraudulent transactions, and maximize customer value.
c. Retailing and logistics – optimize inventory levels at different store locations,
improve store sales promotions, improve logistics by predicting seasonal effects,
and minimize losses due to limited shelf life.
5. Evaluate any FIVE (5) common mistakes involved in Data mining.
Selecting the wrong problem for data mining.
Ignoring what the sponsor thinks data mining is and what it can and cannot do.
Leaving insufficient time for data acquisition, selection, and preparation.
Looking only at aggregated results and not at individual records.
Being sloppy about keeping track of the data mining procedure and results.