DSS TestBank

Uploaded by

Areej Almalki
Copyright
© All Rights Reserved

Exam

Name___________________________________

1. In the Isle of Capri case, the only capability added by the new software was faster processing of
reports.
Answer: True False
Diff: 2 Page Ref: 109

2. The "islands of data" problem in the 1980s describes the phenomenon of unconnected data being stored in
numerous locations within an organization.
Answer: True False
Diff: 2 Page Ref: 112

3. Subject oriented databases for data warehousing are organized by detailed subjects such as disk drives,
computers, and networks.
Answer: True False
Diff: 2 Page Ref: 113

4. Data warehouses are subsets of data marts.
Answer: True False
Diff: 1 Page Ref: 114

5. One way an operational data store differs from a data warehouse is the recency of their data.
Answer: True False
Diff: 2 Page Ref: 114-115

6. Organizations seldom devote a lot of effort to creating metadata because it is not important for the effective
use of data warehouses.
Answer: True False
Diff: 2 Page Ref: 117

7. Without middleware, different BI programs cannot easily connect to the data warehouse.
Answer: True False
Diff: 2 Page Ref: 120

8. Two-tier data warehouse/BI infrastructures offer organizations more flexibility but cost more than three-tier
ones.
Answer: True False
Diff: 2 Page Ref: 121

9. Moving the data into a data warehouse is usually the easiest part of its creation.
Answer: True False
Diff: 2 Page Ref: 123

10. The hub-and-spoke data warehouse model uses a centralized warehouse feeding dependent data marts.
Answer: True False
Diff: 2 Page Ref: 123

11. Because of performance and data quality issues, most experts agree that the federated architecture should
supplement data warehouses, not replace them.
Answer: True False
Diff: 2 Page Ref: 125

12. Bill Inmon advocates the data mart bus architecture whereas Ralph Kimball promotes the hub-and-spoke
architecture, a data mart bus architecture with conformed dimensions.
Answer: True False
Diff: 2 Page Ref: 126

13. The ETL process in data warehousing usually takes up a small portion of the time in a data-centric project.
Answer: True False
Diff: 3 Page Ref: 130

14. In the Starwood Hotels case, up-to-date data and faster reporting helped hotel managers better manage their
occupancy rates.
Answer: True False
Diff: 1 Page Ref: 136

15. Large companies, especially those with revenue upwards of $500 million, consistently reap substantial cost
savings through the use of hosted data warehouses.
Answer: True False
Diff: 2 Page Ref: 138

16. OLTP systems are designed to handle ad hoc analysis and complex queries that deal with many data items.
Answer: True False
Diff: 2 Page Ref: 140

17. The data warehousing maturity model consists of six stages: prenatal, infant, child, teenager, adult, and sage.
Answer: True False
Diff: 2 Page Ref: 143

18. A well-designed data warehouse means that user requirements do not have to change as business needs
change.
Answer: True False
Diff: 2 Page Ref: 147

19. Data warehouse administrators (DWAs) do not need strong business insight since they only handle the
technical aspect of the infrastructure.
Answer: True False
Diff: 2 Page Ref: 152

20. Because the recession has raised interest in low-cost open source software, it is now set to replace traditional
enterprise software.
Answer: True False
Diff: 2 Page Ref: 153

21. The "single version of the truth" embodied in a data warehouse such as Capri Casinos' means all of the
following EXCEPT
A) decision makers get to see the same results to queries.
B) decision makers have the same data available to support their decisions.
C) decision makers have unfettered access to all data in the warehouse.
D) decision makers get to use more dependable data for their decisions.
Answer: C
Diff: 3 Page Ref: 111

22. Operational or transaction databases are product oriented, handling transactions that update the database. In
contrast, data warehouses are
A) subject-oriented and nonvolatile.
B) subject-oriented and volatile.
C) product-oriented and nonvolatile.
D) product-oriented and volatile.
Answer: A
Diff: 3 Page Ref: 113-114

23. Which kind of data warehouse is created separately from the enterprise data warehouse by a department
and not reliant on it for updates?
A) sectional data mart
B) volatile data mart
C) public data mart
D) independent data mart
Answer: D
Diff: 2 Page Ref: 114

24. All of the following statements about metadata are true EXCEPT
A) metadata gives context to reported data.
B) for most organizations, data warehouse metadata are an unnecessary expense.
C) metadata helps to describe the meaning and structure of data.
D) there may be ethical issues involved in the creation of metadata.
Answer: B
Diff: 2 Page Ref: 117

25. A Web client that connects to a Web server, which is in turn connected to a BI application server, is reflective
of a
A) three tier architecture.
B) one tier architecture.
C) two tier architecture.
D) four tier architecture.
Answer: A
Diff: 2 Page Ref: 121-122

26. Which of the following BEST enables a data warehouse to handle complex queries and scale up to handle
many more requests?
A) Microsoft Windows
B) a larger IT staff
C) use of the web by users as a front-end
D) parallel processing
Answer: D
Diff: 3 Page Ref: 122

27. Which data warehouse architecture uses metadata from existing data warehouses to create a hybrid logical
data warehouse comprised of data from the other warehouses?
A) centralized data warehouse architecture
B) hub-and-spoke data warehouse architecture
C) independent data marts architecture
D) federated architecture
Answer: D
Diff: 3 Page Ref: 124

28. Which data warehouse architecture uses a normalized relational warehouse that feeds multiple data marts?
A) federated architecture
B) hub-and-spoke data warehouse architecture
C) independent data marts architecture
D) centralized data warehouse architecture
Answer: B
Diff: 3 Page Ref: 124

29. Which approach to data warehouse integration focuses more on sharing process functionality than data
across systems?
A) enterprise function integration
B) enterprise information integration
C) extraction, transformation, and load
D) enterprise application integration
Answer: D
Diff: 3 Page Ref: 129

30. In which stage of extraction, transformation, and load (ETL) into a data warehouse are data aggregated?
A) extraction
B) load
C) transformation
D) cleanse
Answer: C
Diff: 3 Page Ref: 130

31. In which stage of extraction, transformation, and load (ETL) into a data warehouse are anomalies detected
and corrected?
A) load
B) transformation
C) cleanse
D) extraction
Answer: C
Diff: 3 Page Ref: 130
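
The stages named in questions 30 and 31 can be illustrated with a minimal sketch. The sample rows, field names, and function names below are hypothetical and do not come from the test bank; the sketch only shows where anomaly handling (cleanse) and aggregation (transformation) would sit in a simple pipeline.

```python
from collections import defaultdict

# Hypothetical source rows; field names are assumptions for illustration.
SOURCE_ROWS = [
    {"region": "East", "amount": "100.0"},
    {"region": "east ", "amount": "50.5"},   # inconsistent naming
    {"region": "West", "amount": "n/a"},     # anomaly: non-numeric amount
    {"region": "West", "amount": "25.0"},
]

def extract(rows):
    """Extraction: read raw records from the source unchanged."""
    return list(rows)

def cleanse(rows):
    """Cleanse: detect anomalies, then correct or drop them."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # drop rows whose amount cannot be parsed
        region = row["region"].strip().title()  # normalize region names
        clean.append({"region": region, "amount": amount})
    return clean

def transform(rows):
    """Transformation: aggregate amounts per region."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)

def load(totals, warehouse):
    """Load: write the aggregated values into the target store."""
    warehouse.update(totals)
    return warehouse

warehouse = load(transform(cleanse(extract(SOURCE_ROWS))), {})
print(warehouse)
```

Here the target store is a plain dictionary standing in for a warehouse table; a real pipeline would use ETL software or a database loader.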

32. Data warehouses provide direct and indirect benefits to using organizations. Which of the following is an
indirect benefit of data warehouses?
A) extensive new analyses performed by users
B) simplified access to data
C) better and more timely information
D) improved customer service
Answer: D
Diff: 3 Page Ref: 132

33. All of the following are benefits of hosted data warehouses EXCEPT
A) greater control of data.
B) frees up in-house systems.
C) better quality hardware.
D) smaller upfront investment.
Answer: A
Diff: 2 Page Ref: 138

34. When representing data in a data warehouse, using several dimension tables that are each connected only to
a fact table means you are using which warehouse structure?
A) relational schema
B) dimensional schema
C) star schema
D) snowflake schema
Answer: C
Diff: 3 Page Ref: 138-139
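
The structure described in question 34 can be sketched as follows, assuming hypothetical table and column names (dim_date, dim_product, fact_sales) and made-up rows. Each dimension table is connected only to the fact table, which is what distinguishes a star schema from a snowflake schema.

```python
import sqlite3

# In-memory database; all names and values are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);

-- The fact table holds the measures and one foreign key per dimension.
CREATE TABLE fact_sales (
    date_id INTEGER REFERENCES dim_date(date_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    amount REAL
);

INSERT INTO dim_date VALUES (1, 2023, 1), (2, 2023, 2);
INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO fact_sales VALUES (1, 1, 10.0), (1, 2, 20.0), (2, 1, 5.0);
""")

# A typical query joins the fact table to one dimension and aggregates.
rows = conn.execute("""
SELECT p.name, SUM(f.amount)
FROM fact_sales AS f
JOIN dim_product AS p ON f.product_id = p.product_id
GROUP BY p.name
ORDER BY p.name
""").fetchall()
print(rows)
```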

35. When querying a dimensional database, a user went from summarized data to its underlying details. The
function that served this purpose is
A) slice.
B) roll-up.
C) drill down.
D) dice.
Answer: C
Diff: 3 Page Ref: 140-141
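
Roll-up and drill down (question 35) can be illustrated with a small sketch over hypothetical sales records. The data and the function names roll_up and drill_down are assumptions for illustration, not part of any OLAP product.

```python
from collections import defaultdict

# Hypothetical (year, quarter, amount) records.
SALES = [
    ("2023", "Q1", 100),
    ("2023", "Q1", 40),
    ("2023", "Q2", 60),
    ("2024", "Q1", 80),
]

def roll_up(rows):
    """Roll-up: summarize detail records to the year level."""
    totals = defaultdict(int)
    for year, _quarter, amount in rows:
        totals[year] += amount
    return dict(totals)

def drill_down(rows, year):
    """Drill down: move from one year's summary to its quarterly detail."""
    totals = defaultdict(int)
    for row_year, quarter, amount in rows:
        if row_year == year:
            totals[quarter] += amount
    return dict(totals)

by_year = roll_up(SALES)
detail_2023 = drill_down(SALES, "2023")
print(by_year, detail_2023)
```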

36. Which of the following online analytical processing (OLAP) technologies does NOT require the
precomputation and storage of information?
A) MOLAP
B) SQL
C) HOLAP
D) ROLAP
Answer: D
Diff: 2 Page Ref: 141-142

37. Active data warehousing can be used to support the highest level of decision making sophistication and
power. The major feature that enables this in relation to handling the data is
A) nature of the data.
B) speed of data transfer.
C) country of (data) origin.
D) source of the data.
Answer: B
Diff: 2 Page Ref: 147

38. Which of the following statements is more descriptive of active data warehouses in contrast with traditional
data warehouses?
A) large numbers of users, including operational staffs
B) restrictive reporting with daily and weekly data currency
C) detailed data available for strategic use only
D) strategic decisions whose impacts are hard to measure
Answer: A
Diff: 3 Page Ref: 151

39. How does the use of cloud computing affect the scalability of a data warehouse?
A) Hardware resources are dynamically allocated as use increases.
B) Cloud vendors are mostly based overseas where the cost of labor is low.
C) Cloud computing has little effect on a data warehouse's scalability.
D) Cloud computing vendors bring as much hardware as needed to users' offices.
Answer: A
Diff: 3 Page Ref: 153

40. All of the following are true about in-database processing technology EXCEPT
A) it pushes the algorithms to where the data is.
B) it is the same as in-memory storage technology.
C) it is often used for apps like credit card fraud detection and investment risk management.
D) it makes the response to queries much faster than conventional databases.
Answer: B
Diff: 3 Page Ref: 155

41. With ________ data flows, managers can view the current state of their businesses and quickly identify
problems.
Answer: real-time
Diff: 2 Page Ref: 111

42. In ________ oriented data warehousing, operational databases are tuned to handle transactions that update
the database.
Answer: product
Diff: 2 Page Ref: 113

43. The three main types of data warehouses are data marts, operational ________, and enterprise data
warehouses.
Answer: data stores
Diff: 2 Page Ref: 114

44. ________ describe the structure and meaning of the data, contributing to their effective use.
Answer: Metadata
Diff: 1 Page Ref: 115-117

45. Most data warehouses are built using ________ database management systems to control and manage the
data.
Answer: relational
Diff: 2 Page Ref: 122

46. A(n) ________ architecture is used to build a scalable and maintainable infrastructure that includes a
centralized data warehouse and several dependent data marts.
Answer: hub-and-spoke
Diff: 2 Page Ref: 123

47. The ________ data warehouse architecture involves integrating disparate systems and analytical resources
from multiple sources to meet changing needs or business conditions.
Answer: federated
Diff: 2 Page Ref: 125

48. Data ________ comprises data access, data federation, and change capture.
Answer: integration
Diff: 3 Page Ref: 128

49. ________ is a mechanism that integrates application functionality and shares functionality (rather than data)
across systems, thereby enabling flexibility and reuse.
Answer: Enterprise application integration (EAI)
Diff: 3 Page Ref: 129

50. ________ is a mechanism for pulling data from source systems to satisfy a request for information. It is an
evolving tool space that promises real-time data integration from a variety of sources, such as relational
databases, Web services, and multidimensional databases.
Answer: Enterprise information integration (EII)
Diff: 3 Page Ref: 129

51. Performing extensive ________ to move data to the data warehouse may be a sign of poorly managed data
and a fundamental lack of a coherent data management strategy.
Answer: extraction, transformation, and load (ETL)
Diff: 3 Page Ref: 131

52. The ________ Model, also known as the EDW approach, emphasizes top-down development, employing
established database development methodologies and tools, such as entity-relationship diagrams (ERD), and
an adjustment of the spiral development approach.
Answer: Inmon
Diff: 2 Page Ref: 134

53. The ________ Model, also known as the data mart approach, is a "plan big, build small" approach. A data
mart is a subject-oriented or department-oriented data warehouse. It is a scaled-down version of a data
warehouse that focuses on the requests of a specific department, such as marketing or sales.
Answer: Kimball
Diff: 2 Page Ref: 134

54. ________ modeling is a retrieval-based system that supports high-volume query access.
Answer: Dimensional
Diff: 2 Page Ref: 138

55. Online ________ is arguably the most commonly used data analysis technique in data warehouses.
Answer: analytical processing
Diff: 1 Page Ref: 139

56. Online ________ is a term used for a transaction system that is primarily responsible for capturing and
storing data related to day-to-day business functions such as ERP, CRM, SCM, and point of sale.
Answer: transaction processing
Diff: 2 Page Ref: 140

57. In the Michigan State Agencies case, the approach used was a(n) ________ one, instead of developing
separate BI/DW platforms for each business area or state agency.
Answer: enterprise
Diff: 2 Page Ref: 146

58. The role responsible for successful administration and management of a data warehouse is the ________,
who should be familiar with high-performance software, hardware, and networking technologies, and also
possesses solid business insight.
Answer: data warehouse administrator (DWA)
Diff: 2 Page Ref: 152

59. ________, or "The Extended ASP Model," is a creative way of deploying information system applications
where the provider licenses its applications to customers for use as a service on demand (usually over the
Internet).
Answer: SaaS (software as a service)
Diff: 2 Page Ref: 153

60. ________ (also called in-database analytics) refers to the integration of the algorithmic extent of data
analytics into the data warehouse.
Answer: In-database processing
Diff: 2 Page Ref: 155

61. What is the definition of a data warehouse (DW) in simple terms?
Answer: In simple terms, a data warehouse (DW) is a pool of data produced to support decision making; it is
also a repository of current and historical data of potential interest to managers throughout the
organization.
Diff: 2 Page Ref: 111

62. A common way of introducing data warehousing is to refer to its fundamental characteristics. Describe three
characteristics of data warehousing.
Answer: • Subject oriented. Data are organized by detailed subject, such as sales, products, or customers,
containing only information relevant for decision support.
• Integrated. Integration is closely related to subject orientation. Data warehouses must place data
from different sources into a consistent format. To do so, they must deal with naming conflicts and
discrepancies among units of measure. A data warehouse is presumed to be totally integrated.
• Time variant (time series). A warehouse maintains historical data. The data do not necessarily
provide current status (except in real-time systems). They detect trends, deviations, and long-term
relationships for forecasting and comparisons, leading to decision making. Every data warehouse has
a temporal quality. Time is the one important dimension that all data warehouses must support. Data
for analysis from multiple sources contains multiple time points (e.g., daily, weekly, monthly views).
• Nonvolatile. After data are entered into a data warehouse, users cannot change or update the
data. Obsolete data are discarded, and changes are recorded as new data.
• Web based. Data warehouses are typically designed to provide an efficient computing
environment for Web-based applications.
• Relational/multidimensional. A data warehouse uses either a relational structure or a
multidimensional structure. A recent survey on multidimensional structures can be found in Romero
and Abelló (2009).
• Client/server. A data warehouse uses the client/server architecture to provide easy access for end
users.
• Real time. Newer data warehouses provide real-time, or active, data-access and analysis
capabilities (see Basu, 2003; and Bonde and Kuckuk, 2004).
• Include metadata. A data warehouse contains metadata (data about data) about how the data are
organized and how to effectively use them.
Diff: 3 Page Ref: 113-114

63. What is the definition of a data mart?
Answer: A data mart is a subset of a data warehouse, typically consisting of a single subject area (e.g.,
marketing, operations). Whereas a data warehouse combines databases across an entire enterprise, a
data mart is usually smaller and focuses on a particular subject or department.
Diff: 2 Page Ref: 114

64. Mehra (2005) indicated that few organizations really understand metadata, and fewer understand how to
design and implement a metadata strategy. How would you describe metadata?
Answer: Metadata are data about data. Metadata describe the structure of and some meaning about data,
thereby contributing to their effective or ineffective use.
Diff: 2 Page Ref: 115-117

65. According to Kassam (2002), business metadata comprise information that increases our understanding of
traditional (i.e., structured) data. What is the primary purpose of metadata?
Answer: The primary purpose of metadata should be to provide context to the reported data; that is, it provides
enriching information that leads to the creation of knowledge.
Diff: 2 Page Ref: 117

66. In the MultiCare case, how was data warehousing able to reduce septicemia mortality rates in MultiCare
hospitals?
Answer: • The Adaptive Data Warehouse™ organized and simplified data from multiple data sources across
the continuum of care. It became the single source of truth required to see care improvement
opportunities and to measure change. Integrated teams consisting of clinicians, technologists, analysts,
and quality personnel were essential for accelerating MultiCare's efforts to reduce septicemia
mortality.
• Together, the collaborative effort addressed three key bodies of work: standard of care definition,
early identification, and efficient delivery of the defined care standard.
Diff: 3 Page Ref: 118

67. Briefly describe four major components of the data warehousing process.
Answer: • Data sources. Data are sourced from multiple independent operational "legacy" systems and
possibly from external data providers (such as the U.S. Census).
Data may also come from an OLTP or ERP system.
• Data extraction and transformation. Data are extracted and properly transformed using custom-
written or commercial ETL software.
• Data loading. Data are loaded into a staging area, where they are transformed and cleansed. The
data are then ready to load into the data warehouse and/or data marts.
• Comprehensive database. Essentially, this is the EDW to support all decision analysis by
providing relevant summarized and detailed information originating from many different sources.
• Metadata. Metadata include software programs about data and rules for organizing data
summaries that are easy to index and search, especially with Web tools.
• Middleware tools. Middleware tools enable access to the data warehouse. There are many front-
end applications that business users can use to interact with data stored in the data repositories,
including data mining, OLAP, reporting tools, and data visualization tools.
Diff: 2 Page Ref: 119-120

68. There are several basic information system architectures that can be used for data warehousing. What are
they?
Answer: Generally speaking, these architectures are commonly called client/server or n-tier architectures, of
which two-tier and three-tier architectures are the most common, but sometimes there is simply one
tier.
Diff: 2 Page Ref: 120-121

69. More data, coming in faster and requiring immediate conversion into decisions, means that organizations
are confronting the need for real-time data warehousing (RDW). How would you define real-time data
warehousing?
Answer: Real-time data warehousing, also known as active data warehousing (ADW), is the process of loading
and providing data via the data warehouse as they become available.
Diff: 2 Page Ref: 147

70. Mention briefly some of the recently popularized concepts and technologies that will play a significant role
in defining the future of data warehousing.
Answer: • Sourcing (mechanisms for acquisition of data from diverse and dispersed sources):
- Web, social media, and Big Data
- Open source software
- SaaS (software as a service)
- Cloud computing
• Infrastructure (architectural–hardware and software–enhancements):
- Columnar (a new way to store and access data in the database)
- Real-time data warehousing
- Data warehouse appliances (all-in-one solutions to DW)
- Data management technologies and practices
- In-database processing technology (putting the algorithms where the data is)
- In-memory storage technology (moving the data in the memory for faster processing)
- New database management systems
- Advanced analytics
Diff: 3 Page Ref: 153-156

1. FALSE

2. TRUE

3. FALSE

4. FALSE

5. TRUE

6. FALSE

7. TRUE

8. FALSE

9. FALSE

10. TRUE

11. TRUE

12. FALSE

13. FALSE

14. TRUE

15. FALSE

16. FALSE

17. TRUE

18. FALSE

19. FALSE

20. FALSE

21. C

22. A

23. D

24. B

25. A

26. D

27. D

28. B

29. D

30. C

31. C

32. D

33. A

34. C

35. C

36. D

37. B

38. A

39. A

40. B

41. real-time

42. product

43. data stores

44. Metadata

45. relational

46. hub-and-spoke

47. federated

48. integration

49. Enterprise application integration (EAI)

50. Enterprise information integration (EII)

51. extraction, transformation, and load (ETL)

52. Inmon

53. Kimball

54. Dimensional

55. analytical processing

56. transaction processing

57. enterprise

58. data warehouse administrator (DWA)

59. SaaS (software as a service)

60. In-database processing

61. In simple terms, a data warehouse (DW) is a pool of data produced to support decision making; it is also a
repository of current and historical data of potential interest to managers throughout the organization.

62. • Subject oriented. Data are organized by detailed subject, such as sales, products, or customers, containing only
information relevant for decision support.
• Integrated. Integration is closely related to subject orientation. Data warehouses must place data from different
sources into a consistent format. To do so, they must deal with naming conflicts and discrepancies among units of
measure. A data warehouse is presumed to be totally integrated.
• Time variant (time series). A warehouse maintains historical data. The data do not necessarily provide current
status (except in real-time systems). They detect trends, deviations, and long-term relationships for forecasting and
comparisons, leading to decision making. Every data warehouse has a temporal quality. Time is the one important
dimension that all data warehouses must support. Data for analysis from multiple sources contains multiple time
points (e.g., daily, weekly, monthly views).
• Nonvolatile. After data are entered into a data warehouse, users cannot change or update the data. Obsolete
data are discarded, and changes are recorded as new data.
• Web based. Data warehouses are typically designed to provide an efficient computing environment for Web-
based applications.
• Relational/multidimensional. A data warehouse uses either a relational structure or a multidimensional
structure. A recent survey on multidimensional structures can be found in Romero and Abelló (2009).
• Client/server. A data warehouse uses the client/server architecture to provide easy access for end users.
• Real time. Newer data warehouses provide real-time, or active, data-access and analysis capabilities (see Basu,
2003; and Bonde and Kuckuk, 2004).
• Include metadata. A data warehouse contains metadata (data about data) about how the data are organized
and how to effectively use them.

63. A data mart is a subset of a data warehouse, typically consisting of a single subject area (e.g., marketing,
operations). Whereas a data warehouse combines databases across an entire enterprise, a data mart is usually
smaller and focuses on a particular subject or department.

64. Metadata are data about data. Metadata describe the structure of and some meaning about data, thereby
contributing to their effective or ineffective use.

65. The primary purpose of metadata should be to provide context to the reported data; that is, it provides enriching
information that leads to the creation of knowledge.

66. • The Adaptive Data Warehouse™ organized and simplified data from multiple data sources across the
continuum of care. It became the single source of truth required to see care improvement opportunities and to
measure change. Integrated teams consisting of clinicians, technologists, analysts, and quality personnel were
essential for accelerating MultiCare's efforts to reduce septicemia mortality.
• Together, the collaborative effort addressed three key bodies of work: standard of care definition, early
identification, and efficient delivery of the defined care standard.

67. • Data sources. Data are sourced from multiple independent operational "legacy" systems and possibly from
external data providers (such as the U.S. Census).
Data may also come from an OLTP or ERP system.
• Data extraction and transformation. Data are extracted and properly transformed using custom-written or
commercial ETL software.
• Data loading. Data are loaded into a staging area, where they are transformed and cleansed. The data are then
ready to load into the data warehouse and/or data marts.
• Comprehensive database. Essentially, this is the EDW to support all decision analysis by providing relevant
summarized and detailed information originating from many different sources.
• Metadata. Metadata include software programs about data and rules for organizing data summaries that are
easy to index and search, especially with Web tools.
• Middleware tools. Middleware tools enable access to the data warehouse. There are many front-end
applications that business users can use to interact with data stored in the data repositories, including data mining,
OLAP, reporting tools, and data visualization tools.

68. Generally speaking, these architectures are commonly called client/server or n-tier architectures, of which two-tier
and three-tier architectures are the most common, but sometimes there is simply one tier.

69. Real-time data warehousing, also known as active data warehousing (ADW), is the process of loading and
providing data via the data warehouse as they become available.

70. • Sourcing (mechanisms for acquisition of data from diverse and dispersed sources):
- Web, social media, and Big Data
- Open source software
- SaaS (software as a service)
- Cloud computing
• Infrastructure (architectural–hardware and software–enhancements):
- Columnar (a new way to store and access data in the database)
- Real-time data warehousing
- Data warehouse appliances (all-in-one solutions to DW)
- Data management technologies and practices
- In-database processing technology (putting the algorithms where the data is)
- In-memory storage technology (moving the data in the memory for faster processing)
- New database management systems
- Advanced analytics

Chapter 2
Decision Making, Systems, Modeling, and Support

True-False Questions
1. Fast decision-making requirements may be detrimental to decision quality.

Answer: True Difficulty: Easy Page Reference: 48

2. To determine how real decision makers make decisions, we must first understand the process
and the important issues of decision making.

Answer: True Difficulty: Moderate Page Reference: 48

3. Decision making is a process of choosing among two or more alternative courses of action
for the purpose of attaining a goal or goals.

Answer: True Difficulty: Easy Page Reference: 48

4. An important characteristic of management support systems is their emphasis on the
computational efficiency of obtaining a decision, rather than on the effectiveness of the
decision produced.

Answer: False Difficulty: Moderate Page Reference: 49

5. For a computerized system to successfully support a manager, it should fit the decision
situation and not the decision style.

Answer: False Difficulty: Moderate Page Reference: 50

6. A major characteristic of a decision support system is the inclusion of at least one model.

Answer: True Difficulty: Moderate Page Reference: 51

7. The collection of data and the estimation of future data are among the most difficult steps in
the analysis.

Answer: True Difficulty: Easy Page Reference: 56

8. Problem Identification is the conceptualization of a problem in an attempt to place it in a
definable category, possibly leading to a standard solution approach.

Answer: False Difficulty: Hard Page Reference: 56

9. A problem exists in an organization only if someone or some group takes on the
responsibility of attacking it and if the organization has the ability to solve it.

Answer: True Difficulty: Moderate Page Reference: 57

10. The process of modeling is pure art and not science.

Answer: False Difficulty: Easy Page Reference: 58

11. The process of modeling involves determining the (usually mathematical, sometimes
symbolic) relationships among the variables.

Answer: True Difficulty: Easy Page Reference: 58

12. An intermediate variable or a set of intermediate variables describe the environment of the
decision making.

Answer: False Difficulty: Moderate Page Reference: 58

13. “Humans are economic beings whose objective is to maximize the attainment of goals” is
one of the assumptions of rational decision makers.

Answer: True Difficulty: Easy Page Reference: 59

14. The idea of “thinking with your gut” is a heuristic approach to decision making.

Answer: True Difficulty: Easy Page Reference: 59

15. If a suboptimal decision is made in one part of the organization without considering the
details of the rest of the organization, then an optimal solution from the point of view of that
part is better for the whole.

Answer: False Difficulty: Moderate Page Reference: 61

16. Rationality is bounded only by limitations on human processing capacities but not by
individual differences.

Answer: False Difficulty: Moderate Page Reference: 64

17. The choice phase is the one in which the actual decision is made and where the commitment
to follow a certain course of action is made.

Answer: True Difficulty: Easy Page Reference: 68

18. Solving the model is the same as solving the problem the model represents.

Answer: False Difficulty: Moderate Page Reference: 69

19. The primary requirement of decision support for the intelligence phase is the ability to scan
external and internal information sources for opportunities and problems.

Answer: True Difficulty: Easy Page Reference: 70

20. Alternatives for structured problems can be generated through the use of either standard or
special models.

Answer: True Difficulty: Easy Page Reference: 72

Multiple-Choice Questions
21. Which of the following is the third phase in decision making?

a. Completion
b. Execution
c. Observation
d. Choice

Answer: d Difficulty: Moderate Page Reference: 49

22. Different decision styles require different types of support. A major factor that determines
the type of required support is whether the decision maker is __________.

a. autocratic
b. consultative
c. an individual or a group
d. democratic

Answer: c Difficulty: Moderate Page Reference: 50

23. Which of the following is a physical replica of a system, usually on a different scale from the
original?

a. Complex model
b. Iconic model
c. Duplicated model
d. Composite model

Answer: b Difficulty: Moderate Page Reference: 51

24. Which of the following models behaves like the real system but does not look like it?

a. Composite model
b. Analog model
c. Dense model
d. Iconic model

Answer: b Difficulty: Moderate Page Reference: 51

25. There is a continuous flow of activity from one phase to the next phase in a decision making
process, but at any phase there may be a return to a previous phase. __________ is an
essential part of this process.

a. Testing
b. Trial-and-error
c. Experimenting
d. Modeling

Answer: d Difficulty: Moderate Page Reference: 53

26. The identification of organizational goals and objectives related to an issue of concern and
determination of whether they are being met is the beginning of the __________ of decision
making.

a. initial phase
b. intelligence phase
c. brainstorming phase
d. generation phase

Answer: b Difficulty: Moderate Page Reference: 56

27. Which of the following phases of decision making involves finding or developing and
analyzing possible courses of action?

a. Consultation phase
b. Communication phase
c. Intelligence phase
d. Design phase

Answer: d Difficulty: Moderate Page Reference: 57

28. A(n) __________ is a criterion that describes the acceptability of a solution approach.

a. principle of choice
b. acceptable criterion
c. trade-off
d. worst-case criterion

Answer: a Difficulty: Hard Page Reference: 58

29. A(n) __________ describes the objective or goal of the decision-making problem.

a. decision variable
b. result variable
c. initial variable
d. intermediate variable

Answer: b Difficulty: Moderate Page Reference: 58

30. Finding the alternatives with the highest ratio of __________ is one of the ways to achieve
optimization.

a. profits to cost
b. margins to cost
c. goal attainment to cost
d. earnings to cost

Answer: c Difficulty: Moderate Page Reference: 59
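
Illustration for question 30: a minimal sketch of picking the alternative with the highest ratio of goal attainment to cost. The alternatives, scores, and costs below are hypothetical values chosen only for illustration; they do not come from the textbook.

```python
# Hypothetical alternatives: (name, goal attainment score, cost).
# These numbers are assumptions made up for this sketch.
alternatives = [
    ("A", 80.0, 40.0),
    ("B", 90.0, 60.0),
    ("C", 50.0, 20.0),
]


def best_by_ratio(options):
    """Return the option with the highest goal-attainment-to-cost ratio."""
    return max(options, key=lambda option: option[1] / option[2])


best = best_by_ratio(alternatives)
print(best[0], best[1] / best[2])
```

Note that alternative B has the highest goal attainment in absolute terms, yet it is not the one selected, because the criterion here is the ratio rather than the raw score.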

31. Which of the following, by definition, requires a decision maker to consider the impact of
each alternative course of action on the entire organization because a decision made in one
area may have significant effects in other areas?

a. Satisfaction
b. Worst-case
c. Feasibility
d. Optimization

Answer: d Difficulty: Easy Page Reference: 61

32. A(n) __________ checks the performance of the system for a given set of alternatives, rather
than for all alternatives. Therefore, there is no guarantee that an alternative selected with the
aid of this analysis is optimal.

a. analytical analysis
b. descriptive analysis
c. optimization analysis
d. quantitative analysis

Answer: b Difficulty: Hard Page Reference: 62

33. A __________ can help a decision maker sketch out the important qualitative factors and
their causal relationships in a messy decision-making situation.

a. mathematical map
b. cognitive map
c. qualitative map
d. narrative map

Answer: b Difficulty: Moderate Page Reference: 63

34. A __________ describes the decision and uncontrollable variables and parameters for a
specific modeling situation.

a. statement
b. model
c. program
d. scenario

Answer: d Difficulty: Moderate Page Reference: 67

35. Which of the following search approaches is not mentioned in searching for an appropriate
course of action for solving a decision-making model?

a. Analytical techniques
b. Algorithms
c. Rules of thumb
d. Tabu search

Answer: d Difficulty: Moderate Page Reference: 69

36. The __________ of a proposed solution to a problem is the initiation of a new order of things
or the introduction of change.

a. method
b. implementation
c. approach
d. style

Answer: b Difficulty: Moderate Page Reference: 69

37. Which of the following is a study of the effect of a change in one or more input variables on
a proposed solution?

a. Sensitivity analysis
b. Boundary analysis
c. Fish bone analysis
d. Input-output analysis

Answer: a Difficulty: Moderate Page Reference: 69

38. One aspect of identifying internal problems is to be able to monitor the current status of
operations. When something goes wrong, it can be identified quickly and the problem solved.
Which of the following is a tool to provide such capability?

a. Business intelligence
b. Simulation model
c. Product life-cycle management
d. Expert systems

Answer: c Difficulty: Moderate Page Reference: 71

39. The __________ involves generating alternative courses of action, discussing the criteria for
choice and their relative importance, and forecasting the future consequences of using
various alternatives.

a. initial phase
b. generation phase
c. brainstorming phase
d. design phase

Answer: d Difficulty: Moderate Page Reference: 72

40. All phases of the decision making process can be supported by improved communication by
__________ through group support systems and knowledge management systems.

a. collaborative computing
b. shared computing
c. collective computing
d. group computing

Answer: a Difficulty: Moderate Page Reference: 74

Fill In the Blanks

41. Decision making is directly influenced by several major disciplines, some of which are
behavioral, (which include anthropology, law, philosophy, political science, psychology,
social psychology, and sociology), and some of which are scientific in nature.

Difficulty: Moderate Page Reference: 49

42. A model is a simplified representation or abstraction of reality.

Difficulty: Easy Page Reference: 51

43. A major characteristic of a decision support system and many business intelligence tools is
the inclusion of at least one model.

Difficulty: Easy Page Reference: 51

44. Mental models are the descriptive representations of decision-making situations that we form
in our heads and think about.

Difficulty: Moderate Page Reference: 52

45. Intelligence in decision making involves scanning the environment, either intermittently or
continuously.

Difficulty: Moderate Page Reference: 55

46. Problem classification is the conceptualization of a problem in an attempt to place it in a
definable category, possibly leading to a standard solution approach.

Difficulty: Hard Page Reference: 56

47. Problem classification is the conceptualization of a problem in an attempt to place it in a
definable category, possibly leading to a standard solution approach. An important approach
classifies problems according to the degree of structuredness evident in them.

Difficulty: Moderate Page Reference: 56

48. A proper balance between the level of model simplification and the representation of reality
must be obtained because of the benefit/cost trade-off.

Difficulty: Hard Page Reference: 58

49. The process of modeling is a combination of art and science. As an art, a level of creativity
and finesse is required when determining what simplifying assumptions can work, how to
combine appropriate features of the model classes, and how to integrate models to obtain
valid solutions.

Difficulty: Moderate Page Reference: 58

50. A decision variable describes the alternatives a manager must choose among, e.g., how
many cars to deliver to a specific rental agency or how to advertise at specific times.

Difficulty: Moderate Page Reference: 58

51. A normative model is a model that prescribes how a system should operate.

Difficulty: Moderate Page Reference: 58

52. Suboptimization may also involve simply bounding the search for an optimum by
considering fewer criteria or alternatives or by eliminating large portions of the problem from
evaluation.

Difficulty: Hard Page Reference: 61

53. A descriptive model is extremely useful in DSS for investigating the consequences of various
alternative courses of action under different configurations of inputs and processes.

Difficulty: Moderate Page Reference: 62

54. Simulation is the imitation of reality and has been applied to many areas of decision making.

Difficulty: Easy Page Reference: 62

55. Another descriptive decision-making model is the use of narratives to describe a decision-
making situation. It is extremely effective when a group is making a decision and can lead to
a more common frame.

Difficulty: Easy Page Reference: 63

56. Aside from estimating the potential utility or value of a particular decision’s outcome, the
best decision makers are capable of accurately estimating the risk associated with the
outcomes that result from making each decision.

Difficulty: Easy Page Reference: 67

57. A what-if analysis asks a computer what the effect of changing some of the input data or
parameters would be.

Difficulty: Moderate Page Reference: 69
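
Illustration for question 57: a minimal sketch of a what-if analysis, in which one input parameter of a model is changed and the result is recomputed. The profit model and all numbers are hypothetical assumptions made for this sketch, not figures from the textbook.

```python
def profit(units, price, unit_cost, fixed_cost):
    """Hypothetical profit model: contribution margin minus fixed cost."""
    return units * (price - unit_cost) - fixed_cost


# Base-case inputs (assumed values).
base = dict(units=1000, price=12.0, unit_cost=7.0, fixed_cost=3000.0)

# What if the unit cost rises from 7.0 to 8.0 and everything else stays the same?
what_if = dict(base, unit_cost=8.0)

base_profit = profit(**base)
changed_profit = profit(**what_if)
print(base_profit, changed_profit, changed_profit - base_profit)
```

Repeating the same recomputation over a range of values for one input, rather than a single changed value, turns the what-if question into a simple sensitivity analysis.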

58. The model is the critical component in the decision-making process, but one may make a
number of errors in its development and use. Validating the model before it is used is
critical.

Difficulty: Easy Page Reference: 68

59. A solution to a model is a specific set of values for the decision variables in a selected
alternative.

Difficulty: Moderate Page Reference: 69

60. An algorithm is a step-by-step search in which improvement is made at every step until the
best solution is found.

Difficulty: Moderate Page Reference: 69

Essay Questions
61. Compare and contrast decision making by an individual with decision making by a group.

Obviously when an individual is making a decision, there are no group dynamics. An
individual can focus in on a problem, work on it, and come up with a solution. With a group,
there can be politicking, groupthink, and other potential dysfunctions. There can also be
synergy, because each member of a group brings different facts and abilities to bear.

Difficulty: Easy Page Reference: 50

62. Discuss the importance of decision style.

Decision style is the manner in which decision makers think and react to problems.
This includes their cognitive response, their values, beliefs, and perceptions. These
factors can vary greatly amongst individuals; as a result decisions can vary greatly.

Difficulty: Easy Page Reference: 50

63. Describe the different categories of models.

• Iconic. An iconic model is a physical replica of a system, usually on a different scale.
• Analog. An analog model is more abstract than an iconic model. It is a model that
behaves like a system but does not physically look like the system.
• Mathematical. The complexity of relationships in many organizational systems
cannot be represented by icons or analogically because such representations would
soon become cumbersome, and using them would be time-consuming. Therefore,
more abstract models are described mathematically.

Difficulty: Moderate Page Reference: 51

64. List five benefits of using models.

• Model manipulation is easier than manipulating the real system.
• Models enable compression of time.
• The cost of model analysis is less than the cost of a similar experiment using the real
system.
• The cost of making mistakes during the trial-and-error experiment is less using a
model.
• Models enable managers to estimate the risk of their actions.
• Mathematical models enable analysis of a large number of possible solutions.
• Models enhance learning and training.
• Models are readily available over the Web.
• There are many Java applets that readily solve models.

Difficulty: Moderate Page Reference: 52

65. Briefly describe Simon’s four phases of decision making.

• Intelligence phase. Reality is examined, and the problem is identified and defined.
• Design phase. A model that represents the system is constructed by making
assumptions that simplify reality. The model is then validated, and criteria are
determined for evaluation of the alternative courses of action that are identified.
• Choice phase. Select a proposed solution to the problem.
• Implementation phase. Successful implementation results in solving the real
problem. Failure leads to a return to an earlier phase of the process.

Difficulty: Easy Page Reference: 53

66. Briefly describe the steps of the intelligence phase of decision making.

• Problem identification. The intelligence phase begins with the identification of
organizational goals and objectives related to an issue of concern, and determination
of whether they are being met.
• Problem classification. Problem classification is the conceptualization of a problem
in an attempt to place it in a definable category.
• Problem decomposition. Many complex problems can be divided into subproblems.
Solving the simpler subproblems may help in solving the complex problem.
• Problem ownership. A problem exists in an organization only if someone or some
group takes on the responsibility of attacking it and if the organization has the ability
to solve it.

Difficulty: Moderate Page Reference: 55

67. Briefly describe problem decomposition.

Many complex problems can be divided into subproblems. Solving the simpler subproblems
may help in solving the complex problem. Also, seemingly poorly structured problems
sometimes have highly structured subproblems. Just as a semistructured problem results
when some phases of decision making are structured while other phases are unstructured, so
when some subproblems of a decision making problem are structured with others
unstructured, the problem itself is semistructured. Decomposition also facilitates
communication among decision makers.

Difficulty: Moderate Page Reference: 56

68. Describe the three assumptions of rational decision makers used in normative decision
theory.

• Humans are economic beings whose objective is to maximize the attainment of
goals; that is, the decision maker is rational. (More of a good thing [revenue, fun] is
better than less; less of a bad thing [cost, pain] is better than more.)
• For a decision-making situation, all viable alternative courses of action and their
consequences, or at least the probability and the values of the consequences, are
known.
• Decision makers have an order or preference that enables them to rank the
desirability of all consequences of the analysis (best to worst).

Difficulty: Moderate Page Reference: 59

69. Compare the normative and descriptive approaches to decision making.

Normative refers to models that tell you what you should do. These are prescriptive
models that usually utilize optimization.
Descriptive models are those that tell you "what-if." These are usually simulation
models.

Difficulty: Easy Page Reference: 62

70. Discuss why scenarios play an important role in management support systems.

• They help identify opportunities and problem areas.
• They provide flexibility in planning.
• They identify the leading edges of changes that management should monitor.
• They help validate major modeling assumptions.
• They allow the decision maker to explore the behavior of a system through a model.
• They help to check the sensitivity of proposed solutions to changes in the
environment as described by the scenario.

Difficulty: Easy Page Reference: 68


Sharda bia10e tif 14

Data Mining (Taibah University)


Business Intelligence and Analytics: Systems for Decision Support, 10e (Sharda)
Chapter 14 Business Analytics: Emerging Trends and Future Impacts

1) Oklahoma Gas & Electric employs a two-layer information architecture involving data
warehouse and improved and expanded integration.
Answer: FALSE
Diff: 2 Page Ref: 593

2) In the classification of location-based analytic applications, examining geographic site
locations falls in the consumer-oriented category.
Answer: FALSE
Diff: 2 Page Ref: 595

3) In the Great Clips case study, the company uses geospatial data to analyze, among other
things, the types of haircuts most popular in different geographic locations.
Answer: FALSE
Diff: 2 Page Ref: 596-597

4) From massive amounts of high-dimensional location data, algorithms that reduce the
dimensionality of the data can be used to uncover trends, meaning, and relationships to
eventually produce human-understandable representations.
Answer: TRUE
Diff: 2 Page Ref: 598

5) In the life coach case study, Kaggle recently hosted a competition aimed at identifying muscle
motions that may be used to predict the progression of Alzheimer's disease.
Answer: TRUE
Diff: 2 Page Ref: 601

6) Content-based filtering approaches are widely used in recommending textual content such as
news items and related Web pages.
Answer: TRUE
Diff: 2 Page Ref: 604

7) The basic premise behind social networking is that it gives people the power to share, making
the world more open and connected.
Answer: TRUE
Diff: 2 Page Ref: 605

8) Cloud computing originates from a reference to the Internet as a "cloud" and is a combination
of several information technology components as services.
Answer: TRUE
Diff: 2 Page Ref: 607

9) Web-based e-mail services such as Google's Gmail are not examples of cloud computing.
Answer: FALSE
Diff: 2 Page Ref: 607
Copyright © 2015 Pearson Education, Inc.

10) Service-oriented DSS solutions generally offer individual or bundled services to the user as a
service.
Answer: TRUE
Diff: 2 Page Ref: 608

11) Data-as-a-service began with the notion that data quality could happen in a centralized place,
cleansing and enriching data and offering it to different systems, applications, or users,
irrespective of where they were in the organization, computers, or on the network.
Answer: TRUE
Diff: 2 Page Ref: 609

12) In service-oriented DSS, an application programming interface (API) serves to populate
source systems with raw data and to pull operational reports.
Answer: TRUE
Diff: 2 Page Ref: 610

13) IaaS helps provide faster information, but provides information only to managers in an
organization.
Answer: FALSE
Diff: 2 Page Ref: 611

14) The trend in the consumption of data analytics is away from in-memory solutions and towards
mobile devices.
Answer: FALSE
Diff: 2 Page Ref: 611

15) While cloud services are useful for small and midsize analytic applications, they are still
limited in their ability to handle Big Data applications.
Answer: FALSE
Diff: 2 Page Ref: 612

16) Analytics integration with other organizational systems makes it harder to identify its impact
on the organization.
Answer: TRUE
Diff: 2 Page Ref: 613

17) Use of automated decision systems (ADSs) is likely to result in a reduction of middle
management.
Answer: TRUE
Diff: 1 Page Ref: 614

18) The industry impact of an automated decision system's use is limited to the company's supply
chain.
Answer: FALSE
Diff: 2 Page Ref: 614

19) ES/DSS were found to improve the performance of new managers but not existing managers.
Answer: FALSE
Diff: 2 Page Ref: 615

20) In designing analytic systems, it must be kept in mind that the right to an individual's privacy
is not absolute.
Answer: TRUE
Diff: 2 Page Ref: 617

21) What kind of location-based analytics is real-time marketing promotion?
A) organization-oriented geospatial static approach
B) organization-oriented location-based dynamic approach
C) consumer-oriented geospatial static approach
D) consumer-oriented location-based dynamic approach
Answer: B
Diff: 2 Page Ref: 595

22) GPS navigation is an example of which kind of location-based analytics?
A) organization-oriented geospatial static approach
B) organization-oriented location-based dynamic approach
C) consumer-oriented geospatial static approach
D) consumer-oriented location-based dynamic approach
Answer: C
Diff: 2 Page Ref: 595

23) What new geometric data type in Teradata's data warehouse captures geospatial features?
A) NAVTEQ
B) ST_GEOMETRY
C) GIS
D) SQL/MM
Answer: B
Diff: 2 Page Ref: 596

24) A British company called Path Intelligence has developed a system that ascertains how
people move within a city or even within a store. What is this system called?
A) Pathfinder
B) PathMiner
C) Footpath
D) Pathdata
Answer: C
Diff: 2 Page Ref: 598

25) Today, most smartphones are equipped with various instruments to measure jerk and
orientation and to sense motion. One of these instruments is an accelerometer, and the other is a(n)
A) potentiometer.
B) gyroscope.
C) microscope.
D) oscilloscope.
Answer: B
Diff: 2 Page Ref: 601

26) Content-based filtering obtains detailed information about item characteristics and restricts
this process to a single user using information tags or
A) keywords.
B) passphrases.
C) key-pairs.
D) reality mining.
Answer: A
Diff: 2 Page Ref: 604

27) Service-oriented thinking is one of the fastest growing paradigms in today's economy. Which
of the following is NOT a characteristic of service-oriented DSS?
A) reusability
B) substitutability
C) extensibility
D) originality
Answer: D
Diff: 2 Page Ref: 608

28) All of the following are components in a service-oriented DSS environment EXCEPT
A) information technology as enabler.
B) data as infrastructure.
C) process as beneficiary.
D) people as user.
Answer: B
Diff: 2 Page Ref: 608

29) Which of the following is true of data-as-a-service (DaaS) platforms?
A) Knowing where the data resides is critical to the functioning of the platform.
B) There are standardized processes for accessing data wherever it is located.
C) Business processes can access local data only.
D) Data quality happens on each individual platform.
Answer: B
Diff: 2 Page Ref: 608

30) Which component of service-oriented DSS can be described as a subset of a data warehouse
that supports specific decision and analytical needs and provides business units more flexibility,
control, and responsibility?
A) information delivery portals
B) information services with library and administrator
C) extract, transform, load
D) data marts
Answer: D
Diff: 2 Page Ref: 610

31) Which component of service-oriented DSS can be described as optimizing the DSS
environment use by organizing its capabilities and knowledge, and assimilating them into the
business processes?
A) information delivery portals
B) information services with library and administrator
C) extract, transform, load
D) data marts
Answer: B
Diff: 2 Page Ref: 610

32) Which component of service-oriented DSS can be defined as data that describes the meaning
and structure of business data, as well as how it is created, accessed, and used?
A) application programming interface
B) analytics
C) operations and administration
D) metadata management
Answer: D
Diff: 2 Page Ref: 610

33) Which component of service-oriented DSS includes such examples as optimization, data
mining, text mining, simulation, automated decision systems?
A) application programming interface
B) analytics
C) operations and administration
D) metadata management
Answer: B
Diff: 2 Page Ref: 610

34) Which of the following offers a flexible data integration platform based on a newer
generation of service-oriented standards that enables ubiquitous access to any type of data?
A) EAI
B) EII
C) IaaS
D) ETL
Answer: C
Diff: 2 Page Ref: 611

35) When new analytics applications are introduced and affect multiple related processes and
departments, the organization is best served by utilizing
A) business flow management.
B) multi-department analysis.
C) process flow analysis.
D) business process reengineering.
Answer: D
Diff: 2 Page Ref: 614

36) Research into managerial use of DSS and expert systems found all the following EXCEPT
A) managers spent more of their time planning.
B) managers saw their decision making quality enhanced.
C) managers spent more time in the office and less in the field.
D) managers were able to devote less of their time fighting fires.
Answer: C
Diff: 2 Page Ref: 615

37) Why do analytics applications have the effect of redistributing power among managers?
A) The more information and analysis managers have, the more power they possess.
B) Sponsoring an analytics system automatically confers power to a manager.
C) New analytics applications change managers' job expectations.
D) New analytics systems lead to new budget allocations, resulting in increased power.
Answer: A
Diff: 2 Page Ref: 616

38) Services that let consumers permanently enter a profile of information along with a password
and use this information repeatedly to access services at multiple sites are called
A) consumer access applications.
B) information collection portals.
C) single-sign-on facilities.
D) consumer information sign on facilities.
Answer: C
Diff: 2 Page Ref: 617

39) Which of the following is true about the furtherance of homeland security?
A) There is a lessening of privacy issues.
B) There is a greater need for oversight.
C) The impetus was the need to harvest information related to financial fraud after 2001.
D) Most people regard analytic tools as mostly ineffective in increasing security.
Answer: B
Diff: 2 Page Ref: 618

40) Which of the following is considered the economic engine of the whole analytics industry?
A) application developers and system integrators
B) analytics user organizations
C) analytics industry analysts and influencers
D) academic providers and certification industries
Answer: B
Diff: 2 Page Ref: 625

41) In the opening vignette, the combination of field infrastructure, geospatial data, enterprise
data warehouse, and analytics has enabled OG&E to manage its customer demand in such a way
that it can optimize its ________ investments.
Answer: long-term
Diff: 2 Page Ref: 593

42) A critical emerging trend in analytics is the incorporation of location data. ________ data is
the static location data used by these location-based analytic applications.
Answer: Geospatial
Diff: 2 Page Ref: 594

43) The surge in location-enabled services has resulted in ________ mining, the analytics of
massive databases of historical and real-time streaming location information.
Answer: reality
Diff: 2 Page Ref: 598

44) The Radii mobile app collects information about the user's habits, interests, spending
patterns, and favorite locations to understand the user's ________.
Answer: personality
Diff: 2 Page Ref: 599

45) Predictive analytics is beginning to enable development of software that is directly used by a
consumer. One key concern in employing these technologies is the loss of ________.
Answer: privacy
Diff: 2 Page Ref: 602

46) Collaborative filtering is usually done by building a user-item ratings matrix where each row
represents a unique user and each column gives the individual item rating made by the user. The
resultant matrix is a dynamic, sparse matrix with a huge ________.
Answer: dimensionality
Diff: 2 Page Ref: 603
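
Illustration for question 46: a minimal sketch of a user-item ratings matrix, with one row per user and one column per item, and of how its sparsity can be measured. The user names, item names, and ratings are hypothetical; a real recommender system would have far more rows and columns, which is what makes the matrix both sparse and high-dimensional.

```python
# Hypothetical (user, item, rating) observations, assumed for illustration only.
ratings = [
    ("ann", "item1", 5),
    ("ann", "item3", 3),
    ("bob", "item2", 4),
    ("cara", "item1", 2),
    ("cara", "item4", 5),
]

users = sorted({user for user, _, _ in ratings})
items = sorted({item for _, item, _ in ratings})

# One row per user, one column per item; None marks an item the user has not rated.
matrix = [[None] * len(items) for _ in users]
for user, item, rating in ratings:
    matrix[users.index(user)][items.index(item)] = rating

# Sparsity is the share of user-item cells that hold no rating.
filled = sum(cell is not None for row in matrix for cell in row)
sparsity = 1 - filled / (len(users) * len(items))

for user, row in zip(users, matrix):
    print(user, row)
print(round(sparsity, 3))
```

The matrix is dynamic in the sense that every new rating, user, or item changes it, so in practice it is rebuilt or updated incrementally rather than constructed once.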

47) ________, which stands for Asynchronous JavaScript and XML, is an effective and efficient
Web development technique for creating interactive Web applications.
Answer: Ajax
Diff: 2 Page Ref: 605

48) ________ (IaaS) promises to eliminate independent silos of data that exist in systems and
infrastructure and enable sharing real-time information for emerging apps, to hide complexity,
and to increase availability with virtualization.
Answer: Information-as-a-service
Diff: 3 Page Ref: 611

49) IaaS, AaaS and other ________-based offerings allow the rapid diffusion of advanced
analysis tools among users, without significant investment in technology acquisition.
Answer: cloud
Diff: 2 Page Ref: 612

50) A major structural change that can occur when analytics are introduced into an organization
is the creation of new organizational ________.
Answer: units
Diff: 2 Page Ref: 613

51) When an organization-wide, major restructuring is needed, the process is referred to as
________.
Answer: reengineering
Diff: 2 Page Ref: 614

52) ADSs can lead in many cases to improved customer ________ (e.g., responding faster to
queries).
Answer: service
Diff: 2 Page Ref: 614

53) A research study found that employees using ADS systems were more ________ with their
jobs.
Answer: satisfied
Diff: 2 Page Ref: 614

54) Analytics can change the way in which many ________ are made by managers and can
consequently change their jobs.
Answer: decisions
Diff: 2 Page Ref: 615

55) As face-to-face communication is often replaced by e-mail, wikis, and computerized
conferencing, leadership qualities attributed to physical ________ could become less important.
Answer: appearance
Diff: 2 Page Ref: 615

56) Location information from ________ phones can be used to create profiles of user behavior
and movement.
Answer: mobile/cell
Diff: 2 Page Ref: 617

57) For individual decision makers, ________ values constitute a major factor in the issue of
ethical decision making.
Answer: personal
Diff: 2 Page Ref: 619

58) Firms such as Nielsen provide ________ data collection, aggregation, and distribution
mechanisms and typically focus on one industry sector.
Answer: specialized
Diff: 2 Page Ref: 622

59) Possibly the biggest recent growth in analytics has been in ________ analytics, as many
statistical software companies such as SAS and SPSS embraced it early on.
Answer: predictive
Diff: 2 Page Ref: 623

60) Southern States Cooperative used analytics to prepare the customized catalogs to suit the
targeted ________ needs, resulting in better revenue generation.
Answer: customer
Diff: 2 Page Ref: 631

61) How does Oklahoma Gas and Electric use the Teradata platform to manage the electric grid?
Answer: Oklahoma Gas and Electric uses the Teradata platform to organize the large amounts of
data that it gathers from installation of smart meters and other devices on the electric grid at
the consumer end. With Teradata's platform, OG&E has combined its smart meter data, outage
data, call center data, rate data, asset data, price signals, billing, and collections into one
integrated data platform. The platform also incorporates geospatial mapping of the integrated
data using the in-database geospatial analytics that add onto OG&E's dynamic segmentation
capabilities.
Diff: 2 Page Ref: 593

62) How do the traditional location-based analytic techniques using geocoding of organizational
locations and consumers hamper the organizations in understanding "true location-based"
impacts?
Answer: Locations based on postal codes offer an aggregate view of a large geographic area.
This poor granularity may not be able to pinpoint the growth opportunities within a region. The
location of the target customers can change rapidly. An organization's promotional campaigns
might not target the right customers.
Diff: 2 Page Ref: 595

63) In what ways can communications companies use geospatial analysis to harness their data
effectively?
Answer: Communication companies often generate massive amounts of data every day. The
ability to analyze the data quickly with a high level of location-specific granularity can better
identify the customer churn and help in formulating strategies specific to locations for increasing
operational efficiency, quality of service, and revenue.
Diff: 2 Page Ref: 597

64) Describe the CabSense application used by the New York City Taxi and Limousine
Commission.
Answer: Sense Networks has built a mobile application called CabSense that analyzes large
amounts of data from the New York City Taxi and Limousine Commission. CabSense helps
New Yorkers and visitors in finding the best corners for hailing a taxi based on the person's
location, day of the week, and time. CabSense rates the street corners on a 5-point scale by
making use of machine-learning algorithms applied to the vast amounts of historical location
points obtained from the pickups and drop-offs of all New York City cabs. Although the app
does not give the exact location of cabs in real time, its data-crunching predictions enable people
to get to a street corner that has the highest probability of finding a cab.
Diff: 3 Page Ref: 600

65) What are recommender systems, how are they developed, and how is the data used to build a
recommendation system obtained?
Answer:
• The term recommender system refers to a Web-based information filtering system that takes
inputs from users and then aggregates those inputs to provide recommendations to other users
in their product or service selection choices.
• Two basic approaches that are employed in the development of recommendation systems are
collaborative filtering and content filtering.
In collaborative filtering, the recommendation system is built based on the individual user's
past behavior by keeping track of the previous history of all purchased items. This includes
products, items that are viewed most often, and ratings that are given by the users to the items
they purchased.
In the content-based filtering approach, the characteristics of an item are profiled first and
then content-based individual user profiles are built to store the information about the
characteristics of specific items that the user has rated in the past. In the recommendation
process, the characteristics of the items that the user has rated positively are drawn from the
user profile and compared with the characteristics of new products that the user has not rated
yet. Recommendations are made if similarities are found in the item characteristics.
• The data necessary to build a recommendation system are collected by Web-based systems
in which each user is specifically asked to rate an item on a rating scale, rank the items from
most favorite to least favorite, and/or list the attributes of the items that the user likes.
Diff: 3 Page Ref: 603
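
Illustration (not part of the original answer key): the content-based filtering approach described in the answer to question 65 can be sketched in a few lines of Python. The item names, the characteristic sets, the like threshold of 4, and the use of Jaccard similarity are all assumptions chosen for this example; the text itself does not prescribe a particular similarity measure.

```python
# Hypothetical item profiles: each item is described by a set of characteristics.
ITEM_FEATURES = {
    "laptop_a": {"electronics", "portable", "work"},
    "tablet_b": {"electronics", "portable", "media"},
    "novel_c": {"book", "fiction"},
    "headphones_d": {"electronics", "media", "audio"},
}


def build_user_profile(ratings, item_features, like_threshold=4):
    """Collect the characteristics of every item the user rated positively."""
    profile = set()
    for item, rating in ratings.items():
        if rating >= like_threshold:
            profile |= item_features[item]
    return profile


def recommend(ratings, item_features, min_similarity=0.3):
    """Rank items the user has not rated by Jaccard similarity to the profile."""
    profile = build_user_profile(ratings, item_features)
    scored = []
    for item, features in item_features.items():
        if item in ratings:
            continue  # only consider items the user has not rated yet
        union = profile | features
        similarity = len(profile & features) / len(union) if union else 0.0
        if similarity >= min_similarity:
            scored.append((item, round(similarity, 2)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Assumed ratings on a 1-5 scale: the user liked the laptop and disliked the novel.
ratings = {"laptop_a": 5, "novel_c": 2}
print(recommend(ratings, ITEM_FEATURES))
```

With these assumed ratings, only the tablet shares enough characteristics with the liked laptop to be recommended; the headphones overlap on a single characteristic and fall below the similarity cutoff.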

66) Web 2.0 is the popular term for describing advanced Web technologies and applications.
Describe four main representative characteristics of the Web 2.0 environment.
Answer:
• Web 2.0 has the ability to tap into the collective intelligence of users. The more users
contribute, the more popular and valuable a Web 2.0 site becomes.
• Data is made available in new or never-intended ways. Web 2.0 data can be remixed or
"mashed up," often through Web service interfaces, much the way a dance-club DJ mixes music.
• Web 2.0 relies on user-generated and user-controlled content and data.
• Lightweight programming techniques and tools let nearly anyone act as a Web site developer.
• The virtual elimination of software-upgrade cycles makes everything a perpetual beta or
work-in-progress and allows rapid prototyping, using the Web as an application development
platform.
• Users can access applications entirely through a browser.
• An architecture of participation and digital democracy encourages users to add value to the
application as they use it.
• A major emphasis is on social networks and computing.
• There is strong support for information sharing and collaboration.
• Web 2.0 fosters rapid and continuous creation of new business models.
Diff: 3 Page Ref: 605

67) What is mobile social networking and how does it extend the reach of popular social networks?
Answer: Mobile social networking refers to social networking where members converse and
connect with one another using cell phones or other mobile devices. Virtually all major social
networking sites offer mobile services or apps on smartphones to access their services. The
explosion of mobile Web 2.0 services and companies means that many social networks can be
based from cell phones and other portable devices, extending the reach of such networks to the
millions of people who lack regular or easy access to computers.
Diff: 2 Page Ref: 606

68) What is cloud computing? What is Amazon's general approach to the cloud computing
services it provides?
Answer:
• Wikipedia defines cloud computing as "a style of computing in which dynamically scalable
and often virtualized resources are provided over the Internet. Users need not have knowledge of,
experience in, or control over the technology infrastructures in the cloud that supports them."
• Amazon.com has developed an impressive technology infrastructure for e-commerce as well
as for business intelligence, customer relationship management, and supply chain management.
It has built major data centers to manage its own operations. However, through Amazon.com's
cloud services, many other companies can employ these very same facilities to gain advantages
of these technologies without having to make a similar investment. Like other cloud-computing
services, a user can subscribe to any of the facilities on a pay-as-you-go basis. This model of
letting someone else own the hardware and software but making use of the facilities on a pay-
per-use basis is the cornerstone of cloud computing.
Diff: 2 Page Ref: 607

69) Data and text mining is a promising application of AaaS. What additional capabilities can
AaaS bring to the analytic world?
Answer: It can also be used for large-scale optimization, highly complex multi-criteria decision
problems, and distributed simulation models. These prescriptive analytics require highly capable
systems that can only be realized using service-based collaborative systems that can utilize large-
scale computational resources.
Diff: 3 Page Ref: 612

70) Describe your understanding of the emerging term people analytics. Are there any privacy
issues associated with the application?
Answer:
• Applications such as sensor-embedded badges that employees wear to track their
movement and predict behavior have given rise to the term people analytics. This application area
combines organizational IT impact, Big Data, and sensors, and it raises privacy concerns. One company,
Sociometric Solutions, has reported several such applications of their sensor-embedded badges.
• People analytics creates major privacy issues. Should the companies be able to monitor their
employees this intrusively? Sociometric has reported that its analytics are only reported on an
aggregate basis to their clients. No individual user data is shared. They have noted that some
employers want to get individual employee data, but their contract explicitly prohibits this type
of sharing. In any case, sensors are leading to another level of surveillance and analytics, which
poses interesting privacy, legal, and ethical questions.
Diff: 2 Page Ref: 619

Business Intelligence and Analytics: Systems for Decision Support, 10e (Sharda)
Chapter 13 Big Data and Analytics

1) In the opening vignette, the CERN Data Aggregation System (DAS), built on MongoDB (a
Big Data management infrastructure), used relational database technology.
Answer: FALSE
Diff: 2 Page Ref: 544

2) The term "Big Data" is relative as it depends on the size of the using organization.
Answer: TRUE
Diff: 2 Page Ref: 546

3) In the Luxottica case study, outsourcing enhanced the ability of the company to gain insights
into their data.
Answer: FALSE
Diff: 2 Page Ref: 550-551

4) Many analytics tools are too complex for the average user, and this is one justification for Big
Data.
Answer: TRUE
Diff: 2 Page Ref: 552

5) In the investment bank case study, the major benefit brought about by the supplanting of
multiple databases by the new trade operational store was providing real-time access to trading
data.
Answer: TRUE
Diff: 2 Page Ref: 555

6) Big Data uses commodity hardware, which is expensive, specialized hardware that is custom
built for a client or application.
Answer: FALSE
Diff: 2 Page Ref: 556

7) MapReduce can be easily understood by skilled programmers due to its procedural nature.
Answer: TRUE
Diff: 2 Page Ref: 558

8) Hadoop was designed to handle petabytes and exabytes of data distributed over multiple
nodes in parallel.
Answer: TRUE
Diff: 2 Page Ref: 558

9) Hadoop and MapReduce require each other to work.
Answer: FALSE
Diff: 2 Page Ref: 562

10) In most cases, Hadoop is used to replace data warehouses.
Answer: FALSE
Diff: 2 Page Ref: 562

11) Despite their potential, many current NoSQL tools lack mature management and monitoring
tools.
Answer: TRUE
Diff: 2 Page Ref: 562

12) The data scientist is a profession for a field that is still largely being defined.
Answer: TRUE
Diff: 2 Page Ref: 565

13) There is a current undersupply of data scientists for the Big Data market.
Answer: TRUE
Diff: 2 Page Ref: 567

14) The Big Data and Analysis in Politics case study makes it clear that the unpredictability of
elections makes politics an unsuitable arena for Big Data.
Answer: FALSE
Diff: 2 Page Ref: 568

15) For low latency, interactive reports, a data warehouse is preferable to Hadoop.
Answer: TRUE
Diff: 2 Page Ref: 573

16) If you have many flexible programming languages running in parallel, Hadoop is preferable
to a data warehouse.
Answer: TRUE
Diff: 2 Page Ref: 573

17) In the Dublin City Council case study, GPS data from the city's buses and CCTV were the
only data sources for the Big Data GIS-based application.
Answer: FALSE
Diff: 2 Page Ref: 575-576

18) It is important for Big Data and self-service business intelligence to go hand in hand to get
maximum value from analytics.
Answer: TRUE
Diff: 1 Page Ref: 579

19) Big Data simplifies data governance issues, especially for global firms.
Answer: FALSE
Diff: 2 Page Ref: 580

20) Current total storage capacity lags behind the digital information being generated in the
world.
Answer: TRUE
Diff: 2 Page Ref: 581

21) Using data to understand customers/clients and business operations to sustain and foster
growth and profitability is
A) easier with the advent of BI and Big Data.
B) essentially the same now as it has always been.
C) an increasingly challenging task for today's enterprises.
D) now completely automated with no human intervention required.
Answer: C
Diff: 2 Page Ref: 546

22) A newly popular unit of data in the Big Data era is the petabyte (PB), which is
A) 10^9 bytes.
B) 10^12 bytes.
C) 10^15 bytes.
D) 10^18 bytes.
Answer: C
Diff: 2 Page Ref: 548

23) Which of the following sources is likely to produce Big Data the fastest?
A) order entry clerks
B) cashiers
C) RFID tags
D) online customers
Answer: C
Diff: 2 Page Ref: 549

24) Data flows can be highly inconsistent, with periodic peaks, making data loads hard to
manage. What is this feature of Big Data called?
A) volatility
B) periodicity
C) inconsistency
D) variability
Answer: D
Diff: 2 Page Ref: 549

25) In the Luxottica case study, what technique did the company use to gain visibility into its
customers?
A) visibility analytics
B) data integration
C) focus on growth
D) customer focus
Answer: B
Diff: 2 Page Ref: 550-551

26) Allowing Big Data to be processed in memory and distributed across a dedicated set of nodes
can solve complex problems in near-real time with highly accurate insights. What is this
process called?
A) in-memory analytics
B) in-database analytics
C) grid computing
D) appliances
Answer: A
Diff: 2 Page Ref: 553

27) Which Big Data approach promotes efficiency, lower cost, and better performance by
processing jobs in a shared, centrally managed pool of IT resources?
A) in-memory analytics
B) in-database analytics
C) grid computing
D) appliances
Answer: C
Diff: 2 Page Ref: 553

28) How does Hadoop work?
A) It integrates Big Data into a whole so large data elements can be processed as a whole on one
computer.
B) It integrates Big Data into a whole so large data elements can be processed as a whole on
multiple computers.
C) It breaks up Big Data into multiple parts so each part can be processed and analyzed at the
same time on one computer.
D) It breaks up Big Data into multiple parts so each part can be processed and analyzed at the
same time on multiple computers.
Answer: D
Diff: 3 Page Ref: 558

29) What is the Hadoop Distributed File System (HDFS) designed to handle?
A) unstructured and semistructured relational data
B) unstructured and semistructured non-relational data
C) structured and semistructured relational data
D) structured and semistructured non-relational data
Answer: B
Diff: 2 Page Ref: 558

30) In a Hadoop "stack," what is a slave node?
A) a node where bits of programs are stored
B) a node where metadata is stored and used to organize data processing
C) a node where data is stored and processed
D) a node responsible for holding all the source programs
Answer: C
Diff: 2 Page Ref: 559

31) In a Hadoop "stack," what node periodically replicates and stores data from the Name Node
should it fail?
A) backup node
B) secondary node
C) substitute node
D) slave node
Answer: B
Diff: 2 Page Ref: 559

32) All of the following statements about MapReduce are true EXCEPT
A) MapReduce is a general-purpose execution engine.
B) MapReduce handles the complexities of network communication.
C) MapReduce handles parallel programming.
D) MapReduce runs without fault tolerance.
Answer: D
Diff: 2 Page Ref: 562

33) In the Big Data and Analytics in Politics case study, which of the following was an input to
the analytic system?
A) census data
B) assessment of sentiment
C) voter mobilization
D) group clustering
Answer: A
Diff: 2 Page Ref: 568

34) In the Big Data and Analytics in Politics case study, what was the analytic system output or
goal?
A) census data
B) assessment of sentiment
C) voter mobilization
D) group clustering
Answer: C
Diff: 2 Page Ref: 568

35) Traditional data warehouses have not been able to keep up with
A) the evolution of the SQL language.
B) the variety and complexity of data.
C) expert systems that run on them.
D) OLAP.
Answer: B
Diff: 2 Page Ref: 570

36) Under which of the following requirements would it be more appropriate to use Hadoop over
a data warehouse?
A) ANSI 2003 SQL compliance is required
B) online archives alternative to tape
C) unrestricted, ungoverned sandbox explorations
D) analysis of provisional data
Answer: C
Diff: 2 Page Ref: 573

37) What is Big Data's relationship to the cloud?
A) Hadoop cannot be deployed effectively in the cloud just yet.
B) Amazon and Google have working Hadoop cloud offerings.
C) IBM's homegrown Hadoop platform is the only option.
D) Only MapReduce works in the cloud; Hadoop does not.
Answer: B
Diff: 2 Page Ref: 575-577

38) Companies with the largest revenues from Big Data tend to be
A) the largest computer and IT services firms.
B) small computer and IT services firms.
C) pure open source Big Data firms.
D) non-U.S. Big Data firms.
Answer: A
Diff: 2 Page Ref: 578

39) In the health sciences, the largest potential source of Big Data comes from
A) accounting systems.
B) human resources.
C) patient monitoring.
D) research administration.
Answer: C
Diff: 2 Page Ref: 587

40) In the Discovery Health insurance case study, the analytics application used available data to
help the company do all of the following EXCEPT
A) predict customer health.
B) detect fraud.
C) lower costs for members.
D) open its own pharmacy.
Answer: D
Diff: 2 Page Ref: 589-591

41) Most Big Data is generated automatically by ________.
Answer: machines
Diff: 2 Page Ref: 546

42) ________ refers to the conformity to facts: accuracy, quality, truthfulness, or trustworthiness
of the data.
Answer: Veracity
Diff: 2 Page Ref: 549

43) In-motion ________ is often overlooked today in the world of BI and Big Data.
Answer: analytics
Diff: 2 Page Ref: 549

44) The ________ of Big Data is its potential to contain more useful patterns and interesting
anomalies than "small" data.
Answer: value proposition
Diff: 2 Page Ref: 549

45) As the size and the complexity of analytical systems increase, the need for more ________
analytical systems is also increasing to obtain the best performance.
Answer: efficient
Diff: 2 Page Ref: 553

46) ________ speeds time to insights and enables better data governance by performing data
integration and analytic functions inside the database.
Answer: In-database analytics
Diff: 2 Page Ref: 553

47) ________ bring together hardware and software in a physical unit that is not only fast but
also scalable on an as-needed basis.
Answer: Appliances
Diff: 2 Page Ref: 553

48) Big Data employs ________ processing techniques and nonrelational data storage
capabilities in order to process unstructured and semistructured data.
Answer: parallel
Diff: 2 Page Ref: 556

49) In the world of Big Data, ________ aids organizations in processing and analyzing large
volumes of multi-structured data. Examples include indexing and search, graph analysis, etc.
Answer: MapReduce
Diff: 2 Page Ref: 558

50) The ________ Node in a Hadoop cluster provides client information on where in the cluster
particular data is stored and if any nodes fail.
Answer: Name
Diff: 2 Page Ref: 559

51) A job ________ is a node in a Hadoop cluster that initiates and coordinates MapReduce jobs,
or the processing of the data.
Answer: tracker
Diff: 2 Page Ref: 559

52) HBase is a nonrelational ________ that allows for low-latency, quick lookups in Hadoop.
Answer: database
Diff: 2 Page Ref: 560

53) Hadoop is primarily a(n) ________ file system and lacks capabilities we'd associate with a
DBMS, such as indexing, random access to data, and support for SQL.
Answer: distributed
Diff: 2 Page Ref: 561

54) HBase, Cassandra, MongoDB, and Accumulo are examples of ________ databases.
Answer: NoSQL
Diff: 2 Page Ref: 562

55) In the eBay use case study, load ________ helped the company meet its Big Data needs with
the extremely fast data handling and application availability requirements.
Answer: balancing
Diff: 2 Page Ref: 563

56) As volumes of Big Data arrive from multiple sources such as sensors, machines, social
media, and clickstream interactions, the first step is to ________ all the data reliably and cost
effectively.
Answer: capture
Diff: 2 Page Ref: 570

57) In open-source databases, the most important performance enhancement to date is the cost-
based ________.
Answer: optimizer
Diff: 2 Page Ref: 571

58) Data ________ or pulling of data from multiple subject areas and numerous applications into
one repository is the raison d'être for data warehouses.
Answer: integration
Diff: 2 Page Ref: 572

59) In the energy industry, ________ grids are one of the most impactful applications of stream
analytics.
Answer: smart
Diff: 2 Page Ref: 582

60) In the U.S. telecommunications company case study, the use of analytics via dashboards has
helped to improve the effectiveness of the company's ________ assessments and to make their
systems more secure.
Answer: threat
Diff: 2 Page Ref: 586

61) In the opening vignette, what is the source of the Big Data collected at the European
Organization for Nuclear Research or CERN?
Answer: Forty million times per second, particles collide within the LHC, each collision
generating particles that often decay in complex ways into even more particles. Precise electronic
circuits all around the LHC record the passage of each particle via a detector as a series of
electronic signals, and send the data to the CERN Data Centre (DC) for recording and digital
reconstruction. The digitized summary of data is recorded as a "collision event." About 15
petabytes of digitized summary data are produced annually, and this is processed by physicists
to determine if the collisions have thrown up any interesting physics.
Diff: 2 Page Ref: 543

62) List and describe the three main "V"s that characterize Big Data.
Answer:
• Volume: This is obviously the most common trait of Big Data. Many factors contributed to
the exponential increase in data volume, such as transaction-based data stored through the years,
text data constantly streaming in from social media, increasing amounts of sensor data being
collected, automatically generated RFID and GPS data, and so forth.
• Variety: Data today comes in all types of formats, ranging from traditional databases to
hierarchical data stores created by the end users and OLAP systems, to text documents, e-mail,
XML, meter-collected and sensor-captured data, to video, audio, and stock ticker data. By some
estimates, 80 to 85 percent of all organizations' data is in some sort of unstructured or
semistructured format.
• Velocity: This refers to both how fast data is being produced and how fast the data must be
processed (i.e., captured, stored, and analyzed) to meet the need or demand. RFID tags,
automated sensors, GPS devices, and smart meters are driving an increasing need to deal with
torrents of data in near-real time.
Diff: 2 Page Ref: 547-549

63) List and describe four of the most critical success factors for Big Data analytics.
Answer:
• A clear business need (alignment with the vision and the strategy). Business investments
ought to be made for the good of the business, not for the sake of mere technology
advancements. Therefore the main driver for Big Data analytics should be the needs of the
business at any level: strategic, tactical, and operational.
• Strong, committed sponsorship (executive champion). It is a well-known fact that if you
don't have strong, committed executive sponsorship, it is difficult (if not impossible) to succeed.
If the scope is a single or a few analytical applications, the sponsorship can be at the
departmental level. However, if the target is enterprise-wide organizational transformation,
which is often the case for Big Data initiatives, sponsorship needs to be at the highest levels and
organization-wide.
• Alignment between the business and IT strategy. It is essential to make sure that the
analytics work is always supporting the business strategy, and not the other way around. Analytics
should play the enabling role in successful execution of the business strategy.
• A fact-based decision-making culture. In a fact-based decision-making culture, the numbers
rather than intuition, gut feeling, or supposition drive decision making. There is also a culture of
experimentation to see what works and doesn't. To create a fact-based decision-making culture,
senior management needs to do the following: recognize that some people can't or won't adjust;
be a vocal supporter; stress that outdated methods must be discontinued; ask to see what
analytics went into decisions; link incentives and compensation to desired behaviors.
• A strong data infrastructure. Data warehouses have provided the data infrastructure for
analytics. This infrastructure is changing and being enhanced in the Big Data era with new
technologies. Success requires marrying the old with the new for a holistic infrastructure that
works synergistically.
Diff: 2 Page Ref: 553

64) When considering Big Data projects and architecture, list and describe five challenges
designers should be mindful of in order to make the journey to analytics competency less
stressful.
Answer:
• Data volume: The ability to capture, store, and process the huge volume of data at an
acceptable speed so that the latest information is available to decision makers when they need it.
• Data integration: The ability to combine data that is not similar in structure or source and to
do so quickly and at reasonable cost.
• Processing capabilities: The ability to process the data quickly, as it is captured. The
traditional way of collecting and then processing the data may not work. In many situations data
needs to be analyzed as soon as it is captured to leverage the most value.
• Data governance: The ability to keep up with the security, privacy, ownership, and quality
issues of Big Data. As the volume, variety (format and source), and velocity of data change, so
should the capabilities of governance practices.
• Skills availability: Big Data is being harnessed with new tools and is being looked at in
different ways. There is a shortage of data scientists with the skills to do the job.
• Solution cost: Since Big Data has opened up a world of possible business improvements,
there is a great deal of experimentation and discovery taking place to determine the patterns that
matter and the insights that turn to value. To ensure a positive ROI on a Big Data project,
therefore, it is crucial to reduce the cost of the solutions used to find that value.
Diff: 3 Page Ref: 554

65) Define MapReduce.
Answer: As described by Dean and Ghemawat (2004), "MapReduce is a programming model
and an associated implementation for processing and generating large data sets. Programs written
in this functional style are automatically parallelized and executed on a large cluster of
commodity machines. This allows programmers without any experience with parallel and
distributed systems to easily utilize the resources of a large distributed system."
Diff: 2 Page Ref: 557-558
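
Illustration (not part of the original answer key): the programming model in the definition above can be sketched on a single machine as a word count. This is only a simulation of the map, shuffle, and reduce steps in plain Python; it does not use Hadoop or any real cluster API, and the function names and sample input splits are assumptions made for the example.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(mapped_pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine the values for each key into a single result."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    mapped = []
    # On a real cluster each split would be mapped on a different machine;
    # here the splits are simply processed one after another.
    for doc in documents:
        mapped.extend(map_phase(doc))
    return reduce_phase(shuffle_phase(mapped))


splits = ["big data big insights", "data at scale"]
counts = word_count(splits)
print(counts)
```

The programmer writes only the map and reduce logic; in an actual MapReduce implementation the framework, not the programmer, would take care of distributing the splits and grouping the intermediate pairs.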

66) What is NoSQL as used for Big Data? Describe its major downsides.
Answer:
• NoSQL is a new style of database that has emerged to, like Hadoop, process large volumes of
multi-structured data. However, whereas Hadoop is adept at supporting large-scale, batch-style
historical analysis, NoSQL databases are aimed, for the most part (though there are some
important exceptions), at serving up discrete data stored among large volumes of multi-
structured data to end-user and automated Big Data applications. This capability is sorely lacking
from relational database technology, which simply can't maintain needed application
performance levels at Big Data scale.
• The downside of most NoSQL databases today is that they trade ACID (atomicity,
consistency, isolation, durability) compliance for performance and scalability. Many also lack
mature management and monitoring tools.
Diff: 2 Page Ref: 562

67) What is a data scientist and what does the job involve?
Answer: A data scientist is a role or a job frequently associated with Big Data or data science. In
a very short time it has become one of the most sought-after roles in the marketplace. Currently,
data scientists' most basic skill is the ability to write code (in the latest Big Data
languages and platforms). A more enduring skill will be the ability to
communicate in a language that all their stakeholders understand, and to demonstrate the special
skills involved in storytelling with data, whether verbally, visually, or, ideally, both. Data
scientists use a combination of their business and technical skills to investigate Big Data, looking
for ways to improve current business analytics practices (from descriptive to predictive and
prescriptive) and hence to improve decisions for new business opportunities.
Diff: 2 Page Ref: 565

68) Why are some portions of tape backup workloads being redirected to Hadoop clusters today?
Answer:
• First, while it may appear inexpensive to store data on tape, the true cost comes with the
difficulty of retrieval. Not only is the data stored offline, requiring hours if not days to restore,
but tape cartridges themselves are also prone to degradation over time, making data loss a reality
and forcing companies to factor in those costs. To make matters worse, tape formats change
every couple of years, requiring organizations to either perform massive data migrations to the
newest tape format or risk the inability to restore data from obsolete tapes.
• Second, it has been shown that there is value in keeping historical data online and accessible.
As in the clickstream example, keeping raw data on a spinning disk for a longer duration makes
it easy for companies to revisit data when the context changes and new constraints need to be
applied. Searching thousands of disks with Hadoop is dramatically faster and easier than
spinning through hundreds of magnetic tapes. Additionally, as disk densities continue to double
every 18 months, it becomes economically feasible for organizations to hold many years' worth
of raw or refined data in HDFS.
Diff: 2 Page Ref: 571

69) What are the differences between stream analytics and perpetual analytics? When would you
use one or the other?
Answer:
• In many cases they are used synonymously. However, in the context of intelligent systems,
there is a difference. Streaming analytics involves applying transaction-level logic to real-time
observations. The rules applied to these observations take into account previous observations as
long as they occurred in the prescribed window; these windows have some arbitrary size (e.g.,
last 5 seconds, last 10,000 observations, etc.). Perpetual analytics, on the other hand, evaluates
every incoming observation against all prior observations, where there is no window size.
Recognizing how the new observation relates to all prior observations enables the discovery of
real-time insight.
• When transactional volumes are high and the time-to-decision is too short, favoring
nonpersistence and small window sizes, this translates into using streaming analytics. However,
when the mission is critical and transaction volumes can be managed in real time, then perpetual
analytics is a better answer.
Diff: 2 Page Ref: 582
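
Illustration (not part of the original answer key): the windowed versus no-window distinction in the answer to question 69 can be sketched as two small detectors. The rule used here (flag a value more than twice the baseline mean), the window size of 2, and the sample observations are assumptions chosen for the example, not something the text specifies.

```python
from collections import deque


class StreamingDetector:
    """Judges each observation against only the last `window_size` observations."""

    def __init__(self, window_size):
        self.window = deque(maxlen=window_size)  # older observations fall out

    def is_unusual(self, value, factor=2.0):
        baseline = sum(self.window) / len(self.window) if self.window else None
        self.window.append(value)
        return baseline is not None and value > factor * baseline


class PerpetualDetector:
    """Judges each observation against every prior observation (no window)."""

    def __init__(self):
        self.history = []  # everything seen so far is kept

    def is_unusual(self, value, factor=2.0):
        baseline = sum(self.history) / len(self.history) if self.history else None
        self.history.append(value)
        return baseline is not None and value > factor * baseline


observations = [10, 10, 10, 50, 50, 50, 90]
streaming = StreamingDetector(window_size=2)
perpetual = PerpetualDetector()
streaming_flags = [streaming.is_unusual(v) for v in observations]
perpetual_flags = [perpetual.is_unusual(v) for v in observations]
print(streaming_flags)
print(perpetual_flags)
```

On this sample the windowed detector flags only the first jump, because the small window quickly forgets the earlier low readings, while the no-window detector keeps comparing against the full history and flags later values as well. The cost is that the history grows without bound, which is why the answer ties perpetual analytics to transaction volumes that can be managed in real time.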

70) Describe data stream mining and how it is used.
Answer: Data stream mining, as an enabling technology for stream analytics, is the process of
extracting novel patterns and knowledge structures from continuous, rapid data records. A data
stream is a continuous flow of an ordered sequence of instances that in many applications of data
stream mining can be read/processed only once or a small number of times using limited
computing and storage capabilities. Examples of data streams include sensor data, computer
network traffic, phone conversations, ATM transactions, web searches, and financial data. Data
stream mining can be considered a subfield of data mining, machine learning, and knowledge
discovery. In many data stream mining applications, the goal is to predict the class or value of
new instances in the data stream given some knowledge about the class membership or values of
previous instances in the data stream.
Diff: 2 Page Ref: 583-584
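
Illustration (not part of the original answer key): the idea of predicting the class of each new instance from previously seen instances, while reading every instance only once with limited storage, can be sketched with a very simple one-pass classifier. The nearest-running-mean rule, the class labels, and the transaction amounts are assumptions made for this example; real data stream mining methods are considerably more sophisticated.

```python
class OnlineCentroidClassifier:
    """Keeps one running mean per class; each instance is read once and discarded."""

    def __init__(self):
        self.counts = {}
        self.means = {}

    def predict(self, x):
        """Return the class whose running mean is closest to x (None if untrained)."""
        if not self.means:
            return None
        return min(self.means, key=lambda label: abs(x - self.means[label]))

    def learn(self, x, label):
        """Update the running mean for `label` without storing the instance."""
        n = self.counts.get(label, 0) + 1
        mean = self.means.get(label, 0.0)
        self.counts[label] = n
        self.means[label] = mean + (x - mean) / n


def prequential_accuracy(stream):
    """Test-then-train: predict each instance first, then learn from its true label."""
    model = OnlineCentroidClassifier()
    correct = total = 0
    for x, label in stream:
        guess = model.predict(x)
        if guess is not None:
            total += 1
            correct += guess == label
        model.learn(x, label)
    return correct / total if total else 0.0


# Hypothetical stream of (transaction amount, class) pairs arriving in order.
stream = [(20.0, "normal"), (95.0, "fraud"), (25.0, "normal"),
          (90.0, "fraud"), (30.0, "normal")]
print(prequential_accuracy(stream))
```

The model stores only a count and a mean per class, so memory use stays constant no matter how long the stream runs, which mirrors the limited computing and storage constraint mentioned in the answer.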

Business Intelligence and Analytics: Systems for Decision Support, 10e (Sharda)
Chapter 11 Automated Decision Systems and Expert Systems

1) Rules used in automated decision systems (ADS) can be derived based on experience.
Answer: TRUE
Diff: 1 Page Ref: 471

2) Most business decision rules are the same across industries.
Answer: FALSE
Diff: 1 Page Ref: 471

3) Flight pricing systems are examples of semi-automated decision systems that require
managerial input for each decision.
Answer: FALSE
Diff: 2 Page Ref: 473

4) A revenue management (RM) system for an airline seeks to minimize each customer's ticket
price of travel on the airline's flights.
Answer: FALSE
Diff: 2 Page Ref: 474

5) Rule-based systems have their roots in artificial intelligence.
Answer: TRUE
Diff: 2 Page Ref: 475

6) Rich and Knight (1991) defined artificial intelligence as "the study of how to make people do
things at which, at the moment, computers are better."
Answer: FALSE
Diff: 2 Page Ref: 475

7) Expert systems (ES) are computer-based information systems that use expert knowledge to
attain high-level decision performance in a narrowly defined problem domain.
Answer: TRUE
Diff: 2 Page Ref: 477

8) A person's decision performance and level of knowledge are typical criteria that determine
their level of expertise in a particular subject.
Answer: TRUE
Diff: 2 Page Ref: 477

9) The basic rationale of artificial intelligence is to use mathematical calculation rather than
symbolic reasoning.
Answer: FALSE
Diff: 3 Page Ref: 478


10) While most first-generation Expert Systems (ES) use if-then rules to represent and store their
knowledge, second-generation ES are more flexible in adopting multiple knowledge
representation and reasoning methods.
Answer: TRUE
Diff: 2 Page Ref: 479

11) The case study on chemical, biological, and radiological agents shows that expert systems
are widely used in high pressure situations where the human decision makers are confident in
taking quick actions.
Answer: FALSE
Diff: 3 Page Ref: 481

12) A nonexpert uses the development environment of an expert system to obtain advice and to
solve problems using the expert knowledge embedded into the system.
Answer: FALSE
Diff: 2 Page Ref: 484

13) Knowledge acquisition from experts is a complex task that requires specialized expertise to
conduct successfully.
Answer: TRUE
Diff: 2 Page Ref: 484-485

14) The knowledge base in an expert system must correspond exactly to the format of the
knowledge base in the organization where it will be utilized.
Answer: FALSE
Diff: 3 Page Ref: 485

15) The inference engine, also known as the control structure or the rule interpreter (in rule-
based ES), is essentially a computer program that provides a methodology for reasoning about
information in the knowledge base and on the blackboard to formulate appropriate conclusions.
Answer: TRUE
Diff: 3 Page Ref: 485

16) The critical component of a knowledge refinement system is the self-learning mechanism
that allows it to adjust its knowledge base and its processing of knowledge based on the
evaluation of its recent past performances.
Answer: TRUE
Diff: 3 Page Ref: 486

17) Validation of knowledge is usually done by a human expert in the knowledge domain.
Answer: TRUE
Diff: 3 Page Ref: 488


18) Once validated, the knowledge acquired from experts or induced from a set of data must be
represented in a format that does not need to be understandable by humans but must be
executable on computers.
Answer: FALSE
Diff: 3 Page Ref: 490

19) Inference rules and knowledge rules are both used to solve problems in a rule-based expert
system.
Answer: TRUE
Diff: 2 Page Ref: 491

20) Unlike human experts, expert systems do not need to explain their views, recommendations,
or decisions.
Answer: FALSE
Diff: 2 Page Ref: 496

21) In the InterContinental Hotel Group case study, the mathematical model used to increase
profits was based on
A) a simulation model that tried out many options.
B) a system that collated the subjective inputs of managers.
C) a mathematical model that used two variables: price and day of the week.
D) an optimization model that used multiple variables.
Answer: D
Diff: 2 Page Ref: 470

22) Who are automated decision systems (ADS) primarily designed for?
A) strategic level managers making long-term, wide-ranging decisions
B) mid-level managers making tactical decisions
C) frontline workers who must make decisions rapidly
D) operational managers who make shop floor decisions
Answer: C
Diff: 2 Page Ref: 471-473

23) In the Giant Food Stores case study, the new pricing model deployment system included all
the following features EXCEPT
A) it required more staff to make pricing changes.
B) it could handle large numbers of price changes.
C) it used point of sale and competitive data as inputs.
D) it had a predictive capability.
Answer: A
Diff: 2 Page Ref: 472


24) Revenue management systems modify the prices of products and services dynamically based
on
A) intuition, demand, and supply.
B) intuition, competition, and supply.
C) business rules, demand, and supply.
D) business rules, supply, and intuition.
Answer: C
Diff: 2 Page Ref: 473

25) What would explain why a divorce attorney in New York City may not be considered an
expert in Beijing, China?
A) You need a greater level of experience in Beijing to practice law.
B) No criteria to evaluate divorce attorneys exist in Beijing.
C) The divorce attorney in New York does not speak Mandarin.
D) Expertise is frequently domain dependent.
Answer: D
Diff: 3 Page Ref: 477

26) What does self-knowledge in an expert system (ES) mean?
A) An ES understands itself in a very human sense.
B) An ES understands the human decision maker.
C) The ES can explain how it reached a conclusion.
D) The ES "knows" that it exists.
Answer: C
Diff: 2 Page Ref: 478

27) How does an expert system differ from conventional systems?
A) Changes in an expert system are tedious to make.
B) The expert system operates only when it is complete.
C) Expert systems handle qualitative data easily.
D) Execution of expert system programs is algorithmic or step-by-step.
Answer: C
Diff: 2 Page Ref: 479

28) In the sport talents identification case study, the expert system was calibrated with expertise
from
A) multiple sports experts.
B) one overall sports expert.
C) the system developer.
D) subjects in the cases used to create the ES.
Answer: D
Diff: 3 Page Ref: 480


29) In the chemical, biological, and radiological agents case study, the CBR Advisor program
had all the following features EXCEPT
A) it could provide advice even with incomplete information.
B) it was tailored to different types of users.
C) it was available online to the general public.
D) it was created by multiple experts.
Answer: C
Diff: 2 Page Ref: 481

30) The MYCIN Expert System was used to diagnose bacterial infections using
A) a simulation model that tried out many options.
B) a set of 500 rules on the subject.
C) an optimization model.
D) an expert system whose performance was inferior to human experts.
Answer: B
Diff: 2 Page Ref: 482

31) Which module is missing from most expert systems?
A) user interface subsystem
B) knowledge base subsystem
C) inference engine
D) knowledge refinement subsystem
Answer: D
Diff: 2 Page Ref: 484

32) All the following statements about how an expert system operates are true EXCEPT
A) incorporated knowledge is drawn exclusively from human experts.
B) a knowledge engineer creates inferencing rules.
C) knowledge rules are stored in the knowledge base.
D) inference engines contain an explanation subcomponent.
Answer: D
Diff: 2 Page Ref: 484

33) In the heart disease diagnosis case study, what was a benefit of the SIPMES expert system?
A) Expert systems from other domains were used, saving development time.
B) The SIPMES system agreed with human experts 64% of the time.
C) The SIPMES system could diagnose all types of cardiovascular diseases.
D) No human expert knowledge was needed in development, only textbook knowledge.
Answer: C
Diff: 2 Page Ref: 487


34) Which of the following is NOT a stage of knowledge engineering?
A) knowledge consolidation
B) knowledge representation
C) knowledge acquisition
D) knowledge validation
Answer: A
Diff: 2 Page Ref: 488

35) It is difficult to acquire knowledge from experts for all the following reasons EXCEPT
A) experts may not be able to put into words how they conduct their work.
B) testing and refining of knowledge is complex and difficult.
C) many business areas have no identifiable experts.
D) experts often change their behavior when observed.
Answer: C
Diff: 2 Page Ref: 489

36) Using certainty factors, a rule declares that IF competition is strong, CF = 70 AND margins
are above 15% CF = 100 THEN sales demand will decline. If both conditions are true, what is
the CF of the conclusion?
A) 100%
B) 70%
C) 30%
D) 21%
Answer: B
Diff: 2 Page Ref: 494-495

37) Using certainty factors, a rule declares that IF competition is strong, CF = 70 OR margins are
above 15% CF = 100 THEN sales demand will decline. If both conditions are true, what is the
CF of the conclusion?
A) 100%
B) 70%
C) 30%
D) 21%
Answer: A
Diff: 2 Page Ref: 494-495
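
The two certainty-factor questions above follow the usual convention that an AND rule takes the minimum of its condition CFs and an OR rule takes the maximum. A minimal Python sketch of that arithmetic is given here; the function and variable names are hypothetical, chosen only for this example.

```python
def cf_and(*factors):
    """Conclusion CF when ALL conditions must hold: the weakest one governs."""
    return min(factors)


def cf_or(*factors):
    """Conclusion CF when ANY condition suffices: the strongest one governs."""
    return max(factors)


competition_is_strong = 70   # CF values taken from the two questions
margins_above_15_pct = 100

and_result = cf_and(competition_is_strong, margins_above_15_pct)
or_result = cf_or(competition_is_strong, margins_above_15_pct)
print(and_result, or_result)  # prints: 70 100
```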

38) Which category of expert systems includes weather forecasting and economic/financial
forecasting?
A) diagnostic ES
B) planning ES
C) instruction ES
D) prediction ES
Answer: D
Diff: 2 Page Ref: 498


39) Which tool would be best to use when there is a need to very rapidly and cheaply develop a
rule-based expert system?
A) LISP or Prolog languages
B) ASP.NET
C) an ES shell
D) C++
Answer: C
Diff: 2 Page Ref: 500

40) In the Clinical Decision Support System case study, what was the system's output?
A) a diagnosis of the type of tendon injury suffered by the patient
B) a treatment and rehabilitation plan for the patient
C) an explanation of the tendon anatomy of the patient
D) a referral to specialists who could accurately diagnose the tendon injury
Answer: B
Diff: 3 Page Ref: 501-502

41) Rules derived from data ________ can be used effectively in automated decision systems.
Answer: mining
Diff: 1 Page Ref: 471

42) ________ is a collection of concepts and ideas that are related to the development of
intelligent systems.
Answer: Artificial intelligence (AI)
Diff: 1 Page Ref: 475

43) Expert systems mimic the reasoning process of ________ experts in order to solve problems.
Answer: human
Diff: 1 Page Ref: 477

44) The accumulation, transfer, and transformation of problem-solving expertise from experts or
documented knowledge sources to a computer program for constructing or expanding the
knowledge base is known as ________.
Answer: knowledge acquisition
Diff: 2 Page Ref: 484-485

45) The ability of human experts to analyze their own knowledge and its effectiveness, learn
from it, and improve on it for future consultations is known as a ________.
Answer: knowledge-refining system
Diff: 2 Page Ref: 486

46) The knowledge possessed by human experts is often lacking in ________ and not explicitly
expressed.
Answer: structure
Diff: 2 Page Ref: 488


47) ________ is a collection of specialized facts, procedures, and judgment usually expressed as
rules.
Answer: Knowledge
Diff: 2 Page Ref: 488

48) Knowledge rules, or ________ rules, state all the facts and relationships about a problem.
Answer: declarative
Diff: 2 Page Ref: 491

49) Inference rules, or ________ rules, offer advice on how to solve a problem, given that certain
facts are known.
Answer: procedural
Diff: 2 Page Ref: 491

50) ________ (or reasoning) is the process of using the rules in the knowledge base along with
the known facts to draw conclusions.
Answer: Inferencing
Diff: 2 Page Ref: 491

51) ________ chaining is a goal-driven approach in which you start from an expectation of what
is going to happen (i.e., hypothesis) and then seek evidence that supports (or contradicts) your
expectation.
Answer: Backward
Diff: 2 Page Ref: 491

52) ________ chaining is a data-driven approach in which we start from available information as
it becomes available or from a basic idea, and then we try to draw conclusions.
Answer: Forward
Diff: 2 Page Ref: 491
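
To make the difference between the two chaining strategies concrete, here is a small Python sketch of both over the same rule set. The rules, fact names, and function names are invented for illustration, and the backward chainer assumes the rules contain no cycles.

```python
# Hypothetical knowledge base: each rule is (set of premises, conclusion).
RULES = [
    ({"fever", "cough"}, "flu_suspected"),
    ({"flu_suspected", "short_of_breath"}, "see_doctor"),
]


def forward_chain(facts, rules):
    """Data-driven: fire rules on known facts until nothing new is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def backward_chain(goal, facts, rules):
    """Goal-driven: start from a hypothesis and look for supporting evidence.

    Assumes the rule set is acyclic; a cycle would cause endless recursion.
    """
    if goal in facts:
        return True
    for premises, conclusion in rules:
        if conclusion == goal and all(
            backward_chain(p, facts, rules) for p in premises
        ):
            return True
    return False


facts = {"fever", "cough", "short_of_breath"}
derived = forward_chain(facts, RULES)
supported = backward_chain("see_doctor", facts, RULES)
unsupported = backward_chain("see_doctor", {"fever"}, RULES)
print(sorted(derived), supported, unsupported)
```

Forward chaining returns everything that can be concluded from the data, while backward chaining answers a single yes/no question about one hypothesis.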

53) ________ express belief in an event (or a fact or a hypothesis) based on the expert's
assessment.
Answer: Certainty factors
Diff: 3 Page Ref: 494

54) An ________ infers situation descriptions from observations, and explains observed data by
assigning them symbolic meanings that describe the situation.
Answer: interpretation system
Diff: 2 Page Ref: 497

55) An ________ shell is a type of development tool that has built-in inference capabilities and a
user interface, and is specifically designed for ES development.
Answer: expert system
Diff: 2 Page Ref: 500


56) In the popular Corvid ES shell, ________ define the major factors considered in problem
solving.
Answer: variables
Diff: 3 Page Ref: 500

57) In the Corvid ES shell, ________ blocks are the decision rules acquired from experts.
Answer: logic
Diff: 2 Page Ref: 500

58) ________ blocks in the Corvid ES determine how the system interacts with the user,
including the order of execution and the user interface.
Answer: Command
Diff: 3 Page Ref: 500

59) After an ES is built, it must be evaluated in a two-step process. The first step,
________, ensures that the resulting knowledge base contains knowledge exactly the same as
that acquired from the expert.
Answer: verification
Diff: 3 Page Ref: 501

60) After an ES is built, it must be evaluated in a two-step process. The second step,
________, ensures that the system can solve the problem correctly.
Answer: validation
Diff: 3 Page Ref: 501

61) A relatively new approach to supporting decision making is called automated decision
systems (ADS), sometimes also known as decision automation systems (DAS). Give a definition
of an ADS/DAS in simple terms.
Answer: In simple terms, an ADS is a rule-based system that provides a solution, usually in one
functional area, to a specific repetitive managerial problem, usually in one industry.
Diff: 2 Page Ref: 471

62) What are the various components of an airline revenue management system? Describe the
function of each one.
Answer:
• The pricing and accounting system: This handles ticket data, published fares, and pricing
rules.
• The aircraft scheduling system: This handles flight schedules based on customer demand.
• The inventory management system: This handles bookings, cancellations, and changes in
departure data.
Diff: 3 Page Ref: 474


63) Describe, with examples, the two basic ideas with which most experts agree artificial
intelligence (AI) is concerned.
Answer:
• The study of human thought processes (to understand what intelligence is)
• The representation and duplication of those thought processes in machines (e.g., computers,
robots)
Diff: 2 Page Ref: 475

64) List five disciplines of artificial intelligence.
Answer:
• Philosophy
• Human Behavior
• Neurology
• Logic
• Sociology
• Psychology
• Human Cognition
• Linguistics
• Biology
• Pattern Recognition
• Statistics
• Information Systems
• Robotics
• Management Science
• Engineering
• Computer Science
• Mathematics
Diff: 2 Page Ref: 476

65) List five applications of artificial intelligence.
Answer:
• Expert Systems
• Game Playing
• Computer Vision
• Automatic Programming
• Speech Understanding
• Autonomous Robots
• Intelligent Tutoring
• Intelligent Agents
• Natural Language Processing
• Voice Recognition
• Neural Networks
• Genetic Algorithms
• Fuzzy Logic
• Machine Learning
Diff: 2 Page Ref: 476


66) Describe the Turing test for determining whether a computer exhibits intelligent behavior.
Answer: According to this test, a computer can be considered smart only when a human
interviewer cannot identify the computer while conversing with both an unseen human being and
an unseen computer.
Diff: 2 Page Ref: 477

67) What are three components that may be included in an expert system in addition to the three
major components found in virtually all expert systems?
Answer:
• Knowledge acquisition subsystem
• Blackboard (workplace)
• Explanation subsystem (justifier)
• Knowledge-refining system
Diff: 2 Page Ref: 484

68) What is knowledge engineering?
Answer: Knowledge engineering is the collection of intensive activities encompassing the
acquisition of knowledge from human experts (and other information sources) and conversion of
this knowledge into a repository (commonly called a knowledge base).
Diff: 2 Page Ref: 487

69) Name and describe three problem areas suitable for expert systems.
Answer:
• Interpretation: Inferring situation descriptions from observations.
• Prediction: Inferring likely consequences of given situations.
• Diagnosis: Inferring system malfunctions from observations.
• Design: Configuring objects under constraints.
• Planning: Developing plans to achieve goals.
• Monitoring: Comparing observations to plans and flagging exceptions.
• Debugging: Prescribing remedies for malfunctions.
• Repair: Executing a plan to administer a prescribed remedy.
• Instruction: Diagnosing, debugging, and correcting student performance.
• Control: Interpreting, predicting, repairing, and monitoring system behaviors.
Diff: 2 Page Ref: 497

70) The development of expert systems is often described as a tedious process. What activities
does it typically include?
Answer:
• Identifying proper experts
• Acquiring knowledge
• Selecting the building tools
• Coding the system
• Evaluating the system
Diff: 2 Page Ref: 498



Business Intelligence and Analytics: Systems for Decision Support, 10e (Sharda)
Chapter 9 Model-Based Decision Making: Optimization and Multi-Criteria Systems

1) Modeling can be viewed as a science in its entirety.
Answer: FALSE
Diff: 2 Page Ref: 392

2) In the Midwest ISO opening vignette, the solution provided by the model's output determined
the best output level to be produced by each power plant.
Answer: TRUE
Diff: 2 Page Ref: 393

3) If linear programming can be successfully applied to a problem, the output is usually optimal.
Answer: TRUE
Diff: 2 Page Ref: 394

4) In the ExxonMobil case study, the approach taken was to find individual solutions to routing,
transportation, scheduling, and inventory management, and select the best solution for one of the
variables.
Answer: FALSE
Diff: 3 Page Ref: 395

5) In order to be effective, analysts must use models to solve problems with no regard to the
organizational culture to find optimal results.
Answer: FALSE
Diff: 2 Page Ref: 396

6) In the Harrah's Cherokee Casino and Hotel case study, the revenue management system
modified room prices based on demand and offered the same price/availability to all customers at
any one time.
Answer: FALSE
Diff: 3 Page Ref: 397

7) AHP can be used effectively for optimization with problems containing a small number of
alternatives.
Answer: TRUE
Diff: 2 Page Ref: 398

8) The trend is towards developing and using Web tools and software to access and run modeling
software.
Answer: TRUE
Diff: 1 Page Ref: 399

9) Using data cubes in OLAP systems opens the data up to analysis by more classes of models.
Answer: FALSE
Diff: 3 Page Ref: 399


10) Another name for result variables is independent variables.
Answer: FALSE
Diff: 2 Page Ref: 400

11) Taking a decision under risk is different from taking the decision under uncertainty.
Answer: TRUE
Diff: 2 Page Ref: 402

12) Spreadsheets are the second most popular tool for modeling.
Answer: FALSE
Diff: 1 Page Ref: 404

13) Linear programming seeks to optimally allocate resources among competing activities and is
likely the best known optimization model.
Answer: TRUE
Diff: 2 Page Ref: 408

14) When using Excel's Solver, we can have multiple constraints and multiple objective cells.
Answer: FALSE
Diff: 3 Page Ref: 410

15) Most managerial problems can be properly evaluated and solved using a single goal, such as
profit maximization.
Answer: FALSE
Diff: 3 Page Ref: 416

16) Sensitivity analysis seeks to assess the impact of changes in the input data and parameters on
the proposed solution.
Answer: TRUE
Diff: 2 Page Ref: 417

17) Goal seeking is roughly the opposite of "what-if" analysis.
Answer: TRUE
Diff: 2 Page Ref: 418

18) Using expected value (EV) with decision trees is totally appropriate for situations where one
outcome could lead to an immense loss for the company.
Answer: FALSE
Diff: 2 Page Ref: 421-422
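
A small numeric sketch can show why expected value alone is a poor guide when one outcome is ruinous. The two alternatives, their probabilities, and their payoffs below are hypothetical figures chosen for illustration.

```python
# Hypothetical alternatives: lists of (probability, payoff) pairs.
SAFE = [(1.0, 50_000)]
RISKY = [(0.9, 200_000), (0.1, -900_000)]


def expected_value(outcomes):
    """Probability-weighted average payoff of an alternative."""
    return sum(p * payoff for p, payoff in outcomes)


def worst_case(outcomes):
    """The single worst payoff the alternative can produce."""
    return min(payoff for _, payoff in outcomes)


for name, outcomes in (("safe", SAFE), ("risky", RISKY)):
    print(name, expected_value(outcomes), worst_case(outcomes))
```

The risky alternative has the higher expected value, yet one time in ten it produces a loss that a company might not survive, which is the situation the question describes.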

19) In the U.S. HUD case study, the use of AHP brought standards and coherence to project
selection, resulting in a 10% decrease in project requests from 1999 levels.
Answer: FALSE
Diff: 2 Page Ref: 423


20) The analytic hierarchy process incorporates both qualitative and quantitative decision
making criteria.
Answer: TRUE
Diff: 2 Page Ref: 423

21) Using modeling for decision support can currently achieve all of the following EXCEPT
A) enhance the decision making process.
B) enable organizations to see likely results of their decisions.
C) replace strategy formulation at top levels of the organization.
D) reduce the costs of providing services to customers.
Answer: C
Diff: 3 Page Ref: 394

22) Environmental scanning is important for all of the following reasons EXCEPT
A) organizational culture is important and affects the model use.
B) it is critical to identify key corporate decision makers.
C) environmental factors may have created the current problem.
D) environments have greater impact on a model than the organization does.
Answer: D
Diff: 2 Page Ref: 396

23) Today, it is critical for companies to consider
A) how to get products to the right customer.
B) how to sell products at the right price.
C) how to package products in the right format.
D) all of the above
Answer: D
Diff: 3 Page Ref: 397

24) Models can be built with the help of human knowledge and expertise. Another source of help
in building these models is
A) the customer.
B) classification and clustering methods.
C) customer service reps.
D) business partners.
Answer: B
Diff: 2 Page Ref: 398

25) What is an influence diagram?
A) a diagram showing the influence of decision makers
B) a graphical representation of a model
C) a map of the environment around decision makers
D) a map of the environment around a model
Answer: B
Diff: 2 Page Ref: 399


26) Spreadsheets are particularly useful for all of the following reasons EXCEPT
A) they are able to import and export to many different file formats.
B) it is easy to manipulate data and see results instantly.
C) they can be used to build static and dynamic models.
D) they easily import and manipulate massive databases.
Answer: D
Diff: 3 Page Ref: 405

27) Linear programming belongs to a family of tools called
A) decision tree models.
B) qualitative models.
C) mathematical programming models.
D) heuristic programming models.
Answer: C
Diff: 2 Page Ref: 407

28) Which of the following is NOT a component of a linear programming problem?
A) internal metrics
B) constraints
C) objective function
D) decision variables
Answer: A
Diff: 2 Page Ref: 408

29) In an LP model, what does the fourth hidden component contain?
A) product mix variables
B) slack and surplus variables
C) financial and accounting variables
D) constraint and limit variables
Answer: B
Diff: 3 Page Ref: 409
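
The LP components named in the preceding questions (decision variables, objective function, constraints, and slack) can be seen together in a tiny product-mix example. This is a sketch under stated assumptions: the coefficients are invented, and corner-point enumeration is a swapped-in technique used only because it fits in a few lines for two variables; it is not the algorithm a spreadsheet solver would use.

```python
from itertools import combinations

# Hypothetical product-mix problem (all numbers are illustrative):
# maximize 3x + 5y  subject to  x <= 4,  2y <= 12,  3x + 2y <= 18,  x, y >= 0
OBJECTIVE = (3, 5)
# Each constraint is (a, b, limit), meaning a*x + b*y <= limit.
CONSTRAINTS = [(1, 0, 4), (0, 2, 12), (3, 2, 18), (-1, 0, 0), (0, -1, 0)]


def corner_points(constraints, tol=1e-9):
    """Intersect every pair of constraint lines and keep the feasible points."""
    points = []
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < tol:
            continue  # parallel lines never meet
        x = (c1 * b2 - c2 * b1) / det
        y = (a1 * c2 - a2 * c1) / det
        if all(a * x + b * y <= c + tol for a, b, c in constraints):
            points.append((x, y))
    return points


def solve(objective, constraints):
    """Return the feasible corner point with the highest objective value."""
    px, py = objective
    return max(corner_points(constraints), key=lambda p: px * p[0] + py * p[1])


best_x, best_y = solve(OBJECTIVE, CONSTRAINTS)
best_value = OBJECTIVE[0] * best_x + OBJECTIVE[1] * best_y
# Slack: the unused amount of each resource at the chosen solution.
slack = [limit - (a * best_x + b * best_y) for a, b, limit in CONSTRAINTS[:3]]
print(best_x, best_y, best_value, slack)
```

A slack of zero marks a binding constraint, meaning that resource is fully used at the optimum; a positive slack marks spare capacity.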

30) Managers in organizations typically have
A) single goals that can be optimized using linear and nonlinear programming.
B) a small number of goals that can be independently optimized using linear and nonlinear
programming.
C) single goals that cannot be optimized using linear and nonlinear programming.
D) multiple goals that need to be simultaneously or jointly optimized.
Answer: D
Diff: 3 Page Ref: 416


31) Sensitivity analysis is important in management support systems for all of the following
reasons EXCEPT
A) it allows flexibility and adaptation to changing conditions.
B) it permits the manager to input data to increase his/her confidence in the model.
C) it improves the mathematical optimality of the generated solutions.
D) it provides a better understanding of the model and the decision-making situation.
Answer: C
Diff: 3 Page Ref: 417

32) The question "What will total earnings be if we reduce our inventory stocking costs by
10%?" is a type of
A) goal-seeking analysis.
B) what-if analysis.
C) sensitivity analysis.
D) utility modeling.
Answer: B
Diff: 2 Page Ref: 418

33) The question "What advertising budget is needed to increase market share by 7%?" is a type
of
A) goal-seeking analysis.
B) what-if analysis.
C) sensitivity analysis.
D) utility modeling.
Answer: A
Diff: 2 Page Ref: 418

34) The question "How many servers will be needed to reduce the waiting time of restaurant
customers to less than 9 minutes?" is a type of
A) goal-seeking analysis.
B) what-if analysis.
C) sensitivity analysis.
D) utility modeling.
Answer: A
Diff: 2 Page Ref: 418
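
The distinction tested in the last three questions is the direction of the calculation: what-if analysis runs the model forward from an input, whereas goal seeking searches backward for the input that gives a desired output. The Python sketch below illustrates both; the profit model, its numbers, and the bisection search are assumptions made for the example.

```python
def profit(ad_budget):
    """Hypothetical model: each advertising dollar returns less than the last."""
    revenue = 1000 * ad_budget ** 0.5
    return revenue - ad_budget


def what_if(model, value):
    """What-if: change an input and observe the output."""
    return model(value)


def goal_seek(model, target, low, high, tol=1e-6):
    """Goal seeking: find the input that yields a target output (bisection).

    Assumes the model is increasing on [low, high] and the target is reachable.
    """
    while high - low > tol:
        mid = (low + high) / 2
        if model(mid) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2


profit_at_10000 = what_if(profit, 10_000)
budget_for_50000 = goal_seek(profit, 50_000, low=0, high=100_000)
print(profit_at_10000, round(budget_for_50000, 2))
```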

35) Decision trees are best suited to solve what types of problems?
A) problems with a large number of alternatives
B) problems with a tabular representation
C) problems where probabilities are unknown
D) problems with a single goal
Answer: D
Diff: 2 Page Ref: 420


36) In handling uncertainty in decision modeling, the optimistic approach assumes
A) the best possible outcome of most alternatives will occur.
B) the best possible outcome of some alternatives will occur.
C) the best possible outcome of each alternative will occur.
D) the best possible outcome of one alternative will occur.
Answer: C
Diff: 2 Page Ref: 421

37) In handling uncertainty in decision modeling, what does the pessimistic approach do?
A) It assumes the worst possible outcome of one alternative will occur and then avoids it.
B) It assumes the worst possible outcome of some alternatives will occur and then selects the
best of them.
C) It assumes the worst possible outcome of each alternative will occur and then selects the
worst of them.
D) It assumes the worst possible outcome of each alternative will occur and then selects the best
of them.
Answer: D
Diff: 3 Page Ref: 421
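
The optimistic and pessimistic approaches in the two questions above are commonly called maximax and maximin. A minimal sketch over a hypothetical payoff table (the alternatives and figures are invented) looks like this:

```python
# Hypothetical payoff table: alternative -> payoffs under each state of nature.
PAYOFFS = {
    "bonds": [12, 6, 3],
    "stocks": [15, 3, -2],
    "deposit": [6, 6, 6],
}


def optimistic_choice(payoffs):
    """Maximax: assume the best outcome of each alternative, pick the best."""
    return max(payoffs, key=lambda alt: max(payoffs[alt]))


def pessimistic_choice(payoffs):
    """Maximin: assume the worst outcome of each alternative, pick the best."""
    return max(payoffs, key=lambda alt: min(payoffs[alt]))


print(optimistic_choice(PAYOFFS), pessimistic_choice(PAYOFFS))  # prints: stocks deposit
```

Neither rule uses probabilities, which is what makes them usable under uncertainty rather than risk.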

38) Which of the following statements about expected utility is true?
A) It does not affect decisions made with expected values.
B) Used in decision making, it is an objective value, not subjective.
C) Used in decision making, it can bring huge risk to a small startup with limited resources.
D) In calculating utility, it assumes the decision will be made thousands of times, making the
probabilities more likely on average.
Answer: C
Diff: 2 Page Ref: 422

39) Which of the following statements about the analytic hierarchy process (AHP) is true?
A) It is really not a decision model at all.
B) It can handle multiple criteria and goals.
C) It is based entirely on quantitative data.
D) It is an opaque "black box" in the same way as neural networks.
Answer: B
Diff: 2 Page Ref: 423
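
To show how AHP turns judgments about multiple criteria into numeric priorities, the sketch below derives weights from a pairwise comparison matrix. The criteria and comparison values are hypothetical, and the normalized row geometric mean is used as a common approximation; a full AHP treatment would also compute the principal eigenvector and a consistency ratio, which are omitted here.

```python
import math

# Hypothetical pairwise comparisons on a 1-9 scale for three criteria.
# COMPARISONS[i][j] states how much more important criterion i is than j.
CRITERIA = ["cost", "quality", "delivery"]
COMPARISONS = [
    [1, 3, 5],
    [1 / 3, 1, 2],
    [1 / 5, 1 / 2, 1],
]


def priority_weights(matrix):
    """Approximate AHP priorities with normalized row geometric means."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


weights = priority_weights(COMPARISONS)
ranking = sorted(zip(CRITERIA, weights), key=lambda cw: cw[1], reverse=True)
print(ranking)
```

The comparison values can encode qualitative judgments ("cost matters moderately more than quality") as well as measured quantities, which is how AHP mixes both kinds of criteria.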

40) Which of the following statements about the end-of-chapter CARE International case study
is true?
A) CARE ran its own shipping operation with vehicles that needed route optimization.
B) CARE used a linear programming model for optimization.
C) CARE's objective was to respond to natural disasters faster.
D) CARE set out to exclusively use international suppliers with large capacity to better serve
people affected by disasters.
Answer: C
Diff: 2 Page Ref: 433


41) In the opening vignette, Midwest ISO used optimization ________ in their problem solving.
Answer: algorithms/models
Diff: 2 Page Ref: 393

42) Identifying a model's ________ (e.g., decision and result) and their relationships is very
important in creating and using models.
Answer: variables
Diff: 2 Page Ref: 396

43) ________ models are used to predict the future and are used widely in e-commerce.
Answer: Forecasting
Diff: 1 Page Ref: 396

44) ________ modeling uses rules to determine solutions that are good enough.
Answer: Heuristic
Diff: 2 Page Ref: 398

45) In non-quantitative models, the relationships are symbolic or ________.
Answer: qualitative
Diff: 2 Page Ref: 399

46) In decision-making, fixed factors that affect the result variables but are not manipulated by
the decision maker are called ________ variables.
Answer: uncontrollable
Diff: 2 Page Ref: 401

47) Deciding to purchase an FDIC-insured Certificate of Deposit at a U.S. bank can be viewed as
decision making under ________.
Answer: certainty
Diff: 2 Page Ref: 402

48) In the American Airlines case study, the modeling used for contract bidding could best be
described as decision making under ________.
Answer: risk
Diff: 2 Page Ref: 403

49) In the Fred Astaire East Side Dance Studio case study, a(n) ________ model was used to
organize ballroom showcases and arrange participants and timeslots accordingly.
Answer: scheduling
Diff: 2 Page Ref: 404-405

50) In comparison to static models, ________ models represent behavior over time.
Answer: dynamic
Diff: 2 Page Ref: 405


51) In the Fletcher Allen Health Care case, the ________ engine in Excel was used to find a
feasible solution to the assignment problem.
Answer: Solver
Diff: 2 Page Ref: 407

52) In mathematical programming, of available solutions, the ________ solution is the best; i.e.,
the degree of goal attainment associated with it is the highest.
Answer: optimal
Diff: 2 Page Ref: 408

53) Most quantitative models are based on solving for a single ________.
Answer: goal/objective
Diff: 2 Page Ref: 417

54) Testing the robustness of decisions under changing conditions is an example of ________
analysis.
Answer: sensitivity
Diff: 2 Page Ref: 417

55) Utility ________ is a modeling method for handling multiple goals.
Answer: theory
Diff: 2 Page Ref: 417

56) ________ seeking calculates the values of the inputs necessary to achieve a desired level of
an output.
Answer: Goal
Diff: 2 Page Ref: 418

57) ________ tables conveniently organize information and knowledge in a systematic, tabular
manner to prepare it for analysis and consideration of alternatives.
Answer: Decision
Diff: 2 Page Ref: 420

58) A decision tree can be cumbersome if there are many ________ or states of nature.
Answer: alternatives/choices
Diff: 2 Page Ref: 422

59) The analytic hierarchy process can be used to great effect to solve multi-________ problems.
Answer: criteria
Diff: 2 Page Ref: 423

60) In the end-of-chapter CARE International case study, an optimization model was used to
decide on warehouse ________.
Answer: location
Diff: 2 Page Ref: 433

61) Can customer relationship management (CRM) systems and revenue management systems
(RMS) recommend not selling a particular product to certain customers? If so, why; if not, why
not?
Answer: Yes, CRM and RMS can recommend ignoring certain customers or not selling a bundle
of products to a particular set of customers. Part of this effort involves identifying lifelong
customer profitability. These approaches rely heavily on forecasting techniques, which are
typically described as predictive analytics. These systems attempt to predict who their best (i.e.,
most profitable) customers (and worst ones as well) are and focus on identifying products and
services–or none at all–at appropriate prices to appeal to them.
Diff: 3 Page Ref: 397

62) List and describe four categories of models. Give examples in each category.
Answer:
• Optimization of problems with few alternatives: Find the best solution from a small
number of alternatives; e.g., decision tables, decision trees, analytic hierarchy process
• Optimization via algorithm: Find the best solution from a large number of alternatives,
using a step-by-step improvement process; e.g., linear and other mathematical programming
models, network models
• Optimization via an analytic formula: Find the best solution in one step, using a formula;
e.g., some inventory models
• Simulation: Find a good enough solution or the best among the alternatives checked, using
experimentation; e.g., Monte Carlo simulation
• Heuristics: Find a good enough solution, using rules; e.g., heuristic programming, expert
systems
• Predictive models: Predict the future for a given scenario; e.g., forecasting models, Markov
analysis
Diff: 3 Page Ref: 398

63) All quantitative models are typically made up of four basic components. List and describe
them as well as what links them together.
Answer:
1. Result (outcome) variables reflect the level of effectiveness of a system; that is, they
indicate how well the system performs or attains its goal(s). These variables are outputs.
2. Decision variables describe alternative courses of action. The decision maker controls the
decision variables.
3. Uncontrollable variables or parameters are factors that affect the result variables but are
not under the control of the decision maker. Either these factors can be fixed, in which case they
are called parameters, or they can vary, in which case they are called variables.
4. Intermediate result variables reflect intermediate outcomes in mathematical models.
Mathematical relationships link these components together.
Diff: 3 Page Ref: 399-401
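
The four components can be seen together in a small sketch. Every number, the linear demand curve, and the function names are hypothetical assumptions made for illustration; the point is only to show which quantity plays which role and how mathematical relationships tie them together.

```python
# Uncontrollable parameters: fixed factors the decision maker cannot change.
UNIT_COST = 12.0       # hypothetical cost per unit
FIXED_COST = 1_000.0   # hypothetical fixed overhead


def demand(price):
    """Hypothetical linear demand curve (an uncontrollable relationship)."""
    return max(0.0, 1_000.0 - 40.0 * price)


def evaluate(price):
    """Map the decision variable (price) to the result variable (profit)."""
    units = demand(price)                        # intermediate result variable
    revenue = price * units                      # intermediate result variable
    total_cost = FIXED_COST + UNIT_COST * units  # intermediate result variable
    return revenue - total_cost                  # result (outcome) variable


# Compare a few alternative values of the decision variable.
profits = {price: evaluate(price) for price in (15.0, 18.0, 21.0)}
best_price = max(profits, key=profits.get)
print(best_price, profits[best_price])
```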

64) Compare and contrast decision making under uncertainty, risk and certainty.
Answer:
• In decision making under certainty, it is assumed that complete knowledge is available so
that the decision maker knows exactly what the outcome of each course of action will be (as in a
deterministic environment).
• In decision making under uncertainty, the decision maker considers situations in which
several outcomes are possible for each course of action. In contrast to the risk situation, in this
case, the decision maker does not know, or cannot estimate, the probability of occurrence of the
possible outcomes. Decision making under uncertainty is more difficult than decision making
under certainty because there is insufficient information.
• In decision making under risk (also known as a probabilistic or stochastic decision making
situation), the decision maker must consider several possible outcomes for each alternative, each
with a given probability of occurrence.
Diff: 2 Page Ref: 402

65) List four rational economic assumptions the linear programming allocation model is based
upon.
Answer:
• Returns from different allocations can be compared; that is, they can be measured by a
common unit (e.g., dollars, utility).
• The return from any allocation is independent of other allocations.
• The total return is the sum of the returns yielded by the different activities.
• All data are known with certainty.
• The resources are to be used in the most economical manner.
Diff: 2 Page Ref: 408

66) List four difficulties that may arise when analyzing multiple goals.
Answer:
• It is usually difficult to obtain an explicit statement of the organization's goals.
• The decision maker may change the importance assigned to specific goals over time or for
different decision scenarios.
• Goals and sub-goals are viewed differently at various levels of the organization and within
different departments.
• Goals change in response to changes in the organization and its environment.
• The relationship between alternatives and their role in determining goals may be difficult to
quantify.
• Complex problems are solved by groups of decision makers, each of whom has a personal
agenda.
• Participants assess the importance (priorities) of the various goals differently.
Diff: 2 Page Ref: 417

67) List four things sensitivity analyses are used for.
Answer:
• Revising models to eliminate too-large sensitivities
• Adding details about sensitive variables or scenarios
• Obtaining better estimates of sensitive external variables
• Altering a real-world system to reduce actual sensitivities
• Accepting and using the sensitive (and hence vulnerable) real world, leading to the
continuous and close monitoring of actual results
Diff: 2 Page Ref: 417

68) What is the most common method for treating risk in decision trees and tables?
Answer: The most common method for handling risk in decision trees and tables is to select the
alternative with the greatest expected value.
Diff: 2 Page Ref: 421
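
The expected-value rule can be sketched in a few lines. The payoff figures, the state probabilities, and the alternative names below are illustrative assumptions, not data from any case in the chapter.

```python
# Illustrative decision table: payoff of each alternative in each state of nature.
payoffs = {
    "bonds":  {"growth": 12.0, "stagnation": 6.0, "inflation": 3.0},
    "stocks": {"growth": 15.0, "stagnation": 3.0, "inflation": -2.0},
    "cds":    {"growth": 6.5,  "stagnation": 6.5, "inflation": 6.5},
}
# Assumed probabilities of the states of nature; they must sum to 1.
probabilities = {"growth": 0.5, "stagnation": 0.3, "inflation": 0.2}


def expected_value(row, probs):
    """Probability-weighted average payoff of one alternative."""
    return sum(probs[state] * payoff for state, payoff in row.items())


expected = {alt: expected_value(row, probabilities) for alt, row in payoffs.items()}
best = max(expected, key=expected.get)  # alternative with the greatest expected value
print(best, expected)
```

With these assumed numbers, stocks have the highest payoff in one state, but bonds have the greatest expected value and are the alternative the rule selects.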

69) How are pairwise comparisons used in the analytic hierarchy process (AHP) to select an
alternative?
Answer:
• To obtain the weights of selection criteria, the decision maker conducts pairwise comparisons
of the criteria: first criterion to second, first to third, . . ., first to last; then, second to third, . . .,
second to last; . . .; and then the next-to-last criterion to the last one. This establishes the
importance of each criterion; that is, how much of the goal's weight is distributed to each
criterion.
• Beneath each criterion are the same sets of choices (alternatives) in the simple case described
here. Like the goal, the criteria decompose their weight into the choices, which capture 100
percent of the weight of each criterion. The decision maker performs a pairwise comparison of
choices in terms of preferences, as they relate to the specific criterion under consideration. Each
set of choices must be pairwise compared as they relate to each criterion. Finally, the results are
synthesized and displayed on a bar graph. The choice with the most weight is the correct choice.
Diff: 2 Page Ref: 424
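
The weighting and synthesis steps can be sketched as follows. The judgments, the criteria and vendor names, and the use of the row geometric mean to turn a pairwise comparison matrix into weights are all assumptions made for illustration; AHP software may derive weights differently (for example, from the principal eigenvector).

```python
from math import prod


def priorities(matrix):
    """Approximate weights from a reciprocal pairwise comparison matrix
    using the row geometric mean, normalized to sum to 1."""
    n = len(matrix)
    geometric_means = [prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geometric_means)
    return [g / total for g in geometric_means]


# Hypothetical judgments: entry [i][j] says how strongly item i is preferred to item j.
criteria = ["cost", "quality"]
criteria_matrix = [[1.0, 3.0],
                   [1 / 3, 1.0]]

choices = ["vendor A", "vendor B"]
choice_matrices = {
    "cost":    [[1.0, 1 / 2], [2.0, 1.0]],  # B is preferred on cost
    "quality": [[1.0, 4.0], [1 / 4, 1.0]],  # A is preferred on quality
}

# Synthesis: each criterion passes its weight down to the choices beneath it.
criteria_weights = priorities(criteria_matrix)
scores = [0.0] * len(choices)
for weight, criterion in zip(criteria_weights, criteria):
    local_weights = priorities(choice_matrices[criterion])
    for i, local_weight in enumerate(local_weights):
        scores[i] += weight * local_weight

best_choice = choices[scores.index(max(scores))]
print(dict(zip(choices, scores)), best_choice)
```

Because each set of local weights sums to 1, the final scores also sum to 1, which matches the idea that the choices capture 100 percent of each criterion's weight.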

70) In the CARE International end-of-chapter case study, what were the intended benefits of the
model used and the actual benefits after deployment?
Answer:
• The main purpose of the model was to increase the capacity and swiftness to respond to
sudden natural disasters like earthquakes, as opposed to other slow-occurring ones like famine.
Based on up-front cost, the model is able to provide the best optimized configuration of where to
locate a warehouse and how much inventory should be kept. It is able to provide an optimization
result based on estimates of frequency, location, and level of potential demand that is generated
by the model.
• Based on this model, CARE established three warehouses in the warehouse pre-positioning
system in Dubai, Panama, and Cambodia. In fact, during the Haiti earthquake crisis in 2010,
water purification kits were supplied to the victims from the Panama warehouse. In the future,
the pre-positioning network is expected to be expanded.
Diff: 2 Page Ref: 433

Business Intelligence and Analytics: Systems for Decision Support, 10e (Sharda)
Chapter 8 Web Analytics, Web Mining, and Social Analytics

1) Participating in social media is so new that it is still optional for most companies in the United
States.
Answer: FALSE
Diff: 2 Page Ref: 342

2) Web mining is exactly the same as Web analytics: the analysis of Web site usage data.
Answer: FALSE
Diff: 2 Page Ref: 343

3) Web crawlers or spiders collect information from Web pages in an automated or semi-
automated way. Only the text of Web pages is collected by crawlers.
Answer: FALSE
Diff: 2 Page Ref: 344

4) Generally, making a search engine more efficient makes it less effective.
Answer: TRUE
Diff: 2 Page Ref: 347

5) With the PageRank algorithm, a Web page with more incoming links will always rank higher
than one with fewer incoming links.
Answer: FALSE
Diff: 3 Page Ref: 350
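
A small sketch shows why the count of incoming links alone does not decide the ranking. It uses a simplified PageRank iteration with a damping factor of 0.85 (a conventional value, assumed here) on a hypothetical six-page link graph; production search engines use far more elaborate variants.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Simplified PageRank by repeated redistribution of rank.

    links maps each page to the pages it links to. This sketch assumes
    every page has at least one outgoing link and every target is a key.
    """
    pages = list(links)
    n = len(pages)
    ranks = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_ranks = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            share = damping * ranks[page] / len(outgoing)
            for target in outgoing:
                new_ranks[target] += share
        ranks = new_ranks
    return ranks


# Hypothetical graph: X has one incoming link, from the well-linked page H;
# Y has two incoming links, from pages that split their rank between H and Y.
links = {
    "p1": ["H", "Y"],
    "p2": ["H", "Y"],
    "p3": ["H"],
    "H": ["X"],
    "X": ["p1", "p2", "p3"],
    "Y": ["p3"],
}
incoming = {page: sum(page in out for out in links.values()) for page in links}
ranks = pagerank(links)
print(incoming["X"], incoming["Y"], ranks["X"] > ranks["Y"])
```

Here X outranks Y despite having fewer incoming links, because its single link comes from a page that itself carries a high rank and passes all of it along.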

6) The main purpose of frequent recrawling of some Web sites is to prevent search users from
retrieving stale search results.
Answer: TRUE
Diff: 3 Page Ref: 352

7) Search engine optimization (SEO) techniques play a minor role in a Web site's search ranking
because only well-written content matters.
Answer: FALSE
Diff: 2 Page Ref: 354

8) Clickstream analysis does not need users to enter their perceptions of the Web site or other
feedback directly to be useful in determining their preferences.
Answer: TRUE
Diff: 2 Page Ref: 358

9) Having more Web traffic coming from organic search than other types of search is the goal of
most companies.
Answer: TRUE
Diff: 2 Page Ref: 363

10) Since little can be done about visitor Web site abandonment rates, organizations have to
focus their efforts on increasing the number of new visitors.
Answer: FALSE
Diff: 2 Page Ref: 365

11) It is possible to use prescriptive tools for Web analytics to describe current Web site use
comprehensively.
Answer: FALSE
Diff: 2 Page Ref: 367

12) Many Web analytics tools are free to download and use, including Google Web Analytics.
Answer: TRUE
Diff: 2 Page Ref: 368

13) Voice of customer (VOC) applications track and resolve business process and usability
obstacles for a Web site.
Answer: FALSE
Diff: 2 Page Ref: 371

14) Social network analysis can help companies divide their customers into market segments by
analyzing their interconnections.
Answer: TRUE
Diff: 2 Page Ref: 377

15) Decentralization, the need for specialized skills, and immediacy of output are all attributes of
Web publishing when compared to industrial publishing.
Answer: FALSE
Diff: 2 Page Ref: 378

16) Consistent high quality, higher publishing frequency, and longer time lag are all attributes of
industrial publishing when compared to Web publishing.
Answer: FALSE
Diff: 2 Page Ref: 378

17) Web site visitors who critique and create content are more engaged than those who join
networks and spectate.
Answer: TRUE
Diff: 1 Page Ref: 379

18) Descriptive analytics for social media feature such items as your followers as well as the
content in online conversations that help you to identify themes and sentiments.
Answer: FALSE
Diff: 2 Page Ref: 381

19) Companies understand that when their product goes "viral," the content of the online
conversations about their product does not matter, only the volume of conversations.
Answer: FALSE
Diff: 3 Page Ref: 382

20) Social media analytics companies provide integrated support that is helpful to many parts of
a business, not only the Sales and Marketing functions.
Answer: TRUE
Diff: 1 Page Ref: 382

21) What does Web content mining involve?
A) analyzing the universal resource locator in Web pages
B) analyzing the unstructured content of Web pages
C) analyzing the pattern of visits to a Web site
D) analyzing the PageRank and other metadata of a Web page
Answer: B
Diff: 2 Page Ref: 344

22) What does Web structure mining involve?
A) analyzing the universal resource locators in Web pages
B) analyzing the unstructured content of Web pages
C) analyzing the pattern of visits to a Web site
D) analyzing the PageRank and other metadata of a Web page
Answer: A
Diff: 2 Page Ref: 344-346

23) In the extremist groups case study, what approach is used to discover the ideology and fund
raising of extremist groups through their Web sites?
A) hyperlink analysis
B) e-mail responses to questions sent to the sites
C) physical visits to addresses on the site
D) content analysis
Answer: D
Diff: 2 Page Ref: 346

24) Search engines do not search the entire Web every time a user makes a search request, for all
the following reasons EXCEPT
A) the Web is too complex to be searched each time.
B) it would take longer than the user could wait.
C) most users are not interested in searching the entire Web.
D) it is more efficient to use pre-stored search results.
Answer: C
Diff: 3 Page Ref: 348

25) Breaking up a Web page into its components to identify worthy words/terms and indexing
them using a set of rules is called
A) preprocessing the documents.
B) document analysis.
C) creating the term-by-document matrix.
D) parsing the documents.
Answer: D
Diff: 3 Page Ref: 348

26) PageRank for Web pages is useful to Web developers for which of the following reasons?
A) It gives developers insight into Web user behavior.
B) It is used in citation analysis for scholarly papers.
C) Developing many Web pages with low PageRank can help a Web site attract users.
D) They uniquely identify the Web page developer for greater accountability.
Answer: A
Diff: 3 Page Ref: 350

27) Search engine optimization (SEO) is a means by which
A) Web site developers can negotiate better deals for paid ads.
B) Web site developers can increase Web site search rankings.
C) Web site developers index their Web sites for search engines.
D) Web site developers optimize the artistic features of their Web sites.
Answer: B
Diff: 2 Page Ref: 354

28) In general, what is the best kind of Web traffic to a Web site?
A) European Web traffic
B) paid Web traffic
C) bot-generated traffic
D) organic Web traffic
Answer: D
Diff: 2 Page Ref: 356

29) Clickstream analysis is most likely to be used for all the following types of applications
EXCEPT
A) determining the lifetime value of clients.
B) hiring new functional area managers.
C) designing cross-marketing strategies across products.
D) predicting user behavior.
Answer: B
Diff: 2 Page Ref: 358

30) What are the two main types of Web analytics?
A) old-school and new-school Web analytics
B) Bing and Google Web analytics
C) off-site and on-site Web analytics
D) data-based and subjective Web analytics
Answer: C
Diff: 3 Page Ref: 359

31) Web site usability may be rated poor if
A) the average number of page views on your Web site is large.
B) the time spent on your Web site is long.
C) Web site visitors download few of your offered PDFs and videos.
D) users fail to click on all pages equally.
Answer: C
Diff: 2 Page Ref: 363

32) Common sources of traffic to your Web site include all of the following EXCEPT
A) paid search from search engines.
B) referral Web sites.
C) accidental visitors.
D) direct links.
Answer: C
Diff: 2 Page Ref: 363

33) Understanding which keywords your users enter to reach your Web site through a search
engine can help you understand
A) the hardware your Web site is running on.
B) the type of Web browser being used by your Web site visitors.
C) most of your Web site visitors' wants and needs.
D) how well visitors understand your products.
Answer: D
Diff: 3 Page Ref: 363

34) Which of the following statements about Web site conversion statistics is FALSE?
A) Web site visitors can be classed as either new or returning.
B) Visitors who begin a purchase on most Web sites must complete it.
C) The conversion rate is the number of people who take action divided by the number of
visitors.
D) Analyzing exit rates can tell you why visitors left your Web site.
Answer: B
Diff: 3 Page Ref: 364-365
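
The conversion rate described in option C is a simple ratio, sketched below. The page names and visitor counts are hypothetical, and what counts as "taking action" is whatever the organization has defined as its conversion.

```python
def conversion_rate(conversions, visitors):
    """Number of people who took the desired action divided by the number of visitors."""
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    return conversions / visitors


# Hypothetical weekly figures: (conversions, visitors) per landing page.
pages = {"pricing": (48, 1_200), "signup": (90, 1_500)}
rates = {name: conversion_rate(c, v) for name, (c, v) in pages.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")
```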

35) A voice of customer (VOC) strategy involves all of the following EXCEPT
A) connecting captured insights to unstructured data in order to take action.
B) capturing both unstructured Web data and enterprise data as a starting point.
C) analyzing unstructured data with minimal effort on the user's part.
D) taking actions related to your market, customers and services.
Answer: A
Diff: 3 Page Ref: 372-373

36) All of the following statements about social networks are true EXCEPT
A) it is possible to gain insights into how products go viral.
B) companies should invest equally to retain all members of a group.
C) members of a group are affected by the behavior of others in the group.
D) a group with all interconnected individuals is called a clique.
Answer: B
Diff: 2 Page Ref: 375-377

37) What is one major way that Web-based social media is the same as publishing media?
A) They cost the same to publish.
B) They have the same immediacy of updates.
C) They require the same skill and training to publish.
D) They can both reach a global audience.
Answer: D
Diff: 3 Page Ref: 378

38) What is one major way in which Web-based social media differs from traditional publishing
media?
A) Most Web-based media are operated by the government and large firms.
B) They use different languages of publication.
C) They have different costs to own and operate.
D) Web-based media have a narrower range of quality.
Answer: C
Diff: 3 Page Ref: 378

39) What does descriptive analytics for social media do?
A) It helps identify your followers.
B) It identifies links between groups.
C) It examines the content of online conversations.
D) It identifies the biggest sources of influence online.
Answer: A
Diff: 2 Page Ref: 381

40) What does advanced analytics for social media do?
A) It helps identify your followers.
B) It identifies links between groups.
C) It examines the content of online conversations.
D) It identifies the biggest sources of influence online.
Answer: C
Diff: 2 Page Ref: 381

41) The ________ is perhaps the world's largest data and text repository, and the amount of
information on it is growing rapidly.
Answer: Web
Diff: 1 Page Ref: 342

42) Web pages contain both unstructured information and ________, which are connections to
other Web pages.
Answer: hyperlinks
Diff: 1 Page Ref: 342

43) Web ________ involves discovering relationships from Web pages.
Answer: mining
Diff: 2 Page Ref: 344

44) Web ________ are used to automatically read through the contents of Web sites.
Answer: crawlers/spiders
Diff: 1 Page Ref: 344

45) A(n) ________ is one or more Web pages that provide a collection of links to authoritative
Web pages.
Answer: hub
Diff: 1 Page Ref: 345

46) A(n) ________ engine is a software program that searches for Web sites or files based on
keywords.
Answer: search
Diff: 1 Page Ref: 347

47) In the IGN case, IGN Entertainment used search engine optimization to increase their search
engine rankings and thereby their ________ search engine traffic.
Answer: organic
Diff: 1 Page Ref: 354

48) ________ is far and away the most popular search engine.
Answer: Google
Diff: 2 Page Ref: 355

49) In the Lotte.com retail case, the company deployed SAS for Customer Experience Analytics
to better understand the quality of customer traffic on their Web site, classify order rates, and see
which ________ had the most visitors.
Answer: channels
Diff: 2 Page Ref: 357

50) ________ Web analytics refers to measurement and analysis of data relating to your
company that takes place outside your Web site.
Answer: Off-site
Diff: 1 Page Ref: 359

51) Analyzing server ________ files is the traditional way to collect Web site information for
on-site Web analytics.
Answer: log
Diff: 2 Page Ref: 360

52) A low number of ________ views may be the result of poor Web site design.
Answer: page
Diff: 2 Page Ref: 362

53) A ________ Web site contains links that send traffic directly to your Web site.
Answer: referral
Diff: 2 Page Ref: 363

54) ________ statistics help you understand whether your specific marketing objective for a
Web page is being achieved.
Answer: Conversion
Diff: 1 Page Ref: 364

55) Google Web ________ generates detailed statistics about a Web site's traffic and traffic
sources and tracks conversions.
Answer: Analytics
Diff: 1 Page Ref: 368

56) Social networks consist of nodes, representing individuals or organizations and ________,
which relate them.
Answer: ties/connections
Diff: 1 Page Ref: 375

57) In the Social Network Analysis (SNA) for Telecommunications case, SNA can be used to
detect ________, i.e., those visitors who are about to leave the Web site, and persuade them to
stay with you.
Answer: churners
Diff: 2 Page Ref: 375-376

58) ________ is a connections metric for social networks that measures the ties that actors in a
network have with others that are geographically close.
Answer: Propinquity
Diff: 1 Page Ref: 376

59) ________ is a segmentation metric for social networks that measures the strength of the
bonds between actors in a social network.
Answer: Cohesion
Diff: 1 Page Ref: 377

60) Advanced ________ examine the content in online conversations to identify themes,
sentiments, and connections.
Answer: analytics
Diff: 1 Page Ref: 381

61) In what ways does the Web pose great challenges for effective and efficient knowledge
discovery through data mining?
Answer:
• The Web is too big for effective data mining. The Web is so large and growing so rapidly
that it is difficult to even quantify its size. Because of the sheer size of the Web, it is not feasible
to set up a data warehouse to replicate, store, and integrate all of the data on the Web, making
data collection and integration a challenge.
• The Web is too complex. The complexity of a Web page is far greater than that of a page in a
traditional text document collection. Web pages lack a unified structure. They contain far more
authoring style and content variation than any set of books, articles, or other traditional
text-based documents.
• The Web is too dynamic. The Web is a highly dynamic information source. Not only does
the Web grow rapidly, but its content is constantly being updated. Blogs, news stories, stock
market results, weather reports, sports scores, prices, company advertisements, and numerous
other types of information are updated regularly on the Web.
• The Web is not specific to a domain. The Web serves a broad diversity of communities and
connects billions of workstations. Web users have very different backgrounds, interests, and
usage purposes. Most users may not have good knowledge of the structure of the information
network and may not be aware of the heavy cost of a particular search that they perform.
• The Web has everything. Only a small portion of the information on the Web is truly
relevant or useful to someone (or some task). Finding the portion of the Web that is truly relevant
to a person and the task being performed is a prominent issue in Web-related research.
Diff: 2 Page Ref: 342-343

62) What is a Web crawler and what function does it serve in a search engine?
Answer: A Web crawler (also called a spider or a Web spider) is a piece of software that
systematically browses (crawls through) the World Wide Web for the purpose of finding and
fetching Web pages. Often Web crawlers copy all the pages they visit for later processing by
other functions of a search engine.
Diff: 2 Page Ref: 348
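
The find, fetch, and copy cycle can be sketched with the Python standard library. The class and function names, the breadth-first visiting order, and the tiny in-memory "Web" are assumptions made for illustration; a real crawler would fetch pages over HTTP and respect politeness rules such as robots.txt.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the href targets of the anchor tags in one page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(fetch, seed, max_pages=100):
    """Breadth-first crawl that stores a copy of every page reached from seed.

    fetch is a caller-supplied function mapping a URL to HTML text, or None
    when the page cannot be retrieved.
    """
    stored = {}
    queue = deque([seed])
    while queue and len(stored) < max_pages:
        url = queue.popleft()
        if url in stored:
            continue
        html = fetch(url)
        if html is None:
            continue
        stored[url] = html  # copy kept for later indexing
        parser = LinkExtractor()
        parser.feed(html)
        queue.extend(link for link in parser.links if link not in stored)
    return stored


# A hypothetical four-page "Web" held in memory, so no network access is needed.
tiny_web = {
    "/home": '<a href="/news">News</a> <a href="/about">About</a>',
    "/news": '<a href="/home">Home</a>',
    "/about": "<p>No links here.</p>",
    "/orphan": '<a href="/home">Home</a>',
}
copies = crawl(tiny_web.get, "/home")
print(sorted(copies))
```

The page "/orphan" is never stored because no crawled page links to it, which illustrates that a crawler only discovers what is reachable through hyperlinks from its starting points.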

63) What is search engine optimization (SEO) and why is it important for organizations that own
Web sites?
Answer: Search engine optimization (SEO) is the intentional activity of affecting the visibility
of an e-commerce site or a Web site in a search engine's natural (unpaid or organic) search
results. In general, the higher ranked on the search results page, and more frequently a site
appears in the search results list, the more visitors it will receive from the search engine's users.
Being indexed by search engines like Google, Bing, and Yahoo! is not good enough for
businesses. Getting ranked on the most widely used search engines and getting ranked higher
than your competitors are what make the difference.
Diff: 3 Page Ref: 354-355

64) What is the difference between white hat and black hat SEO activities?
Answer: An SEO technique is considered white hat if it conforms to the search engines'
guidelines and involves no deception. Because search engine guidelines are not written as a
series of rules or commandments, this is an important distinction to note. White-hat SEO is not
just about following guidelines, but about ensuring that the content a search engine indexes and
subsequently ranks is the same content a user will see.
Black-hat SEO attempts to improve rankings in ways that the search engines disapprove of,
or that involve deception or attempts to divert search engine algorithms from their intended
purpose.
Diff: 3 Page Ref: 355-356

65) How would you define clickstream analysis?
Answer: Clickstream analysis is the analysis of information collected by Web servers to help
companies understand user behavior better. By using data and text mining techniques,
companies can frequently discern interesting patterns from the clickstreams.
Data collected from clickstreams include user data, session data, which pages users viewed,
and when and how often they visited. Knowledge extracted from clickstreams includes usage
patterns, user profiles, page profiles, visit profiles, and customer value.
Diff: 3 Page Ref: 358

66) Why are the users' page views and time spent on your Web site important metrics?
Answer: If people come to your Web site and don't view many pages, that is undesirable and
your Web site may have issues with its design or structure. Another explanation for low page
views is a disconnect between the marketing messages that brought them to the site and the
content that is actually available.
Generally, the longer a person spends on your Web site, the better it is. That could mean
they're carefully reviewing your content, utilizing interactive components you have available,
and building toward an informed decision to buy, respond, or take the next step you've provided.
On the other hand, the time on site also needs to be examined against the number of pages viewed
to make sure the visitor isn't spending his or her time trying to locate content that should be more
readily accessible.
Diff: 3 Page Ref: 362-363

67) How is a conversion defined on an organization's Web site? Give examples.
Answer: Each organization defines a "conversion" according to its specific marketing
objectives. Some Web analytics programs use the term "goal" to benchmark certain Web site
objectives, such as a certain number of visitors to a page, a completed registration form, or an
online purchase.
Diff: 2 Page Ref: 364

68) What is the Voice of the customer (VOC) strategy? List and describe its 4 steps.
Answer: Voice of the customer (VOC) is a term usually used to describe the analytic process of
capturing a customer's expectations, preferences, and aversions. It essentially is a market
research technique that produces a detailed set of customer wants and needs, organized into a
hierarchical structure, and then prioritized in terms of relative importance and satisfaction with
current alternatives.
• Listen: Encompasses both the capability to listen to the open Web (forums, blogs, tweets,
and so on) and the capability to seamlessly access enterprise information (CRM notes,
documents, e-mails, etc.).
• Analyze: Takes all of the unstructured data and makes sense of it. Solutions include
keyword, statistical, and natural language approaches that will allow you to essentially tag or
barcode every word and the relationships among words, making it data that can be accessed,
searched, routed, counted, analyzed, charted, reported on, and even reused.
• Relate: After finding insights and analyzing unstructured data, here you connect those
insights to your "structured" data about your customers, products, parts, locations, and so on.
• Act: In this step, you act on the new customer insight you've obtained.
Diff: 2 Page Ref: 372-373

69) What are the three categories of social media analytics technologies and what do they do?
Answer:
• Descriptive analytics: Uses simple statistics to identify activity characteristics and trends,
such as how many followers you have, how many reviews were generated on Facebook, and
which channels are being used most often.
• Social network analysis: Follows the links between friends, fans, and followers to identify
connections of influence as well as the biggest sources of influence.
• Advanced analytics: Includes predictive analytics and text analytics that examine the
content in online conversations to identify themes, sentiments, and connections that would not be
revealed by casual surveillance.
Diff: 2 Page Ref: 381

70) In social network analysis, who are your most powerful influencers and why are they
important?
Answer: Your most important influencers are the ones who influence the whole realm of
conversation about your topic. You need to understand whether they are saying nice things,
expressing support, or simply making observations or critiquing. What is the nature of their
conversations? How is your brand being positioned relative to the competition in that space?
Diff: 2 Page Ref: 382

Business Intelligence and Analytics: Systems for Decision Support, 10e (Sharda)
Chapter 7 Text Analytics, Text Mining, and Sentiment Analysis

1) In the chapter's opening vignette, IBM's computer named Watson outperformed human game
champions on the game show Jeopardy!
Answer: TRUE
Diff: 2 Page Ref: 290

2) Text analytics is the subset of text mining that handles information retrieval and extraction,
plus data mining.
Answer: FALSE
Diff: 2 Page Ref: 292

3) In text mining, inputs to the process include unstructured data such as Word documents, PDF
files, text excerpts, e-mail and XML files.
Answer: TRUE
Diff: 2 Page Ref: 293

4) During information extraction, entity recognition (the recognition of names of people and
organizations) takes place after relationship extraction.
Answer: FALSE
Diff: 2 Page Ref: 293

5) Categorization and clustering of documents during text mining differ only in the preselection
of categories.
Answer: TRUE
Diff: 2 Page Ref: 293

6) Articles and auxiliary verbs are assigned little value in text mining and are usually filtered out.
Answer: TRUE
Diff: 2 Page Ref: 294
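
The filtering described in question 6 can be sketched in a few lines. This is a minimal illustration, assuming a tiny hand-picked stop list; the list contents and the function name `remove_stop_words` are illustrative and are not taken from the textbook.

```python
# Illustrative stop list: a few articles and auxiliary verbs (an assumption;
# real stop lists are much longer and often domain-specific).
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "were", "am", "be", "been"}


def remove_stop_words(text):
    """Lowercase the text, split on whitespace, and drop stop words."""
    return [word for word in text.lower().split() if word not in STOP_WORDS]


print(remove_stop_words("The model is an example"))  # ['model', 'example']
```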

7) In the patent analysis case study, text mining of thousands of patents held by the firm and its
competitors helped improve competitive intelligence, but was of little use in identifying
complementary products.
Answer: FALSE
Diff: 2 Page Ref: 295

8) The bag-of-words model is appropriate for spam detection but not for text analytics.
Answer: TRUE
Diff: 2 Page Ref: 296-297

9) Chinese, Japanese, and Thai have features that make them more difficult candidates for
natural language processing.
Answer: TRUE
Diff: 2 Page Ref: 297


10) Regional accents present challenges for natural language processing.
Answer: TRUE
Diff: 2 Page Ref: 297

11) In the Hong Kong government case study, reporting time was the main benefit of using SAS
Business Analytics to generate reports.
Answer: TRUE
Diff: 2 Page Ref: 299

12) Detecting lies from text transcripts of conversations is a future goal of text mining as current
systems achieve only 50% accuracy of detection.
Answer: FALSE
Diff: 2 Page Ref: 301

13) In the financial services firm case study, text analysis of associate-customer interactions
was completely automated and could detect whether the interactions met the company's standards.
Answer: TRUE
Diff: 2 Page Ref: 306

14) In text mining, creating the term-document matrix includes all the terms that are included in
all documents, making for huge matrices only manageable on computers.
Answer: FALSE
Diff: 2 Page Ref: 310

15) In text mining, if an association between two concepts has 7% support, it means that 7% of
the documents had both concepts represented in the same document.
Answer: TRUE
Diff: 2 Page Ref: 313
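
The support measure in question 15 is a simple ratio, which a short sketch can make concrete. The toy documents and the function name `support` below are assumptions made for illustration only.

```python
def support(documents, concept_a, concept_b):
    """Fraction of documents that mention both concepts."""
    both = sum(1 for doc in documents if concept_a in doc and concept_b in doc)
    return both / len(documents)


# Each document is represented as the set of concepts found in it (toy data).
docs = [
    {"price", "quality"},
    {"price"},
    {"quality", "service"},
    {"price", "quality", "service"},
]

print(support(docs, "price", "quality"))  # 2 of 4 documents -> 0.5
```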

16) In sentiment analysis, sentiment suggests a transient, temporary opinion reflective of one's
feelings.
Answer: FALSE
Diff: 2 Page Ref: 320

17) Current use of sentiment analysis in voice of the customer applications allows companies to
change their products or services in real time in response to customer sentiment.
Answer: TRUE
Diff: 2 Page Ref: 323

18) In sentiment analysis, it is hard to classify some subjects such as news as good or bad, but
easier to classify others, e.g., movie reviews, in the same way.
Answer: TRUE
Diff: 2 Page Ref: 326


19) The linguistic approach to speech processes elements such as intensity, pitch, and
jitter from speech recorded on audio.
Answer: FALSE
Diff: 2 Page Ref: 330

20) In the BBVA case study, text analytics was used to help the company defend and enhance its
reputation in social media.
Answer: TRUE
Diff: 2 Page Ref: 336

21) In the opening vignette, the architectural system that supported Watson used all the
following elements EXCEPT
A) massive parallelism to enable simultaneous consideration of multiple hypotheses.
B) an underlying confidence subsystem that ranks and integrates answers.
C) a core engine that could operate seamlessly in another domain without changes.
D) integration of shallow and deep knowledge.
Answer: C
Diff: 3 Page Ref: 290

22) According to a study by Merrill Lynch and Gartner, what percentage of all corporate data is
captured and stored in some sort of unstructured form?
A) 15%
B) 75%
C) 25%
D) 85%
Answer: D
Diff: 2 Page Ref: 291

23) Which of these applications will derive the LEAST benefit from text mining?
A) patients' medical files
B) patent description files
C) sales transaction files
D) customer comment files
Answer: C
Diff: 3 Page Ref: 293

24) In text mining, stemming is the process of
A) categorizing a block of text in a sentence.
B) reducing multiple words to their base or root.
C) transforming the term-by-document matrix to a manageable size.
D) creating new branches or stems of recorded paragraphs.
Answer: B
Diff: 2 Page Ref: 294
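
As a rough illustration of stemming as defined in question 24, the sketch below strips a few common English suffixes. It is a deliberately naive suffix stripper, not the Porter algorithm or any stemmer named in the textbook; the suffix list and the name `naive_stem` are assumptions.

```python
SUFFIXES = ("ing", "ed", "es", "s")  # checked in this order; illustrative only


def naive_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


print([naive_stem(w) for w in ["walking", "walked", "walks", "walk"]])
# ['walk', 'walk', 'walk', 'walk']
```

All four surface forms collapse to one root, so they would count as a single term in later steps.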


25) In text mining, tokenizing is the process of
A) categorizing a block of text in a sentence.
B) reducing multiple words to their base or root.
C) transforming the term-by-document matrix to a manageable size.
D) creating new branches or stems of recorded paragraphs.
Answer: A
Diff: 2 Page Ref: 294

26) All of the following are challenges associated with natural language processing EXCEPT
A) dividing up a text into individual words in English.
B) understanding the context in which something is said.
C) distinguishing between words that have more than one meaning.
D) recognizing typographical or grammatical errors in texts.
Answer: A
Diff: 3 Page Ref: 297

27) What application is MOST dependent on text analysis of transcribed sales call center notes
and voice conversations with customers?
A) finance
B) OLAP
C) CRM
D) ERP
Answer: C
Diff: 3 Page Ref: 301

28) In text mining, which of the following methods is NOT used to reduce the size of a sparse
matrix?
A) using a domain expert
B) normalizing word frequencies
C) using singular value decomposition
D) eliminating rarely occurring terms
Answer: B
Diff: 3 Page Ref: 311

29) What data discovery process, whereby objects are categorized into predetermined groups, is
used in text mining?
A) clustering
B) association
C) classification
D) trend analysis
Answer: C
Diff: 2 Page Ref: 312-313


30) In the research literature case study, the researchers analyzing academic papers extracted
information from which source?
A) the paper abstract
B) the paper keywords
C) the main body of the paper
D) the paper references
Answer: A
Diff: 1 Page Ref: 314

31) Sentiment classification usually covers all the following issues EXCEPT
A) classes of sentiment (e.g., positive versus negative).
B) range of polarity (e.g., star ratings for hotels and for restaurants).
C) range in strength of opinion.
D) biometric identification of the consumer expressing the sentiment.
Answer: D
Diff: 2 Page Ref: 320

32) In sentiment analysis, which of the following is an implicit opinion?
A) The hotel we stayed in was terrible.
B) The customer service I got for my TV was laughable.
C) The cruise we went on last summer was a disaster.
D) Our new mayor is great for the city.
Answer: B
Diff: 3 Page Ref: 320

33) In the Whirlpool case study, the company sought to better understand information coming
from which source?
A) customer transaction data
B) delivery information
C) customer e-mails
D) goods moving through the internal supply chain
Answer: C
Diff: 2 Page Ref: 322

34) What do voice of the market (VOM) applications of sentiment analysis do?
A) They examine customer sentiment at the aggregate level.
B) They examine employee sentiment in the organization.
C) They examine the stock market for trends.
D) They examine the "market of ideas" in politics.
Answer: A
Diff: 3 Page Ref: 323


35) How is objectivity handled in sentiment analysis?
A) It is ignored because it does not appear in customer sentiment.
B) It is incorporated as a type of sentiment.
C) It is clarified with the customer who expressed it.
D) It is identified and removed as facts are not sentiment.
Answer: D
Diff: 3 Page Ref: 325

36) Identifying the target of an expressed sentiment is difficult for all the following reasons
EXCEPT
A) the review may not be directly connected to the target through the topic name.
B) blogs and articles with the sentiment may be general in nature.
C) strong sentiments may be generated by a computer, not a person.
D) sometimes there are multiple targets expressed in a sentiment.
Answer: C
Diff: 3 Page Ref: 326

37) In text analysis, what is a lexicon?
A) a catalog of words, their synonyms, and their meanings
B) a catalog of customers, their words, and phrases
C) a catalog of letters, words, phrases and sentences
D) a catalog of customers, products, words, and phrases
Answer: A
Diff: 3 Page Ref: 327

38) What types of documents are BEST suited to semantic labeling and aggregation to determine
sentiment orientation?
A) medium- to large-sized documents
B) small- to medium-sized documents
C) large-sized documents
D) collections of documents
Answer: B
Diff: 3 Page Ref: 328

39) Inputs to speech analytics include all of the following EXCEPT
A) written transcripts of calls to service centers.
B) recorded conversations of customer call-ins.
C) live customer interactions with service representatives.
D) videos of customer focus groups.
Answer: A
Diff: 2 Page Ref: 329


40) In the Blue Cross Blue Shield case study, speech analytics were used to identify "confusion"
calls by customers. What was true about these calls?
A) They took less time than others as frustrated customers hung up.
B) They led customers to rely more on self-serve options.
C) They were not documented by customer service reps for speech analytics.
D) They were difficult to identify using standard phrases like "I don't get it."
Answer: C
Diff: 3 Page Ref: 332

41) IBM's Watson utilizes a massively parallel, text mining-focused, probabilistic
evidence-based computational architecture called ________.
Answer: DeepQA
Diff: 2 Page Ref: 290

42) ________ is probably the most often used form of information extraction.
Answer: Named entity extraction
Diff: 2 Page Ref: 293

43) ________, also called homonyms, are syntactically identical words with different meanings.
Answer: Polysemes
Diff: 2 Page Ref: 294

44) When a word has more than one meaning, selecting the meaning that makes the most sense
can only be accomplished by taking into account the context within which the word is used. This
concept is known as ________.
Answer: word sense disambiguation
Diff: 3 Page Ref: 297

45) ________ is a technique used to detect favorable and unfavorable opinions toward specific
products and services using large numbers of textual data sources.
Answer: Sentiment analysis
Diff: 2 Page Ref: 298

46) In the text mining system developed by Ghani et al., treating products as sets of ________
rather than as atomic entities can potentially boost the effectiveness of many business
applications.
Answer: attribute-value pairs
Diff: 3 Page Ref: 301

47) In the Mining for Lies case study, a text based deception-detection method used by Fuller
and others in 2008 was based on a process known as ________, which relies on elements of data
and text mining techniques.
Answer: message-feature mining
Diff: 2 Page Ref: 302


48) At a very high level, the text mining process can be broken down into three consecutive
tasks, the first of which is to establish the ________.
Answer: Corpus
Diff: 2 Page Ref: 308

49) Because the term-document matrix is often very large and rather sparse, an important
optimization step is to reduce the ________ of the matrix.
Answer: dimensionality
Diff: 2 Page Ref: 311
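
To make the size problem in question 49 concrete, the sketch below builds a small term-document matrix and then drops rarely occurring terms, one of the reduction methods mentioned in this chapter. The orientation (one row per document, one column per term), the toy documents, and the function names are assumptions made for illustration.

```python
from collections import Counter


def term_document_matrix(documents):
    """Return (terms, matrix) where matrix[i][j] counts term j in document i."""
    counts = [Counter(doc.lower().split()) for doc in documents]
    terms = sorted(set().union(*counts))
    matrix = [[c[t] for t in terms] for c in counts]
    return terms, matrix


def drop_rare_terms(terms, matrix, min_docs=2):
    """Keep only terms that occur in at least min_docs documents."""
    keep = [
        j for j in range(len(terms))
        if sum(1 for row in matrix if row[j] > 0) >= min_docs
    ]
    return [terms[j] for j in keep], [[row[j] for j in keep] for row in matrix]


docs = ["text mining finds patterns", "text analytics uses mining", "sparse matrix"]
terms, matrix = term_document_matrix(docs)
small_terms, small_matrix = drop_rare_terms(terms, matrix)

print(len(terms), "->", len(small_terms), small_terms)  # 8 -> 2 ['mining', 'text']
```

Even with three short documents, most cells are zero, which is why dimensionality reduction matters on realistic corpora.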

50) Where ________ appears in text, it comes in two flavors: explicit, where the subjective
sentence directly expresses an opinion, and implicit, where the text implies an opinion.
Answer: sentiment
Diff: 2 Page Ref: 320

51) ________ is mostly driven by sentiment analysis and is a key element of customer
experience management initiatives, where the goal is to create an intimate relationship with the
customer.
Answer: Voice of the customer (VOC)
Diff: 2 Page Ref: 323

52) ________ focuses on listening to social media where anyone can post opinions that can
damage or boost your reputation.
Answer: Brand management
Diff: 2 Page Ref: 323

53) In sentiment analysis, the task of differentiating between a fact and an opinion can also be
characterized as calculation of ________ polarity.
Answer: Objectivity-Subjectivity (OS)
Diff: 3 Page Ref: 325

54) When identifying the polarity of text, the most granular level for polarity identification is at
the ________ level.
Answer: word
Diff: 1 Page Ref: 326

55) When viewed as a binary feature, ________ classification is the binary classification task of
labeling an opinionated document as expressing either an overall positive or an overall negative
opinion.
Answer: polarity
Diff: 2 Page Ref: 326

56) When labeling each term in the WordNet lexical database, the group of cognitive synonyms
(or synset) to which this term belongs is classified using a set of ________, each of which is
capable of deciding whether the synset is Positive, or Negative, or Objective.
Answer: ternary classifiers
Diff: 3 Page Ref: 327

57) In automated sentiment analysis, two primary methods have been deployed to predict
sentiment within audio: acoustic/phonetic and ________ modeling.
Answer: linguistic
Diff: 2 Page Ref: 329

58) The time-demanding and laborious process of the ________ approach makes it impractical
for use with live audio streams.
Answer: acoustic/phonetic
Diff: 2 Page Ref: 330

59) ________ models operate on the premise that, when in a charged state, a speaker has a
higher probability of using specific words, exclamations, or phrases in a particular order.
Answer: Linguistic
Diff: 2 Page Ref: 330

60) Among the significant advantages associated with the ________ approach to linguistic
modeling is the method's ability to maintain a high degree of accuracy no matter what the quality
of the audio source, and its incorporation of conversational context through the use of structured
queries.
Answer: phonetic indexing and search
Diff: 2 Page Ref: 331

61) When IBM Research began looking for a major research challenge to rival the scientific and
popular interest of Deep Blue, the computer chess-playing champion, what was the company's
goal?
Answer: The goal was to advance computer science by exploring new ways for computer
technology to affect science, business, and society, in a way that would also have clear
relevance to IBM's business interests.
Diff: 2 Page Ref: 289

62) What is the definition of text analytics according to the experts in the field?
Answer: Text analytics is a broader concept that includes information retrieval as well as
information extraction, data mining, and Web mining.
Diff: 2 Page Ref: 292

63) How would you describe information extraction in text mining?
Answer: Information extraction is the identification of key phrases and relationships within text
by looking for predefined objects and sequences in text by way of pattern matching.
Diff: 2 Page Ref: 293


64) Natural language processing (NLP), a subfield of artificial intelligence and computational
linguistics, is an important component of text mining. What is the definition of NLP?
Answer: NLP is a discipline that studies the problem of "understanding" the natural human
language, with the view of converting depictions of human language into more formal
representations in the form of numeric and symbolic data that are easier for computer programs
to manipulate.
Diff: 2 Page Ref: 297

65) In the security domain, one of the largest and most prominent text mining applications is the
highly classified ECHELON surveillance system. What is ECHELON assumed to be capable of
doing?
Answer: Identifying the content of telephone calls, faxes, e-mails, and other types of data and
intercepting information sent via satellites, public switched telephone networks, and microwave
links.
Diff: 2 Page Ref: 301

66) Describe the query-specific clustering method as it relates to clustering.
Answer: This method employs a hierarchical clustering approach where the most relevant
documents to the posed query appear in small tight clusters that are nested in larger clusters
containing less similar documents, creating a spectrum of relevance levels among the documents.
Diff: 3 Page Ref: 313

67) Name and briefly describe four of the most popular commercial software tools used for text
mining.
Answer:
• ClearForest offers text analysis and visualization tools.
• IBM offers SPSS Modeler and data and text analytics toolkits.
• Megaputer Text Analyst offers semantic analysis of free-form text, summarization,
clustering, navigation, and natural language retrieval with search dynamic refocusing.
• SAS Text Miner provides a rich suite of text processing and analysis tools.
• KXEN Text Coder (KTC) offers a text analytics solution for automatically preparing and
transforming unstructured text attributes into a structured representation for use in KXEN
Analytic Framework.
• The Statistica Text Mining engine provides easy-to-use text mining functionality with
exceptional visualization capabilities.
• VantagePoint provides a variety of interactive graphical views and analysis tools with
powerful capabilities to discover knowledge from text databases.
• The WordStat analysis module from Provalis Research analyzes textual information such as
responses to open-ended questions, interviews, etc.
• Clarabridge text mining software provides end-to-end solutions for customer experience
professionals wishing to transform customer feedback for marketing, service, and product
improvements.
Diff: 3 Page Ref: 317


68) Sentiment analysis has many names. Which other names is it often known by?
Answer: Sentiment analysis is often referred to as opinion mining, subjectivity analysis, and
appraisal extraction.
Diff: 2 Page Ref: 320

69) Identify, with a brief description, each of the four steps in the sentiment analysis process.
Answer:
1. Sentiment Detection: Here the goal is to differentiate between a fact and an opinion, which
may be viewed as classification of text as objective or subjective.
2. N-P Polarity Classification: Given an opinionated piece of text, the goal is to classify the
opinion as falling under one of two opposing sentiment polarities, or locate its position on the
continuum between these two polarities.
3. Target Identification: The goal of this step is to accurately identify the target of the expressed
sentiment.
4. Collection and Aggregation: In this step all text data points in the document are aggregated
and converted to a single sentiment measure for the whole document.
Diff: 2 Page Ref: 325-326
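
The four steps in question 69 can be sketched with a toy lexicon-based pipeline. This is only an illustration under stated assumptions: the `LEXICON` contents, the keyword-match approach to target identification, and the function name `analyze` are all hypothetical, and the sketch aggregates one score per target rather than one score for the whole document.

```python
# Hypothetical polarity lexicon; real systems use far larger resources.
LEXICON = {"great": 1, "good": 1, "terrible": -1, "bad": -1}


def analyze(sentences, targets):
    """Return one aggregate polarity score per target over all sentences."""
    totals = {}
    for sentence in sentences:
        words = sentence.lower().replace(".", "").split()
        # Step 1: sentiment detection (skip sentences with no opinion words).
        scores = [LEXICON[w] for w in words if w in LEXICON]
        if not scores:
            continue
        # Step 2: N-P polarity of the sentence.
        polarity = sum(scores)
        # Step 3: target identification by simple keyword match.
        for target in targets:
            if target in words:
                # Step 4: collection and aggregation.
                totals[target] = totals.get(target, 0) + polarity
    return totals


review = [
    "The screen is great.",
    "The battery is terrible.",
    "The battery is bad.",
    "It weighs two pounds.",
]
print(analyze(review, ["screen", "battery"]))  # {'screen': 1, 'battery': -2}
```

Summing the per-target values would give a single document-level measure, as the answer to question 69 describes.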

70) Within the context of speech analytics, what does the linguistic approach focus on?
Answer: The linguistic approach focuses on the explicit indications of sentiment and context of
the spoken content within the audio.
Diff: 2 Page Ref: 330


Sharda bia10e tif 06

Data Mining (Taibah University)


Business Intelligence and Analytics: Systems for Decision Support, 10e (Sharda)
Chapter 6 Techniques for Predictive Modeling

1) In the opening vignette, the high accuracy of the models in predicting the outcomes of
complex medical procedures showed that data mining tools are ready to replace experts in the
medical field.
Answer: FALSE
Diff: 2 Page Ref: 246

2) Though useful in business applications, neural networks are a rough, inexact model of how the
brain works, not a precise replica.
Answer: TRUE
Diff: 2 Page Ref: 247

3) The use of hidden layers and new topologies and algorithms renewed waning interest in neural
networks.
Answer: TRUE
Diff: 2 Page Ref: 248

4) Compared to the human brain, artificial neural networks have many more neurons.
Answer: FALSE
Diff: 2 Page Ref: 250

5) In the mining industry case study, the input to the neural network is a verbal description of a
hanging rock on the mine wall.
Answer: FALSE
Diff: 2 Page Ref: 250-251

6) The network topology that allows only one-way links between layers, with no feedback
linkage permitted, is known as backpropagation.
Answer: TRUE
Diff: 1 Page Ref: 251

7) With a neural network, outputs are attributes of the problem while inputs are potential
solutions to the problem.
Answer: FALSE
Diff: 2 Page Ref: 252

8) The most complex problems solved by neural networks require one or more hidden layers for
increased accuracy.
Answer: TRUE
Diff: 1 Page Ref: 254

9) The task undertaken by a neural network does not affect the architecture of the neural
network; in other words, architectures are problem-independent.
Answer: FALSE
Diff: 2 Page Ref: 256

10) Prior to starting the development of a neural network, developers must carry out a
requirements analysis.
Answer: TRUE
Diff: 2 Page Ref: 258

11) No matter the topology or architecture of a neural network, they all use the same algorithm to
adjust weights during training.
Answer: FALSE
Diff: 2 Page Ref: 260

12) Neural networks are called "black boxes" due to the lack of ability to explain their reasoning.
Answer: TRUE
Diff: 2 Page Ref: 262-263

13) Generally speaking, support vector machines are a less accurate prediction method than other
approaches such as decision trees and neural networks.
Answer: FALSE
Diff: 2 Page Ref: 265

14) Unlike other "black box" predictive models, support vector machines have a solid
mathematical foundation in statistics.
Answer: TRUE
Diff: 2 Page Ref: 265

15) In the student retention case study, support vector machines used in prediction had
proportionally more true positives than true negatives.
Answer: TRUE
Diff: 3 Page Ref: 270

16) Using support vector machines, you must normalize the data before you numericize it.
Answer: FALSE
Diff: 2 Page Ref: 273

17) The k-nearest neighbor algorithm is overly complex when compared to artificial neural
networks and support vector machines.
Answer: FALSE
Diff: 2 Page Ref: 275

18) The k-nearest neighbor algorithm appears well-suited to solving image recognition and
categorization problems.
Answer: TRUE
Diff: 2 Page Ref: 278

19) In the Coors case study, a neural network was used to more skillfully identify which beer
flavors could be predicted.
Answer: TRUE
Diff: 2 Page Ref: 285

20) In the Coors case study, genetic algorithms were of little use in solving the flavor prediction
problem.
Answer: FALSE
Diff: 3 Page Ref: 285

21) In the opening vignette, predictive modeling is described as
A) estimating the future using the past.
B) not yet accepted in the business world.
C) the least practiced branch of data mining.
D) unable to handle complex predictive problems.
Answer: A
Diff: 3 Page Ref: 246

22) In the opening vignette, which method was the best in both accuracy of predicted outcomes
and sensitivity?
A) ANN
B) CART
C) C5
D) SVM
Answer: D
Diff: 3 Page Ref: 246

23) Neural networks have been described as "biologically inspired." What does this mean?
A) They are faithful to the entire process of computation in the human brain.
B) They were created to look identical to human brains.
C) They crudely model the biological makeup of the human brain.
D) They have the power to undertake every task the human brain can.
Answer: C
Diff: 2 Page Ref: 247

24) Which element in an artificial neural network roughly corresponds to a synapse in a human
brain?
A) node
B) input
C) output
D) weight
Answer: D
Diff: 2 Page Ref: 250

25) Which element in an artificial neural network roughly corresponds to a dendrite in a human
brain?
A) node
B) input
C) output
D) weight
Answer: B
Diff: 2 Page Ref: 250

26) All the following statements about hidden layers in artificial neural networks are true
EXCEPT
A) hidden layers are not direct inputs or outputs.
B) more hidden layers increase required computation exponentially.
C) many top commercial ANNs forgo hidden layers completely.
D) more hidden layers include many more weights.
Answer: C
Diff: 3 Page Ref: 254

27) In developing an artificial neural network, all of the following are important reasons to pre-
select the network architecture and learning method EXCEPT
A) some configurations have better success than others with specific problems.
B) development personnel may be more experienced with certain architectures.
C) most neural networks need special purpose hardware, which may be absent.
D) some neural network software may not be available in the organization.
Answer: C
Diff: 2 Page Ref: 258

28) Backpropagation learning algorithms for neural networks are
A) the least popular algorithm due to their inaccuracy.
B) used without hidden layers for effectiveness.
C) used without a training set of data.
D) required to have error tolerance set in advance.
Answer: D
Diff: 3 Page Ref: 260

29) Why is sensitivity analysis frequently used for artificial neural networks?
A) because it is required by all major artificial neural networks
B) because some consequences of mistakes by the network might be fatal, so justification may
matter
C) because it is generally informative, although it cannot help to identify cause-and-effect
relationships among variables
D) because it provides a complete description of the inner workings of the artificial neural
network
Answer: B
Diff: 2 Page Ref: 264

30) Support vector machines are a popular machine learning technique primarily because of
A) their relative cost and superior predictive power.
B) their superior predictive power and their theoretical foundation.
C) their relative cost and relative ease of use.
D) their high effectiveness in the very few areas where they can be used.
Answer: B
Diff: 3 Page Ref: 265


31) In the student retention case study, which of the following variables was MOST important in
determining whether a student dropped out of college?
A) high school GPA and SAT high score math
B) college and major
C) completed credit hours and hours enrolled
D) marital status and hours enrolled
Answer: C
Diff: 2 Page Ref: 268-269

32) In the student retention case study, of the four data mining methods used, which was the
most accurate?
A) ANN
B) DT(C5)
C) SVM
D) LR
Answer: C
Diff: 2 Page Ref: 270

33) When using support vector machines, in which stage do you transform the data?
A) preprocessing the data
B) developing the model
C) experimentation
D) deploying the model
Answer: A
Diff: 2 Page Ref: 274

34) When using support vector machines, in which stage do you select the kernel type (e.g.,
RBF, Sigmoid)?
A) preprocessing the data
B) developing the model
C) experimentation
D) deploying the model
Answer: B
Diff: 2 Page Ref: 274

35) For how long do SVM models continue to be accurate and actionable?
A) for as long as the developers stay with the firm
B) for as long as management support continues to exist for the project
C) for as long as you choose to use them
D) for as long as the behavior of the domain stays the same
Answer: D
Diff: 2 Page Ref: 274


36) All of the following are disadvantages/limitations of the SVM technique EXCEPT
A) model building involves complex and time-demanding calculations.
B) selection of the kernel type and kernel function parameters is difficult.
C) they have high algorithmic complexity and extensive memory requirements for complex
tasks.
D) their accuracy is poor in many domains compared to neural networks.
Answer: D
Diff: 3 Page Ref: 275

37) The k-nearest neighbor machine learning algorithm (kNN) is
A) highly mathematical and computationally intensive.
B) a method that has little in common with regression.
C) regarded as a "lazy" learning method.
D) very complex in its inner workings.
Answer: C
Diff: 2 Page Ref: 275

38) Using the k-nearest neighbor machine learning algorithm for classification, larger values of k
A) sharpen the distinction between classes.
B) reduce the effect of noise on the classification.
C) increase the effect of noise on the classification.
D) do not change the effect of noise on the classification.
Answer: B
Diff: 2 Page Ref: 277

39) What is a major drawback to the basic majority voting classification in kNN?
A) It requires frequent human subjective input during computation.
B) Classes that are more clustered tend to dominate prediction.
C) Even the naive version of the algorithm is hard to implement.
D) Classes with more frequent examples tend to dominate prediction.
Answer: D
Diff: 3 Page Ref: 277-278
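
The drawback in question 39 is easy to reproduce on toy data. The sketch below is a minimal one-dimensional kNN with plain majority voting; the data set and the name `knn_predict` are assumptions made for illustration.

```python
from collections import Counter


def knn_predict(train, query, k):
    """Classify a 1-D query by majority vote among its k nearest neighbors."""
    nearest = sorted(train, key=lambda item: abs(item[0] - query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Class "B" has many more training examples than class "A".
train = [
    (1.0, "A"), (1.2, "A"),
    (3.0, "B"), (3.1, "B"), (3.2, "B"), (3.3, "B"), (3.4, "B"),
]

for k in (1, 3, 7):
    print(k, knn_predict(train, 1.5, k))
# 1 A
# 3 A
# 7 B
```

The query point sits next to the "A" examples, yet with k = 7 the more frequent class outvotes them. Weighting votes by distance is one common remedy.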

40) In the Coors case study, why was a genetic algorithm paired with neural networks in the
prediction of beer flavors?
A) to replace the neural network in harder cases
B) to complement the neural network by reducing the error term
C) to enhance the neural network by pre-selecting output classes for the neural network
D) to best model how the flavor of beer evolves as it ages
Answer: B
Diff: 3 Page Ref: 285

41) The opening vignette teaches us that ________ medicine is a relatively new term coined in
the healthcare arena, where the main idea is to dig deep into past experiences to discover new
and useful knowledge to improve medical and managerial procedures in healthcare.
Answer: evidence-based
Diff: 2 Page Ref: 247

42) Neural computing refers to a ________ methodology for machine learning.
Answer: pattern-recognition
Diff: 2 Page Ref: 247

43) A thorough analysis of an early neural network model called the ________, which used no
hidden layer, in addition to a negative evaluation of the research potential by Minsky and Papert
in 1969, led to a diminished interest in neural networks.
Answer: perceptron
Diff: 2 Page Ref: 248
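
A well-known limitation of a perceptron without a hidden layer is that it can only separate classes with a straight line, so it can learn AND but not XOR. The sketch below illustrates this with the classic perceptron update rule; integer weights and a learning rate of 1 are simplifying assumptions, and the function name is hypothetical.

```python
def train_perceptron(samples, epochs=20):
    """Train a two-input perceptron (no hidden layer) with integer updates."""
    w1, w2, bias = 0, 0, 0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            predicted = 1 if w1 * x1 + w2 * x2 + bias > 0 else 0
            error = target - predicted  # -1, 0, or +1
            w1 += error * x1
            w2 += error * x2
            bias += error
    # Return a predictor that uses the final weights.
    return lambda x1, x2: 1 if w1 * x1 + w2 * x2 + bias > 0 else 0


AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

results = {}
for name, data in (("AND", AND), ("XOR", XOR)):
    predict = train_perceptron(data)
    results[name] = sum(predict(*x) == t for x, t in data)
    print(name, results[name], "of 4 correct")
```

AND is learned perfectly, while XOR never reaches 4 of 4 no matter how many epochs are run, because no single line separates its classes.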

44) In a neural network, groups of neurons can be organized in a number of different ways; these
various network patterns are referred to as ________.
Answer: topologies
Diff: 1 Page Ref: 251

45) In a typical network structure of an ANN consisting of three layers (input, intermediate, and
output), the intermediate layer is called the ________ layer.
Answer: hidden
Diff: 2 Page Ref: 252

46) In an ANN, ________ express the relative strength (or mathematical value) of the input data
or the many connections that transfer data from layer to layer.
Answer: connection weights
Diff: 2 Page Ref: 253
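
The role of connection weights in question 46 can be shown with a single artificial neuron: each input is multiplied by its weight, the products are summed, and the sum passes through a transfer function. The sigmoid transfer function, the sample values, and the name `neuron_output` are assumptions chosen for illustration.

```python
import math


def neuron_output(inputs, weights, bias=0.0):
    """Weighted sum of inputs passed through a sigmoid transfer function."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))


# 1.0 * 0.8 + 0.5 * (-1.6) = 0.0, and sigmoid(0.0) = 0.5
print(neuron_output([1.0, 0.5], [0.8, -1.6]))
```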

47) Kohonen's ________ feature maps provide a way to represent multidimensional data in much
lower dimensional spaces, usually one or two dimensions.
Answer: self-organizing
Diff: 2 Page Ref: 255

48) In the power generators case study, data mining-driven software tools, including
data-driven ________ technologies with historical data, helped an energy company reduce
emissions of NOx and CO.
Answer: predictive modeling
Diff: 3 Page Ref: 257

49) The development process for an ANN application involves ________ steps.
Answer: nine
Diff: 1 Page Ref: 258

50) ________ is the most widely used supervised learning algorithm in neural computing.
Answer: Backpropagation
Diff: 2 Page Ref: 260


51) ________ has proved the most popular of the techniques proposed for shedding light into the
"black-box" characterization of trained neural networks.
Answer: Sensitivity analysis
Diff: 2 Page Ref: 263

52) In the formulation of the traffic accident study in the traffic case study, the five-class
prediction problem was decomposed into a number of ________ models in order to obtain the
granularity of information needed.
Answer: binary classification
Diff: 3 Page Ref: 264

53) ________ are of particular interest to modeling highly nonlinear, complex problems,
systems, and processes and use hyperplanes to separate output classes in training data.
Answer: Support vector machines (SVMs)
Diff: 2 Page Ref: 265

54) The student retention case study shows that, given sufficient data with the proper variables,
data mining techniques are capable of predicting freshman student attrition with approximately
________ percent accuracy.
Answer: 80
Diff: 2 Page Ref: 267

55) In the mathematical formulation of SVMs, the normalization and/or scaling are important
steps to guard against variables/attributes with ________ that might otherwise dominate the
classification formulae.
Answer: larger variance
Diff: 3 Page Ref: 270

56) Writing the SVM classification rule in its dual form reveals that classification is only a
function of the ________, i.e., the training data that lie on the margin.
Answer: support vectors
Diff: 3 Page Ref: 271

57) In machine learning, the ________ is a method for converting a linear classifier algorithm
into a nonlinear one by using a nonlinear function to map the original observations into a higher-
dimensional space.
Answer: kernel trick
Diff: 2 Page Ref: 272

58) Due largely to their better classification results, support vector machines (SVMs) have
recently become a popular technique for ________-type problems.
Answer: classification
Diff: 2 Page Ref: 273

59) Historically, the development of ANNs followed a heuristic path, with applications and
extensive experimentation preceding theory. In contrast to ANNs, the development of SVMs
involved sound ________ theory first, then implementation and experiments.
Answer: statistical learning
Diff: 2 Page Ref: 274-275

60) In the process of image recognition (or categorization), images are first transformed into a
multidimensional ________ and then, using machine-learning techniques, are categorized into a
finite number of classes.
Answer: feature space
Diff: 3 Page Ref: 278

61) Predictive modeling is perhaps the most commonly practiced branch in data mining. What
are three of the most popular predictive modeling techniques?
Answer:
• Artificial neural networks
• Support vector machines
• k-nearest neighbor
Diff: 1 Page Ref: 243

62) Why have neural networks shown much promise in many forecasting and business
classification applications?
Answer: Because of their ability to "learn" from the data, their nonparametric nature (i.e., no
rigid assumptions), and their ability to generalize
Diff: 2 Page Ref: 247

63) Each ANN is composed of a collection of neurons that are grouped into layers. One of these
layers is the hidden layer. Define the hidden layer.
Answer: A hidden layer is a layer of neurons that takes input from the previous layer and
converts those inputs into outputs for further processing.
Diff: 2 Page Ref: 252

64) How is a general Hopfield network represented architecturally?
Answer: Architecturally, a general Hopfield network is represented as a single large layer of
neurons with total interconnectivity; that is, each neuron is connected to every other neuron
within the network.
Diff: 3 Page Ref: 256
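
Illustration for question 64: a minimal sketch of a single fully interconnected layer, assuming the common textbook formulation with bipolar (+1/-1) states, a Hebbian-style weight rule, and no self-connections. The function names and the synchronous update are choices made for this example, not details from the source.

```python
def train_hopfield(patterns):
    """Build a weight matrix linking every neuron to every other neuron."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # assumption: no neuron connects to itself
                    w[i][j] += p[i] * p[j]
    return w

def recall(w, state, steps=5):
    """Repeatedly update all neurons until the state stops changing."""
    state = list(state)
    for _ in range(steps):
        new_state = []
        for i, row in enumerate(w):
            total = sum(wij * s for wij, s in zip(row, state))
            # Keep the current value on a tie, otherwise take the sign.
            new_state.append(state[i] if total == 0 else (1 if total > 0 else -1))
        if new_state == state:
            break
        state = new_state
    return state

stored = [1, -1, 1, -1, 1, -1]
weights = train_hopfield([stored])
noisy = [1, -1, 1, -1, 1, 1]  # last value flipped
restored = recall(weights, noisy)
```

Because every neuron receives input from all the others, the five correct values outvote the single corrupted one and the stored pattern is recovered.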

65) Describe the nine steps in the development process for an ANN application.
Answer:
1. Collect, organize, and format the data
2. Separate data into training, validation, and testing sets
3. Decide on a network architecture and structure
4. Select a learning algorithm
5. Set network parameters and initialize their values
6. Initialize weights and start training (and validation)
7. Stop training, freeze the network weights
8. Test the trained network
9. Deploy the network for use on unknown new cases
Diff: 3 Page Ref: 258-259

66) What are the five steps in the backpropagation learning algorithm?
Answer:
1. Initialize weights with random values and set other parameters.
2. Read in the input vector and the desired output.
3. Compute the actual output via the calculations, working forward through the layers.
4. Compute the error.
5. Change the weights by working backward from the output layer through the hidden layers.
Diff: 3 Page Ref: 261
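
Illustration for question 66: the five steps can be sketched for a tiny network with one hidden layer. This is a minimal sketch, assuming sigmoid activations, a squared-error measure, and per-sample weight updates; the layer size, learning rate, epoch count, and the OR training data are all assumptions chosen for the example.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train(samples, n_hidden=3, lr=0.5, epochs=2000, seed=0):
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # Step 1: initialize weights with random values (last weight in each row is a bias).
    w_h = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_o = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    history = []
    for _ in range(epochs):
        total = 0.0
        # Step 2: read in the input vector and the desired output.
        for x, target in samples:
            # Step 3: compute the actual output, working forward through the layers.
            h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_h]
            y = sigmoid(sum(w * v for w, v in zip(w_o, h + [1.0])))
            # Step 4: compute the error.
            err = target - y
            total += err * err
            # Step 5: change the weights, working backward from the output layer.
            delta_o = err * y * (1 - y)
            delta_h = [delta_o * w_o[j] * h[j] * (1 - h[j]) for j in range(n_hidden)]
            for j, v in enumerate(h + [1.0]):
                w_o[j] += lr * delta_o * v
            for j in range(n_hidden):
                for i, v in enumerate(x + [1.0]):
                    w_h[j][i] += lr * delta_h[j] * v
        history.append(total)  # summed squared error for this pass over the data
    return w_h, w_o, history

# Logical OR as a toy training set.
samples = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
w_h, w_o, history = train(samples)
```

Comparing `history[0]` with `history[-1]` shows the summed error shrinking as steps 2 through 5 repeat.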

67) Define the term sensitivity analysis as it relates to ANNs.
Answer: Sensitivity analysis is a method for extracting the cause-and-effect relationships among
the inputs and the outputs of a trained neural network model.
Diff: 2 Page Ref: 263
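
Illustration for question 67: one simple way to probe input-output relationships is to nudge one input at a time and record how much the output moves. This is a sketch under that assumption; the source does not specify a procedure, and the stand-in `model` below is a hypothetical function used in place of a trained network.

```python
def sensitivity(model, baseline, delta=0.1):
    """Return the absolute output change caused by perturbing each input in turn."""
    base_out = model(baseline)
    scores = []
    for i in range(len(baseline)):
        perturbed = list(baseline)
        perturbed[i] += delta
        scores.append(abs(model(perturbed) - base_out))
    return scores

# Hypothetical stand-in for a trained network: input 0 matters more than input 1.
def model(x):
    return 3.0 * x[0] + 0.5 * x[1]

scores = sensitivity(model, [1.0, 1.0])
```

Inputs with larger scores have a stronger effect on the output near the chosen baseline.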

68) In 1992, Boser, Guyon, and Vapnik suggested a way to create nonlinear classifiers by
applying the kernel trick to maximum-margin hyperplanes. How does the resulting algorithm
differ from the original optimal hyperplane algorithm proposed by Vladimir Vapnik in 1963?
Answer: The resulting algorithm is formally similar, except that every dot product is replaced by
a nonlinear kernel function. This allows the algorithm to fit the maximum-margin hyperplane in
the transformed feature space. The transformation may be nonlinear and the transformed space
high dimensional; thus, though the classifier is a hyperplane in the high-dimensional feature
space it may be nonlinear in the original input space.
Diff: 3 Page Ref: 272
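
Illustration for question 68: replacing a dot product with a kernel gives the same number as mapping both points into a higher-dimensional space and taking the dot product there. The sketch below checks this for a degree-2 polynomial kernel on two-dimensional inputs; the kernel choice and the sample points are assumptions for the example.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def poly_kernel(a, b):
    """Degree-2 polynomial kernel computed in the original 2-D input space."""
    return dot(a, b) ** 2

def phi(v):
    """Explicit map from 2-D input space to the matching 3-D feature space."""
    x1, x2 = v
    return [x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2]

a, b = [1.0, 2.0], [3.0, 4.0]
kernel_value = poly_kernel(a, b)      # 121.0, no mapping needed
explicit_value = dot(phi(a), phi(b))  # same value via the mapped vectors
```

The kernel never builds `phi(a)` or `phi(b)`, which is what makes very high-dimensional feature spaces affordable.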

69) What are the three steps in the process-based approach to the use of support vector machines
(SVMs)?
Answer:
1. Numericizing the data
2. Normalizing the data
3. Selecting the kernel type and kernel parameters
Diff: 2 Page Ref: 273
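
Illustration for question 69: a minimal sketch of the three steps, assuming one-hot encoding for numericizing, min-max scaling for normalizing, and a radial basis function (RBF) kernel with a `gamma` parameter as the kernel choice. These specific techniques and the sample values are assumptions; other choices are equally valid.

```python
import math

def one_hot(values):
    """Step 1: turn categorical values into numeric indicator vectors."""
    categories = sorted(set(values))
    return [[1.0 if v == c else 0.0 for c in categories] for v in values]

def min_max(column):
    """Step 2: rescale a numeric column to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

def rbf_kernel(a, b, gamma=1.0):
    """Step 3: an RBF kernel; gamma is the kernel parameter to tune."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * squared_distance)

colors = one_hot(["red", "blue", "red"])   # [[0, 1], [1, 0], [0, 1]] as floats
incomes = min_max([20000, 50000, 80000])   # [0.0, 0.5, 1.0]
rows = [c + [i] for c, i in zip(colors, incomes)]
similarity = rbf_kernel(rows[0], rows[2])
```

Without the scaling step, the income column would swamp the indicator columns in every distance calculation.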

70) Describe the k-nearest neighbor (kNN) data mining algorithm.
Answer: k-NN is a prediction method for classification- as well as regression-type prediction
problems. k-NN is a type of instance-based learning (or lazy learning) where the function is only
approximated locally and all computations are deferred until the actual prediction.
Diff: 2 Page Ref: 275
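
Illustration for question 70: a minimal k-NN classifier sketch, assuming Euclidean distance and majority voting. Nothing is computed until a query arrives, which is the "lazy learning" property described in the answer. The data points and labels are made up for the example.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training cases."""
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

train = [
    ([1, 1], "a"), ([1, 2], "a"), ([2, 1], "a"),
    ([8, 8], "b"), ([9, 8], "b"), ([8, 9], "b"),
]
near_a = knn_predict(train, [2, 2])
near_b = knn_predict(train, [8.5, 8.5])
```

For a regression-type problem, the vote would be replaced by an average of the neighbors' numeric values.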

Business Intelligence and Analytics: Systems for Decision Support, 10e (Sharda)
Chapter 5 Data Mining

1) In the Cabela's case study, the SAS/Teradata solution enabled the direct marketer to better
identify likely customers and market to them based mostly on external data sources.
Answer: FALSE
Diff: 2 Page Ref: 187-188

2) The cost of data storage has plummeted recently, making data mining feasible for more firms.
Answer: TRUE
Diff: 2 Page Ref: 190

3) Data mining can be very useful in detecting patterns such as credit card fraud, but is of little
help in improving sales.
Answer: FALSE
Diff: 2 Page Ref: 190

4) The entire focus of the predictive analytics system in the Infinity P&C case was on detecting
and handling fraudulent claims for the company's benefit.
Answer: FALSE
Diff: 3 Page Ref: 192

5) If using a mining analogy, "knowledge mining" would be a more appropriate term than "data
mining."
Answer: TRUE
Diff: 2 Page Ref: 192

6) Data mining requires specialized data analysts to ask ad hoc questions and obtain answers
quickly from the system.
Answer: FALSE
Diff: 2 Page Ref: 194

7) Ratio data is a type of categorical data.
Answer: FALSE
Diff: 1 Page Ref: 195

8) Interval data is a type of numerical data.
Answer: TRUE
Diff: 1 Page Ref: 195

9) In the Memphis Police Department case study, predictive analytics helped to identify the best
schedule for officers in order to pay the least overtime.
Answer: FALSE
Diff: 1 Page Ref: 196

10) In data mining, classification models help in prediction.
Answer: TRUE
Diff: 2 Page Ref: 199

11) Statistics and data mining both look for data sets that are as large as possible.
Answer: FALSE
Diff: 2 Page Ref: 200

12) Using data mining on data about imports and exports can help to detect tax avoidance and
money laundering.
Answer: TRUE
Diff: 1 Page Ref: 203

13) In the cancer research case study, data mining algorithms that predict cancer survivability
with high predictive power are good replacements for medical professionals.
Answer: FALSE
Diff: 2 Page Ref: 211

14) During classification in data mining, a false positive is an occurrence classified as true by the
algorithm while being false in reality.
Answer: TRUE
Diff: 2 Page Ref: 215

15) When training a data mining model, the testing dataset is always larger than the training
dataset.
Answer: FALSE
Diff: 2 Page Ref: 215

16) When a problem has many attributes that impact the classification of different patterns,
decision trees may be a useful approach.
Answer: TRUE
Diff: 2 Page Ref: 218

17) In the 2degrees case study, the main effectiveness of the new analytics system was in
dissuading potential churners from leaving the company.
Answer: TRUE
Diff: 2 Page Ref: 222

18) Market basket analysis is a useful and entertaining way to explain data mining to a
technologically less savvy audience, but it has little business significance.
Answer: FALSE
Diff: 2 Page Ref: 224

19) The number of users of free/open source data mining software now exceeds that of users of
commercial software versions.
Answer: TRUE
Diff: 1 Page Ref: 229

20) Data that is collected, stored, and analyzed in data mining is often private and personal.
There is no way to maintain individuals' privacy other than being very careful about physical
data security.
Answer: FALSE
Diff: 2 Page Ref: 234

21) In the Cabela's case study, what types of models helped the company understand the value of
customers, using a five-point scale?
A) reporting and association models
B) simulation and geographical models
C) simulation and regression models
D) clustering and association models
Answer: D
Diff: 3 Page Ref: 108

22) Understanding customers better has helped Amazon and others become more successful.
The understanding comes primarily from
A) collecting data about customers and transactions.
B) developing a philosophy that is data analytics-centric.
C) analyzing the vast data amounts routinely collected.
D) asking the customers what they want.
Answer: C
Diff: 3 Page Ref: 190

23) All of the following statements about data mining are true EXCEPT
A) the process aspect means that data mining should be a one-step process to results.
B) the novel aspect means that previously unknown patterns are discovered.
C) the potentially useful aspect means that results should lead to some business benefit.
D) the valid aspect means that the discovered patterns should hold true on new data.
Answer: A
Diff: 3 Page Ref: 193

24) What is the main reason parallel processing is sometimes used for data mining?
A) because the hardware exists in most organizations and it is available to use
B) because most of the algorithms used for data mining require it
C) because of the massive data amounts and search efforts involved
D) because any strategic application requires parallel processing
Answer: C
Diff: 3 Page Ref: 193

25) The data field "ethnic group" can be best described as
A) nominal data.
B) interval data.
C) ordinal data.
D) ratio data.
Answer: A
Diff: 2 Page Ref: 195

26) The data field "salary" can be best described as
A) nominal data.
B) interval data.
C) ordinal data.
D) ratio data.
Answer: D
Diff: 2 Page Ref: 195

27) Which broad area of data mining applications analyzes data, forming rules to distinguish
between defined classes?
A) associations
B) visualization
C) classification
D) clustering
Answer: C
Diff: 2 Page Ref: 199

28) Which broad area of data mining applications partitions a collection of objects into natural
groupings with similar features?
A) associations
B) visualization
C) classification
D) clustering
Answer: D
Diff: 2 Page Ref: 199

29) The data mining algorithm type used for classification somewhat resembling the biological
neural networks in the human brain is
A) association rule mining.
B) cluster analysis.
C) decision trees.
D) artificial neural networks.
Answer: D
Diff: 3 Page Ref: 199

30) Identifying and preventing incorrect claim payments and fraudulent activities falls under
which type of data mining applications?
A) insurance
B) retailing and logistics
C) customer relationship management
D) computer hardware and software
Answer: A
Diff: 2 Page Ref: 202

31) All of the following statements about data mining are true EXCEPT
A) understanding the business goal is critical.
B) understanding the data, e.g., the relevant variables, is critical to success.
C) building the model takes the most time and effort.
D) data is typically preprocessed and/or cleaned before use.
Answer: C
Diff: 3 Page Ref: 205-208

32) Which data mining process/methodology is thought to be the most comprehensive, according
to kdnuggets.com rankings?
A) SEMMA
B) proprietary organizational methodologies
C) KDD Process
D) CRISP-DM
Answer: D
Diff: 2 Page Ref: 213

33) Prediction problems where the variables have numeric values are most accurately defined as
A) classifications.
B) regressions.
C) associations.
D) computations.
Answer: B
Diff: 3 Page Ref: 214

34) What does the robustness of a data mining method refer to?
A) its ability to predict the outcome of a previously unknown data set accurately
B) its speed of computation and computational costs in using the model
C) its ability to construct a prediction model efficiently given a large amount of data
D) its ability to overcome noisy data to make somewhat accurate predictions
Answer: D
Diff: 3 Page Ref: 214

35) What does the scalability of a data mining method refer to?
A) its ability to predict the outcome of a previously unknown data set accurately
B) its speed of computation and computational costs in using the model
C) its ability to construct a prediction model efficiently given a large amount of data
D) its ability to overcome noisy data to make somewhat accurate predictions
Answer: C
Diff: 3 Page Ref: 214

36) In estimating the accuracy of data mining (or other) classification models, the true positive
rate is
A) the ratio of correctly classified positives divided by the total positive count.
B) the ratio of correctly classified negatives divided by the total negative count.
C) the ratio of correctly classified positives divided by the sum of correctly classified positives
and incorrectly classified positives.
D) the ratio of correctly classified positives divided by the sum of correctly classified positives
and incorrectly classified negatives.
Answer: A
Diff: 2 Page Ref: 216
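
Illustration for question 36: the true positive rate divides correctly classified positives by all actual positives, and the true negative rate (option B) does the same for negatives. A minimal sketch, assuming labels are coded 1 for positive and 0 for negative; the sample data is made up.

```python
def true_positive_rate(actual, predicted):
    """Correctly classified positives divided by the total positive count."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    return tp / (tp + fn)

def true_negative_rate(actual, predicted):
    """Correctly classified negatives divided by the total negative count."""
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    return tn / (tn + fp)

actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
tpr = true_positive_rate(actual, predicted)  # 3 of 4 positives found: 0.75
tnr = true_negative_rate(actual, predicted)  # 5 of 6 negatives found
```

The single case predicted as 1 while actually 0 is a false positive, as defined in question 14.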

37) In data mining, finding an affinity of two products to be commonly together in a shopping
cart is known as
A) association rule mining.
B) cluster analysis.
C) decision trees.
D) artificial neural networks.
Answer: A
Diff: 2 Page Ref: 224

38) Third party providers of publicly available datasets protect the anonymity of the individuals
in the data set primarily by
A) asking data users to use the data ethically.
B) leaving in identifiers (e.g., name), but changing other variables.
C) removing identifiers such as names and social security numbers.
D) letting individuals in the data know their data is being accessed.
Answer: C
Diff: 3 Page Ref: 234

39) In the Target case study, why did Target send maternity ads to a teen?
A) Target's analytic model confused her with an older woman with a similar name.
B) Target was sending ads to all women in a particular neighborhood.
C) Target's analytic model suggested she was pregnant based on her buying habits.
D) Target was using a special promotion that targeted all teens in her geographical area.
Answer: C
Diff: 2 Page Ref: 235-236

40) Which of the following is a data mining myth?
A) Data mining is a multistep process that requires deliberate, proactive design and use.
B) Data mining requires a separate, dedicated database.
C) The current state-of-the-art is ready to go for almost any business.
D) Newer Web-based tools enable managers of all educational levels to do data mining.
Answer: B
Diff: 2 Page Ref: 236

41) In the opening vignette, Cabela's uses SAS data mining tools to create ________ models to
optimize customer selection for all customer contacts.
Answer: predictive
Diff: 2 Page Ref: 187

42) There has been an increase in data mining to deal with global competition and customers'
more sophisticated ________ and wants.
Answer: needs
Diff: 2 Page Ref: 190

43) Knowledge extraction, pattern analysis, data archaeology, information harvesting, pattern
searching, and data dredging are all alternative names for ________.
Answer: data mining
Diff: 1 Page Ref: 192

44) Data are often buried deep within very large ________, which sometimes contain data from
several years.
Answer: databases
Diff: 1 Page Ref: 193

45) ________ represent the labels of multiple classes used to divide a variable into specific
groups, examples of which include race, sex, age group, and educational level.
Answer: Categorical data
Diff: 2 Page Ref: 194

46) In the Memphis Police Department case study, shortly after all precincts embraced Blue
CRUSH, ________ became one of the most potent weapons in the Memphis police department's
crime-fighting arsenal.
Answer: predictive analytics
Diff: 2 Page Ref: 196

47) Patterns have been manually ________ from data by humans for centuries, but the increasing
volume of data in modern times has created a need for more automatic approaches.
Answer: extracted
Diff: 2 Page Ref: 197

48) While prediction is largely experience and opinion based, ________ is data and model based.
Answer: forecasting
Diff: 2 Page Ref: 198

49) Whereas ________ starts with a well-defined proposition and hypothesis, data mining starts
with a loosely defined discovery statement.
Answer: statistics
Diff: 2 Page Ref: 200

50) Customer ________ management extends traditional marketing by creating one-on-one
relationships with customers.
Answer: relationship
Diff: 2 Page Ref: 201

51) In the terrorist funding case study, an observed price ________ may be related to income tax
avoidance/evasion, money laundering, or terrorist financing.
Answer: deviation
Diff: 3 Page Ref: 204

52) Data preparation, the third step in the CRISP-DM data mining process, is more commonly
known as ________.
Answer: data preprocessing
Diff: 2 Page Ref: 206

53) The data mining in cancer research case study explains that data mining methods are capable
of extracting patterns and ________ hidden deep in large and complex medical databases.
Answer: relationships
Diff: 3 Page Ref: 210

54) Fayyad et al. (1996) defined ________ in databases as a process of using data mining
methods to find useful information and patterns in the data.
Answer: knowledge discovery
Diff: 2 Page Ref: 213

55) In ________, a classification method, the complete data set is randomly split into mutually
exclusive subsets of approximately equal size and tested multiple times on each left-out subset,
using the others as a training set.
Answer: k-fold cross-validation
Diff: 2 Page Ref: 216
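
Illustration for question 55: a minimal sketch of how the k mutually exclusive subsets can be formed and rotated, assuming records are addressed by index and shuffled with a fixed seed. The function name and the values of `n` and `k` are choices made for this example.

```python
import random

def k_fold_splits(n, k, seed=0):
    """Yield (train_indices, test_indices) pairs, one per left-out fold."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k subsets of near-equal size
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

splits = list(k_fold_splits(n=10, k=5))
```

Each record lands in the test subset exactly once, and the accuracy estimate is typically the average over the k runs.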

56) The basic idea behind a ________ is that it recursively divides a training set until each
division consists entirely or primarily of examples from one class.
Answer: decision tree
Diff: 3 Page Ref: 218
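
Illustration for question 56: a minimal sketch of recursive division, assuming numeric attributes, binary threshold splits, and the Gini index as the split criterion. The source does not name a criterion, so Gini is an assumption, as are the function names and sample data.

```python
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, labels):
    # Stop dividing once the subset contains examples from only one class.
    if len(set(labels)) == 1:
        return labels[0]
    best = None
    for f in range(len(rows[0])):
        for t in sorted(set(r[f] for r in rows)):
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini([labels[i] for i in left]) +
                     len(right) * gini([labels[i] for i in right])) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # identical rows with mixed labels: use the majority class
        return Counter(labels).most_common(1)[0][0]
    _, f, t, left, right = best
    return (f, t,
            build_tree([rows[i] for i in left], [labels[i] for i in left]),
            build_tree([rows[i] for i in right], [labels[i] for i in right]))

def classify(tree, row):
    # Internal nodes are (feature, threshold, left, right); leaves are labels.
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

rows = [[1], [2], [3], [10], [11], [12]]
labels = ["low", "low", "low", "high", "high", "high"]
tree = build_tree(rows, labels)
```

Here a single split at the value 3 already yields two pure subsets, so the recursion stops after one level.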

57) As described in the 2degrees case study, a common problem in the mobile
telecommunications industry is defined by the term ________, which means customers leaving.
Answer: customer churn
Diff: 2 Page Ref: 221

58) Because of its successful application to retail business problems, association rule mining is
commonly called ________.
Answer: market-basket analysis
Diff: 2 Page Ref: 224

59) The ________ is the most commonly used algorithm to discover association rules. Given a
set of itemsets, the algorithm attempts to find subsets that are common to at least a minimum
number of the itemsets.
Answer: Apriori algorithm
Diff: 2 Page Ref: 226
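
Illustration for question 59: a compact sketch of the level-wise idea behind Apriori, assuming support is measured as an absolute transaction count. It finds frequent itemsets only and does not derive rules or confidence values; the shopping baskets are made up.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: count} for itemsets found in at least min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    items = {i for t in transactions for i in t}
    current = [frozenset([i]) for i in items]
    frequent = {}
    k = 1
    while current:
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        survivors = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(survivors)
        k += 1
        # Join surviving itemsets into size-k candidates, then prune any
        # candidate that has an infrequent (k-1)-item subset.
        candidates = {a | b for a in survivors for b in survivors if len(a | b) == k}
        current = [c for c in candidates
                   if all(frozenset(s) in survivors for s in combinations(c, k - 1))]
    return frequent

baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]
frequent = apriori(baskets, min_support=3)
```

With a minimum support of 3, all four single items qualify but only the pair bread and milk appears together often enough.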

60) One way to accomplish privacy and protection of individuals' rights when data mining is by
________ of the customer records prior to applying data mining applications, so that the records
cannot be traced to an individual.
Answer: de-identification
Diff: 2 Page Ref: 234
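
Illustration for question 60: a minimal sketch of de-identification by dropping direct identifiers before analysis. The field names in `IDENTIFIER_FIELDS` and the sample record are hypothetical; a real data set needs its own review of which fields can identify a person.

```python
# Hypothetical list of fields treated as direct identifiers.
IDENTIFIER_FIELDS = {"name", "ssn", "email"}

def de_identify(records, identifier_fields=IDENTIFIER_FIELDS):
    """Return copies of the records with identifying fields removed."""
    return [{k: v for k, v in r.items() if k not in identifier_fields}
            for r in records]

customers = [
    {"name": "A. Example", "ssn": "000-00-0000", "age_group": "young", "spend": 120.5},
]
safe = de_identify(customers)
```

Combinations of the remaining fields can sometimes still point to one person, so dropping direct identifiers is a starting point and not a guarantee.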

61) List five reasons for the growing popularity of data mining in the business world.
Answer:
• More intense competition at the global scale driven by customers' ever-changing needs and
wants in an increasingly saturated marketplace
• General recognition of the untapped value hidden in large data sources
• Consolidation and integration of database records, which enables a single view of customers,
vendors, transactions, etc.
• Consolidation of databases and other data repositories into a single location in the form of a
data warehouse
• The exponential increase in data processing and storage technologies
• Significant reduction in the cost of hardware and software for data storage and processing
• Movement toward the de-massification (conversion of information resources into
nonphysical form) of business practices
Diff: 2 Page Ref: 190

62) What are the differences between nominal, ordinal, interval and ratio data? Give examples.
Answer:
• Nominal data contain measurements of simple codes assigned to objects as labels, which are
not measurements. For example, the variable marital status can be generally categorized as (1)
single, (2) married, and (3) divorced.
• Ordinal data contain codes assigned to objects or events as labels that also represent the
rank order among them. For example, the variable credit score can be generally categorized as
(1) low, (2) medium, or (3) high. Similar ordered relationships can be seen in variables such as
age group (i.e., child, young, middle-aged, elderly) and educational level (i.e., high school,
college, graduate school).
• Interval data are variables that can be measured on interval scales. A common example of
interval scale measurement is temperature on the Celsius scale. In this particular scale, the unit of
measurement is 1/100 of the difference between the melting temperature and the boiling
temperature of water at atmospheric pressure; that is, there is not an absolute zero value.
• Ratio data include measurement variables commonly found in the physical sciences and
engineering. Mass, length, time, plane angle, energy, and electric charge are examples of
physical measures that are ratio scales. Informally, the distinguishing feature of a ratio scale is
the possession of a nonarbitrary zero value. For example, the Kelvin temperature scale has a
nonarbitrary zero point of absolute zero.
Diff: 2 Page Ref: 194-195
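
Illustration for question 62: a short worked example of why ratios are meaningful on a ratio scale but not on an interval scale. It compares 10 and 20 degrees Celsius with the same temperatures in kelvins; the two temperatures are arbitrary example values.

```python
def celsius_to_kelvin(c):
    return c + 273.15

c1, c2 = 10.0, 20.0

# Differences agree on both scales, so interval comparisons are valid.
diff_celsius = c2 - c1
diff_kelvin = celsius_to_kelvin(c2) - celsius_to_kelvin(c1)

# Ratios disagree: 20 C is not "twice as hot" as 10 C.
naive_ratio = c2 / c1                                         # 2.0, misleading
kelvin_ratio = celsius_to_kelvin(c2) / celsius_to_kelvin(c1)  # about 1.035
```

Only the Kelvin ratio has physical meaning, because the Kelvin scale starts from a nonarbitrary zero.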

63) List and briefly describe the six steps of the CRISP-DM data mining process.
Answer:
• Step 1: Business Understanding - The key element of any data mining study is to know
what the study is for. Answering such a question begins with a thorough understanding of the
managerial need for new knowledge and an explicit specification of the business objective
regarding the study to be conducted.
• Step 2: Data Understanding - A data mining study is specific to addressing a well-defined
business task, and different business tasks require different sets of data. Following the business
understanding, the main activity of the data mining process is to identify the relevant data from
many available databases.
• Step 3: Data Preparation - The purpose of data preparation (more commonly called data
preprocessing) is to take the data identified in the previous step and prepare it for analysis by
data mining methods. Compared to the other steps in CRISP-DM, data preprocessing consumes
the most time and effort; most believe that this step accounts for roughly 80 percent of the total
time spent on a data mining project.
• Step 4: Model Building - Here, various modeling techniques are selected and applied to an
already prepared data set in order to address the specific business need. The model-building step
also encompasses the assessment and comparative analysis of the various models built.
• Step 5: Testing and Evaluation - In step 5, the developed models are assessed and
evaluated for their accuracy and generality. This step assesses the degree to which the selected
model (or models) meets the business objectives and, if so, to what extent (i.e., do more models
need to be developed and assessed).
• Step 6: Deployment - Depending on the requirements, the deployment phase can be as
simple as generating a report or as complex as implementing a repeatable data mining process
across the enterprise. In many cases, it is the customer, not the data analyst, who carries out the
deployment steps.
Diff: 2 Page Ref: 205-212

64) Describe the role of the simple split in estimating the accuracy of classification models.
Answer: The simple split (or holdout or test sample estimation) partitions the data into two
mutually exclusive subsets called a training set and a test set (or holdout set). It is common to
designate two-thirds of the data as the training set and the remaining one-third as the test set. The
training set is used by the inducer (model builder), and the built classifier is then tested on the
test set. An exception to this rule occurs when the classifier is an artificial neural network. In this
case, the data is partitioned into three mutually exclusive subsets: training, validation, and
testing.
Diff: 2 Page Ref: 215
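
Illustration for question 64: a minimal sketch of the two-way holdout split and the three-way variant used for neural networks. The two-thirds training share comes from the answer above; the 60/20/20 proportions in the three-way split are an assumption, as are the function names.

```python
import random

def simple_split(records, train_fraction=2 / 3, seed=0):
    """Shuffle, then cut into mutually exclusive training and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def three_way_split(records, fractions=(0.6, 0.2, 0.2), seed=0):
    """Training, validation, and testing sets; the proportions are assumed."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    first = round(len(shuffled) * fractions[0])
    second = first + round(len(shuffled) * fractions[1])
    return shuffled[:first], shuffled[first:second], shuffled[second:]

train_set, test_set = simple_split(range(30))
train3, validation3, test3 = three_way_split(range(30))
```

The classifier is built on `train_set` only and its accuracy is then measured on `test_set`, which it has never seen.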

65) Briefly describe five techniques (or algorithms) that are used for classification modeling.
Answer:
• Decision tree analysis. Decision tree analysis (a machine-learning technique) is arguably the
most popular classification technique in the data mining arena.
• Statistical analysis. Statistical techniques were the primary classification algorithm for many
years until the emergence of machine-learning techniques. Statistical classification techniques
include logistic regression and discriminant analysis.
• Neural networks. These are among the most popular machine-learning techniques that can
be used for classification-type problems.
• Case-based reasoning. This approach uses historical cases to recognize commonalities in
order to assign a new case into the most probable category.
• Bayesian classifiers. This approach uses probability theory to build classification models
based on the past occurrences that are capable of placing a new instance into a most probable
class (or category).
• Genetic algorithms. This approach uses the analogy of natural evolution to build directed-
search-based mechanisms to classify data samples.
• Rough sets. This method takes into account the partial membership of class labels to
predefined categories in building models (collection of rules) for classification problems.
Diff: 2 Page Ref: 218

66) Describe cluster analysis and some of its applications.
Answer: Cluster analysis is an exploratory data analysis tool for solving classification problems.
The objective is to sort cases (e.g., people, things, events) into groups, or clusters, so that the
degree of association is strong among members of the same cluster and weak among members of
different clusters. Cluster analysis is an essential data mining method for classifying items,
events, or concepts into common groupings called clusters. The method is commonly used in
biology, medicine, genetics, social network analysis, anthropology, archaeology, astronomy,
character recognition, and even in MIS development. As data mining has increased in popularity,
the underlying techniques have been applied to business, especially to marketing. Cluster
analysis has been used extensively for fraud detection (both credit card and e-commerce fraud)
and market segmentation of customers in contemporary CRM systems.
Diff: 2 Page Ref: 220

67) In the data mining in Hollywood case study, how successful were the models in predicting
the success or failure of a Hollywood movie?
Answer: The researchers claim that these prediction results are better than any reported in the
published literature for this problem domain. Fusion classification methods attained up to
56.07% accuracy in correctly classifying movies and 90.75% accuracy in classifying movies
within one category of their actual category. The SVM classification method attained up to
55.49% accuracy in correctly classifying movies and 85.55% accuracy in classifying movies
within one category of their actual category.
Diff: 3 Page Ref: 233-234

68) In lessons learned from the Target case, what legal warnings would you give another retailer
using data mining for marketing?
Answer: If you look at this practice from a legal perspective, you would conclude that Target
did not use any information that violates customer privacy; rather, they used transactional data
that most every other retail chain is collecting and storing (and perhaps analyzing) about their
customers. What was disturbing in this scenario was perhaps the targeted concept: pregnancy.
There are certain events or concepts that should be off limits or treated extremely cautiously,
such as terminal disease, divorce, and bankruptcy.
Diff: 2 Page Ref: 236

69) List four myths associated with data mining.
Answer:
• Data mining provides instant, crystal-ball-like predictions.
• Data mining is not yet viable for business applications.
• Data mining requires a separate, dedicated database.
• Only those with advanced degrees can do data mining.
• Data mining is only for large firms that have lots of customer data.
Diff: 2 Page Ref: 236

70) List six common data mining mistakes.
Answer:
• Selecting the wrong problem for data mining
• Ignoring what your sponsor thinks data mining is and what it really can and cannot do
• Leaving insufficient time for data preparation
• Looking only at aggregated results and not at individual records
• Being sloppy about keeping track of the data mining procedure and results
• Ignoring suspicious findings and quickly moving on
• Running mining algorithms repeatedly and blindly
• Believing everything you are told about the data
• Believing everything you are told about your own data mining analysis
• Measuring your results differently from the way your sponsor measures them
Diff: 2 Page Ref: 236-237

13
Copyright © 2015 Pearson Education, Inc.

Downloaded by ree mur ([email protected])


lOMoARcPSD|7465459

Sharda bia10e tif 04

Data Mining (Taibah University)


Business Intelligence and Analytics: Systems for Decision Support, 10e (Sharda)
Chapter 4 Business Reporting, Visual Analytics, and Business Performance Management

1) The WebFOCUS BI platform in the Travel and Transport case study decreased clients'
reliance on the IT function when seeking system reports.
Answer: TRUE
Diff: 1 Page Ref: 137

2) The dashboard for the WebFOCUS BI platform in the Travel and Transport case study
required client side software to operate.
Answer: FALSE
Diff: 2 Page Ref: 138

3) Data is the contextualization of information, that is, information set in context.
Answer: FALSE
Diff: 1 Page Ref: 139

4) The main difference between service level agreements and key performance indicators is the
audience.
Answer: TRUE
Diff: 2 Page Ref: 143

5) The balanced scorecard is a type of report that is based solely on financial metrics.
Answer: FALSE
Diff: 1 Page Ref: 143

6) The data storage component of a business reporting system builds the various reports and
hosts them for, or disseminates them to users. It also provides notification, annotation,
collaboration, and other services.
Answer: FALSE
Diff: 2 Page Ref: 144

7) In the FEMA case study, the BureauNet software was the primary reason behind the increased
speed and relevance of the reports FEMA employees received.
Answer: TRUE
Diff: 2 Page Ref: 145

8) Google Maps has set new standards for data visualization with its intuitive Web mapping
software.
Answer: TRUE
Diff: 2 Page Ref: 149

9) There are basic chart types and specialized chart types. A Gantt chart is a specialized chart
type.
Answer: TRUE
Diff: 2 Page Ref: 152

10) Visualization differs from traditional charts and graphs in complexity of data sets and use of
multiple dimensions and measures.
Answer: TRUE
Diff: 2 Page Ref: 155-156

11) When telling a story during a presentation, it is best to avoid describing hurdles that your
character must overcome, to avoid souring the mood.
Answer: FALSE
Diff: 2 Page Ref: 157

12) For best results when deploying visual analytics environments, focus only on power users
and management to get the best return on your investment.
Answer: FALSE
Diff: 2 Page Ref: 158

13) Information density is a key characteristic of performance dashboards.
Answer: TRUE
Diff: 2 Page Ref: 160

14) In the Dallas Cowboys case study, the focus was on using data analytics to decide which
players would play every week.
Answer: FALSE
Diff: 2 Page Ref: 161

15) One comparison typically made when data is presented in business intelligence systems is a
comparison against historical values.
Answer: TRUE
Diff: 1 Page Ref: 162

16) The best key performance indicators are derived independently from the company's strategic
goals to enable developers to "think outside of the box."
Answer: FALSE
Diff: 3 Page Ref: 166

17) The BPM development cycle is essentially a one-shot process where the requirement is to get
it right the first time.
Answer: FALSE
Diff: 2 Page Ref: 167

18) With key performance indicators, driver KPIs have a significant effect on outcome KPIs, but
the reverse is not necessarily true.
Answer: TRUE
Diff: 2 Page Ref: 171

19) With the balanced scorecard approach, the entire focus is on measuring and managing
specific financial goals based on the organization's strategy.
Answer: FALSE
Diff: 2 Page Ref: 175

20) A Six Sigma deployment can be deemed effective even if the number of defects is not
reduced to 3.4 defects per million.
Answer: FALSE
Diff: 2 Page Ref: 176

21) For those executives who do not have the time to go through lengthy reports, the best
alternative is the
A) last page of the report.
B) raw data that informed the report.
C) executive summary.
D) charts in the report.
Answer: C
Diff: 2 Page Ref: 140

22) All of the following are true about external reports between businesses and the government
EXCEPT
A) they can include tax and compliance reporting.
B) they can be filed nationally or internationally.
C) they are standardized for the most part to reduce the regulatory burden.
D) their primary focus is government.
Answer: D
Diff: 2 Page Ref: 140

23) Kaplan and Norton developed a report that presents an integrated view of success in the
organization called
A) metric management reports.
B) balanced scorecard-type reports.
C) dashboard-type reports.
D) visual reports.
Answer: B
Diff: 2 Page Ref: 143

24) Which component of a reporting system contains steps detailing how recorded transactions
are converted into metrics, scorecards, and dashboards?
A) data supply
B) business logic
C) extract, transform and load
D) assurance
Answer: B
Diff: 2 Page Ref: 144

25) Which of the following is LEAST related to data/information visualization?
A) information graphics
B) scientific visualization
C) statistical graphics
D) graphic artwork
Answer: D
Diff: 2 Page Ref: 145

26) The Internet emerged as a new medium for visualization and brought all the following
EXCEPT
A) worldwide digital distribution of visualization.
B) immersive environments for consuming data.
C) new forms of computation of business logic.
D) new graphics displays through PC displays.
Answer: C
Diff: 2 Page Ref: 149

27) Which kind of chart is described as an enhanced variant of a scatter plot?
A) heat map
B) bullet
C) pie chart
D) bubble chart
Answer: D
Diff: 3 Page Ref: 151

28) Which type of visualization tool can be very helpful when the intention is to show relative
proportions of dollars per department allocated by a university administration?
A) heat map
B) bullet
C) pie chart
D) bubble chart
Answer: C
Diff: 3 Page Ref: 151

29) Which type of visualization tool can be very helpful when a data set contains location data?
A) bar chart
B) geographic map
C) highlight table
D) tree map
Answer: B
Diff: 2 Page Ref: 152

30) Which type of question does visual analytics seek to answer?
A) Why did it happen?
B) What happened yesterday?
C) What is happening today?
D) When did it happen?
Answer: A
Diff: 2 Page Ref: 156

31) When you tell a story in a presentation, all of the following are true EXCEPT
A) a story should make sense and order out of a lot of background noise.
B) a well-told story should have no need for subsequent discussion.
C) stories and their lessons should be easy to remember.
D) the outcome and reasons for it should be clear at the end of your story.
Answer: B
Diff: 2 Page Ref: 157

32) Benefits of the latest visual analytics tools, such as SAS Visual Analytics, include all of the
following EXCEPT
A) mobile platforms such as the iPhone are supported by these products.
B) it is easier to spot useful patterns and trends in the data.
C) they explore massive amounts of data in hours, not days.
D) there is less demand on IT departments for reports.
Answer: C
Diff: 2 Page Ref: 158

33) What is the management feature of a dashboard?
A) operational data that identify what actions to take to resolve a problem
B) summarized dimensional data to analyze the root cause of problems
C) summarized dimensional data to monitor key performance metrics
D) graphical, abstracted data to monitor key performance metrics
Answer: A
Diff: 3 Page Ref: 162

34) What is the fundamental challenge of dashboard design?
A) ensuring that users across the organization have access to it
B) ensuring that the organization has the appropriate hardware onsite to support it
C) ensuring that the organization has access to the latest web browsers
D) ensuring that the required information is shown clearly on a single screen
Answer: D
Diff: 3 Page Ref: 162

35) Contextual metadata for a dashboard includes all the following EXCEPT
A) whether any high-value transactions that would skew the overall trends were rejected as a part
of the loading process.
B) which operating system is running the dashboard server software.
C) whether the dashboard is presenting "fresh" or "stale" information.
D) when the data warehouse was last refreshed.
Answer: B
Diff: 2 Page Ref: 165

36) Dashboards can be presented at all the following levels EXCEPT
A) the visual dashboard level.
B) the static report level.
C) the visual cube level.
D) the self-service cube level.
Answer: C
Diff: 2 Page Ref: 166

37) Why is a performance management system superior to a performance measurement system?
A) because performance measurement systems are only in their infancy
B) because measurement automatically leads to problem solution
C) because performance management systems cost more
D) because measurement alone has little use without action
Answer: D
Diff: 3 Page Ref: 172

38) Why is the customer perspective important in the balanced scorecard methodology?
A) because dissatisfied customers will eventually hurt the bottom line
B) because customers should always be included in any design methodology
C) because customers understand best how the firm's internal processes should work
D) because companies need customer input into the design of the balanced scorecard
Answer: A
Diff: 2 Page Ref: 173

39) All of the following statements about balanced scorecards and dashboards are true EXCEPT
A) scorecards are less preferred at operational and tactical levels.
B) dashboards would be the preferred choice to monitor production quality.
C) scorecards are best for real-time tracking of a marketing campaign.
D) scorecards are preferred for tracking the achievement of strategic goals.
Answer: C
Diff: 3 Page Ref: 175

40) What is Six Sigma?
A) a letter in the Greek alphabet that statisticians use to measure process variability
B) a methodology aimed at reducing the number of defects in a business process
C) a methodology aimed at reducing the amount of variability in a business process
D) a methodology aimed at measuring the amount of variability in a business process
Answer: B
Diff: 2 Page Ref: 176

41) A(n) ________ is a communication artifact, concerning business matters, prepared with the
specific intention of relaying information in a presentable form.
Answer: business report
Diff: 2 Page Ref: 135

42) Travel and Transport created an online BI self-service system that allowed ________ to
access information directly.
Answer: clients
Diff: 2 Page Ref: 137

43) There are only a few categories of business report: informal, ________, and short.
Answer: formal
Diff: 2 Page Ref: 140

44) In the Delta Lloyd Group case study, the ________ is the stage of the reporting process in
which consolidated figures are cited, formatted, and described to form the final text of the report.
Answer: last mile
Diff: 2 Page Ref: 142

45) ________ management reports are used to manage business performance through outcome-
oriented metrics in many organizations.
Answer: Metric
Diff: 2 Page Ref: 143

46) In the Blastrac case study, Tableau analytics software was used to replace massive ________
that were loaded with data from multiple ERP systems.
Answer: spreadsheets
Diff: 2 Page Ref: 146

47) ________ charts are useful in displaying nominal data or numerical data that splits nicely
into different categories so you can quickly see comparative results and trends.
Answer: Bar
Diff: 1 Page Ref: 151

48) ________ charts or network diagrams show precedence relationships among the project
activities/tasks.
Answer: PERT
Diff: 1 Page Ref: 152

49) ________ are typically used together with other charts and graphs, as opposed to by
themselves, and show postal codes, country names, etc.
Answer: Maps
Diff: 1 Page Ref: 152

50) Typical charts, graphs, and other visual elements used in visualization-based applications
usually involve ________ dimensions.
Answer: two
Diff: 2 Page Ref: 155

51) Visual analytics is widely regarded as the combination of visualization and ________
analytics.
Answer: predictive
Diff: 2 Page Ref: 156

52) Dashboards present visual displays of important information that are consolidated and
arranged on a single ________.
Answer: screen
Diff: 1 Page Ref: 160

53) With dashboards, the layer of information that uses graphical, abstracted data to keep tabs on
key performance metrics is the ________ layer.
Answer: monitoring
Diff: 2 Page Ref: 162

54) In the Saudi Telecom company case study, information ________ software allowed
managers to see trends and correct issues before they became problems.
Answer: visualization
Diff: 1 Page Ref: 163

55) Performance dashboards enable ________ operations that allow the users to view underlying
data sources and obtain more detail.
Answer: drill-down/drill-through
Diff: 2 Page Ref: 164

56) With a dashboard, information on sources of the data being presented, the quality and
currency of underlying data provide contextual ________ for users.
Answer: metadata
Diff: 2 Page Ref: 165

57) Business performance management comprises a ________ set of processes that link strategy
to execution with the goal of optimizing business performance.
Answer: closed-loop
Diff: 2 Page Ref: 167

58) In the Mace case study, the IBM Cognos software enabled the rapid creation of integrated
reports across 60 countries, replacing a large and complex ________.
Answer: spreadsheet
Diff: 1 Page Ref: 169

59) A strategically aligned metric is also known as a key ________.
Answer: performance indicator
Diff: 2 Page Ref: 171

60) The ________ perspective of the organization suggested by the balanced scorecard focuses
on business processes and how well they are running.
Answer: internal business process
Diff: 2 Page Ref: 174

61) List and describe the three major categories of business reports.
Answer:
• Metric management reports. Many organizations manage business performance through
outcome-oriented metrics. For external groups, these are service-level agreements (SLAs). For
internal management, they are key performance indicators (KPIs).
• Dashboard-type reports. This report presents a range of different performance indicators on
one page, like a dashboard in a car. Typically, there is a set of predefined reports with static
elements and fixed structure, but customization of the dashboard is allowed through widgets,
views, and set targets for various metrics.
• Balanced scorecard-type reports. This is a method developed by Kaplan and Norton that
attempts to present an integrated view of success in an organization. In addition to financial
performance, balanced scorecard-type reports also include customer, business process, and
learning and growth perspectives.
Diff: 2 Page Ref: 143

62) List five types of specialized charts and graphs.
Answer:
• Histograms
• Gantt charts
• PERT charts
• Geographic maps
• Bullets
• Heat maps
• Highlight tables
• Tree maps
Diff: 2 Page Ref: 151-152

63) According to Eckerson (2006), a well-known expert on BI dashboards, what are the three
layers of information of a dashboard?
Answer:
1. Monitoring. Graphical, abstracted data to monitor key performance metrics.
2. Analysis. Summarized dimensional data to analyze the root cause of problems.
3. Management. Detailed operational data that identify what actions to take to resolve a
problem.
Diff: 2 Page Ref: 162

64) List five best practices of dashboard design.
Answer:
• Benchmark key performance indicators with industry standards
• Wrap the dashboard metrics with contextual metadata
• Validate the dashboard design by a usability specialist
• Prioritize and rank alerts/exceptions streamed to the dashboard
• Enrich the dashboard with business users' comments
• Present information in three different levels
• Pick the right visual construct using dashboard design principles
• Provide for guided analytics
Diff: 2 Page Ref: 165-166

65) What are the four processes that define a closed-loop BPM cycle?
Answer:
1. Strategize: This is the process of identifying and stating the organization's mission, vision,
and objectives, and developing plans (at different levels of granularity: strategic, tactical, and
operational) to achieve these objectives.
2. Plan: When operational managers know and understand the what (i.e., the organizational
objectives and goals), they will be able to come up with the how (i.e., detailed operational and
financial plans). Operational and financial plans answer two questions: What tactics and
initiatives will be pursued to meet the performance targets established by the strategic plan?
What are the expected financial results of executing the tactics?
3. Monitor/Analyze: When the operational and financial plans are underway, it is imperative
that the performance of the organization be monitored. A comprehensive framework for
monitoring performance should address two key issues: what to monitor and how to monitor.
4. Act and Adjust: What do we need to do differently? Whether a company is interested in
growing its business or simply improving its operations, virtually all strategies depend on new
projects: creating new products, entering new markets, acquiring new customers or businesses, or
streamlining some processes.
The final part of this loop is taking action and adjusting current actions based on analysis of
problems and opportunities.
Diff: 2 Page Ref: 167-168

66) List and describe five distinguishing features of key performance indicators.
Answer:
• Strategy. KPIs embody a strategic objective.
• Targets. KPIs measure performance against specific targets. Targets are defined in strategy,
planning, or budgeting sessions and can take different forms (e.g., achievement targets, reduction
targets, absolute targets).
• Ranges. Targets have performance ranges (e.g., above, on, or below target).
• Encodings. Ranges are encoded in software, enabling the visual display of performance (e.g.,
green, yellow, red). Encodings can be based on percentages or more complex rules.
• Time frames. Targets are assigned time frames by which they must be accomplished. A time
frame is often divided into smaller intervals to provide performance mileposts.
• Benchmarks. Targets are measured against a baseline or benchmark. The previous year's
results often serve as a benchmark, but arbitrary numbers or external benchmarks may also be
used.
Diff: 2 Page Ref: 171
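
Illustration (not part of the answer key): the targets, ranges, and encodings described above can be sketched in a few lines of Python. The class name `Kpi`, the `tolerance` parameter, and the specific rule (at or better than target is green, within a tolerance band is yellow, otherwise red) are assumptions made for this sketch; the source only says that ranges are encoded, for example as green, yellow, red, based on percentages or more complex rules.

```python
from dataclasses import dataclass


@dataclass
class Kpi:
    """Hypothetical KPI with a target and a percentage-based encoding rule."""
    name: str
    target: float
    tolerance: float              # assumed: fraction of target counted as "near target"
    higher_is_better: bool = True

    def status(self, actual: float) -> str:
        """Encode performance against the target as green, yellow, or red."""
        band = abs(self.target) * self.tolerance
        # Positive gap means the actual value is on the good side of the target.
        if self.higher_is_better:
            gap = actual - self.target
        else:
            gap = self.target - actual
        if gap >= 0:
            return "green"        # on or better than target
        if gap >= -band:
            return "yellow"       # slightly off target, inside the tolerance band
        return "red"              # clearly off target


on_time = Kpi("on-time delivery (%)", target=95.0, tolerance=0.05)
print(on_time.status(96.0), on_time.status(92.0), on_time.status(80.0))
```

A reduction target (where lower values are better) can be modeled by setting `higher_is_better=False`.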

67) What are the three nonfinancial objectives of the balanced scorecard?
Answer:
1. Customer. This defines how the organization should appear to its customers if it is to
accomplish its vision.
2. Internal business process. This specifies the processes the organization must excel at in
order to satisfy its shareholders and customers.
3. Learning and growth. This indicates how an organization can improve its ability to change
and improve in order to achieve its vision.
Diff: 2 Page Ref: 174

68) Six Sigma rests on a simple performance improvement model known as DMAIC. What are
the steps involved?
Answer:
1. Define. Define the goals, objectives, and boundaries of the improvement activity. At the top
level, the goals are the strategic objectives of the company. At lower levels (department or
project levels), the goals are focused on specific operational processes.
2. Measure. Measure the existing system. Establish quantitative measures that will yield
statistically valid data. The data can be used to monitor progress toward the goals defined in the
previous step.
3. Analyze. Analyze the system to identify ways to eliminate the gap between the current
performance of the system or process and the desired goal.
4. Improve. Initiate actions to eliminate the gap by finding ways to do things better, cheaper, or
faster. Use project management and other planning tools to implement the new approach.
5. Control. Institutionalize the improved system by modifying compensation and incentive
systems, policies, procedures, manufacturing resource planning, budgets, operation instructions,
or other management systems.
Diff: 2 Page Ref: 176
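
Illustration (not part of the answer key): the Measure step calls for quantitative measures. One commonly used measure is defects per million opportunities (DPMO), which can be compared with the 3.4 defects per million figure cited in question 20. The function names and the sample numbers below are assumptions chosen for this sketch, not values from the textbook.

```python
SIX_SIGMA_DPMO = 3.4  # defects per million figure cited in question 20


def dpmo(defects: int, units: int, opportunities_per_unit: int) -> float:
    """Defects per million opportunities for an inspected sample."""
    if units <= 0 or opportunities_per_unit <= 0:
        raise ValueError("units and opportunities_per_unit must be positive")
    if defects < 0:
        raise ValueError("defects cannot be negative")
    total_opportunities = units * opportunities_per_unit
    return defects * 1_000_000 / total_opportunities


def meets_six_sigma(defects: int, units: int, opportunities_per_unit: int) -> bool:
    """True when the measured DPMO is at or below the Six Sigma figure."""
    return dpmo(defects, units, opportunities_per_unit) <= SIX_SIGMA_DPMO


# Hypothetical sample: 17 defects found in 1,000 units with 5 opportunities each.
print(dpmo(17, 1000, 5))
print(meets_six_sigma(17, 1000, 5))
```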

69) What are the basic ingredients of a good collection of performance measures?
Answer:
• Measures should focus on key factors.
• Measures should be a mix of past, present, and future.
• Measures should balance the needs of shareholders, employees, partners, suppliers, and other
stakeholders.
• Measures should start at the top and flow down to the bottom.
• Measures need to have targets that are based on research and reality rather than be arbitrary.
Diff: 2 Page Ref: 177

70) In the Expedia case study, what three steps were taken to convert drivers of departmental
performance into a scorecard?
Answer:
• Deciding how to measure satisfaction. This required the group to determine which measures
in the 20 databases would be useful for demonstrating a customer's level of satisfaction. This
became the basis for the scorecards and KPIs.
• Setting the right performance targets. This required the group to determine whether KPI
targets had short-term or long-term payoffs.
• Putting data into context. The group had to tie the data to ongoing customer satisfaction
projects.
Diff: 2 Page Ref: 178


Sharda bia10e tif 01


Business Intelligence and Analytics: Systems for Decision Support, 10e (Sharda)
Chapter 1 An Overview of Business Intelligence, Analytics, and Decision Support

1) Computerized support is only used for organizational decisions that are responses to external
pressures, not for taking advantage of opportunities.
Answer: FALSE
Diff: 2 Page Ref: 5

2) The complexity of today's business environment creates many new challenges for
organizations, such as global competition, but creates few new opportunities in return.
Answer: FALSE
Diff: 2 Page Ref: 6

3) In addition to deploying business intelligence (BI) systems, companies may also perform other
actions to counter business pressures, such as improving customer service and entering business
alliances.
Answer: TRUE
Diff: 1 Page Ref: 6-7

4) The overwhelming majority of competitive actions taken by businesses today feature
computerized information system support.
Answer: TRUE
Diff: 1 Page Ref: 7

5) PCs and, increasingly, mobile devices are the most common means of providing managers
with information to directly support decision making, instead of using IT staff intermediaries.
Answer: TRUE
Diff: 2 Page Ref: 7

6) In today's business environment, creativity, intuition, and interpersonal skills are effective
substitutes for analytical decision making.
Answer: FALSE
Diff: 2 Page Ref: 8

7) In a four-step process for decision making, managers construct a model of the problem before
they evaluate potential solutions.
Answer: TRUE
Diff: 2 Page Ref: 9

8) Due to the fact that business environments are now more complex than ever, trial-and-error is
an effective means of arriving at acceptable solutions.
Answer: FALSE
Diff: 2 Page Ref: 9

9) Group collaboration software has proved generally ineffective at improving decision-making.
Answer: FALSE
Diff: 3 Page Ref: 9-10

10) Due to the fact that organizations seek to store greater amounts of data than ever before, the
cost per byte of computer-based data storage devices is rapidly rising.
Answer: FALSE
Diff: 3 Page Ref: 10

11) Computerized information systems help decision makers overcome human cognitive
limitations in assembling and processing varied information. However, this is of little use in most
analytical applications.
Answer: FALSE
Diff: 3 Page Ref: 10

12) In the Gorry and Scott-Morton framework of structured, semi-structured, and unstructured
decisions, computerized decision support can bring benefits to unstructured decisions.
Answer: TRUE
Diff: 2 Page Ref: 12

13) The term decision support system is a very specific term that implies the same tool, system,
and development approach to most developers.
Answer: FALSE
Diff: 3 Page Ref: 13

14) The access to data and ability to manipulate data (frequently including real-time data) are
key elements of business intelligence (BI) systems.
Answer: TRUE
Diff: 2 Page Ref: 14

15) One of the four components of BI systems, business performance management, is a
collection of source data in the data warehouse.
Answer: FALSE
Diff: 3 Page Ref: 15

16) Actionable intelligence is the primary goal of modern-day Business Intelligence (BI) systems
vs. historical reporting that characterized Management Information Systems (MIS).
Answer: TRUE
Diff: 3 Page Ref: 17

17) The use of dashboards and data visualizations is seldom effective in finding efficiencies in
organizations, as demonstrated by the Seattle Children's Hospital Case Study.
Answer: FALSE
Diff: 2 Page Ref: 21

18) The use of statistics in baseball by the Oakland Athletics, as described in the Moneyball case
study, is an example of the effectiveness of prescriptive analytics.
Answer: TRUE
Diff: 2 Page Ref: 23

19) Pushing programming out to distributed data is achieved solely by using the Hadoop
Distributed File System or HDFS.
Answer: FALSE
Diff: 2 Page Ref: 28

20) Volume, velocity, and variety of data characterize the Big Data paradigm.
Answer: TRUE
Diff: 2 Page Ref: 28

21) In the Magpie Sensing case study, the automated collection of temperature and humidity data
on shipped goods helped with various types of analytics. Which of the following is an example
of prescriptive analytics?
A) real time reports of the shipment's temperature
B) warning of an open shipment seal
C) location of the shipment
D) optimal temperature setting
Answer: D
Diff: 3 Page Ref: 4

22) In the Magpie Sensing case study, the automated collection of temperature and humidity data
on shipped goods helped with various types of analytics. Which of the following is an example
of predictive analytics?
A) real time reports of the shipment's temperature
B) warning of an open shipment seal
C) location of the shipment
D) optimal temperature setting
Answer: B
Diff: 3 Page Ref: 4

23) Which of the following is NOT an example that falls within the four major categories of
business environment factors for today's organizations?
A) globalization
B) increased pool of customers
C) fewer government regulations
D) increased competition
Answer: C
Diff: 2 Page Ref: 5-6

24) Organizations counter the pressures they experience in their business environments in
multiple ways. Which of the following is NOT an effective way to counter these pressures?
A) reactive actions
B) anticipative actions
C) adaptive actions
D) retroactive actions
Answer: D
Diff: 2 Page Ref: 6

25) Which of the following activities permeates nearly all managerial activity?
A) planning
B) controlling
C) directing
D) decision-making
Answer: D
Diff: 2 Page Ref: 7

26) Why are analytical decision making skills now viewed as more important than interpersonal
skills for an organization's managers?
A) because interpersonal skills are never important in organizations
B) because personable and friendly managers are always the least effective
C) because analytical-oriented managers produce better results over time
D) because analytical-oriented managers tend to be flashier and less methodical
Answer: C
Diff: 3 Page Ref: 8

27) Business environments and government requirements are becoming more complex. All of
the following actions to manage this complexity would be appropriate EXCEPT
A) hiring more sophisticated and computer-savvy managers.
B) deploying more sophisticated tools and techniques.
C) seeking new ways to avoid government compliance.
D) avoiding expensive trial and error to find out what works.
Answer: C
Diff: 2 Page Ref: 9

28) The deployment of large data warehouses with terabytes or even petabytes of data has been
crucial to the growth of decision support. All the following explain why EXCEPT
A) data warehouses have enabled the affordable collection of data for analytics.
B) data warehouses have enabled the collection of decision makers in one place.
C) data warehouses have assisted the collection of data for data mining.
D) data warehouses have assisted the collection of data from multiple sources.
Answer: B
Diff: 2 Page Ref: 10

29) Which of the following statements about cognitive limits of organizational decision makers
is true?
A) Only top managers make decisions where cognitive limits are strained.
B) The most talented and effective managers do not have cognitive limitations.
C) All organizational decision-making requires data beyond human cognitive limits.
D) Cognitive limits affect both the recall and use of data by decision makers.
Answer: D
Diff: 2 Page Ref: 10

30) For the majority of organizations, evaluating the credit rating of a potential business partner
is a(n)
A) strategic decision.
B) structured decision.
C) unstructured decision.
D) managerial control decision.
Answer: D
Diff: 2 Page Ref: 11

31) For the majority of organizations, a daily accounts receivable transaction is a(n)
A) strategic decision.
B) structured decision.
C) unstructured decision.
D) managerial control decision.
Answer: B
Diff: 2 Page Ref: 11

32) All of the following may be viewed as decision support systems EXCEPT
A) an expert system to diagnose a medical condition.
B) a knowledge management system to guide decision makers.
C) a system that helps to manage the organization's supply chain.
D) a retail sales system that processes customer sales transactions.
Answer: D
Diff: 2 Page Ref: 13

33) Business intelligence (BI) can be characterized as a transformation of
A) data to information to decisions to actions.
B) Big Data to data to information to decisions.
C) actions to decisions to feedback to information.
D) data to processing to information to actions.
Answer: A
Diff: 3 Page Ref: 14

34) In answering the question "Which customers are most likely to click on my online ads and
purchase my goods?", you are most likely to use which of the following analytic applications?
A) customer profitability
B) propensity to buy
C) customer attrition
D) channel optimization
Answer: B
Diff: 3 Page Ref: 17

35) In answering the question "Which customers are likely to be using fake credit cards?", you
are most likely to use which of the following analytic applications?
A) channel optimization
B) customer segmentation
C) fraud detection
D) customer profitability
Answer: C
Diff: 3 Page Ref: 17

36) When Sabre developed their Enterprise Data Warehouse, they chose to use near-real-time
updating of their database. The main reason they did so was
A) to provide a 360 degree view of the organization.
B) to aggregate performance metrics in an understandable way.
C) to be able to assess internal operations.
D) to provide up-to-date executive insights.
Answer: D
Diff: 3 Page Ref: 17

37) How are descriptive analytics methods different from the other two types?
A) They answer "what-if?" queries, not "how many?" queries.
B) They answer "what-is?" queries, not "what will be?" queries.
C) They answer "what to do?" queries, not "what-if?" queries.
D) They answer "what will be?" queries, not "what to do?" queries.
Answer: B
Diff: 3 Page Ref: 20-24

38) Prescriptive BI capabilities are viewed as more powerful than predictive ones for all the
following reasons EXCEPT
A) prescriptive BI gives actual guidance as to actions.
B) understanding the likelihood of certain events often leaves unclear remedies.
C) only prescriptive BI capabilities have monetary value to top-level managers.
D) prescriptive models generally build on (with some overlap) predictive ones.
Answer: C
Diff: 3 Page Ref: 24-25

39) Which of the following statements about Big Data is true?
A) Data chunks are stored in different locations on one computer.
B) Hadoop is a type of processor used to process Big Data applications.
C) MapReduce is a storage filing system.
D) Pure Big Data systems do not involve fault tolerance.
Answer: D
Diff: 3 Page Ref: 28

40) Big Data often involves a form of distributed storage and processing using Hadoop and
MapReduce. One reason for this is
A) centralized storage creates too many vulnerabilities.
B) the "Big" in Big Data necessitates over 10,000 processing nodes.
C) the processing power needed for the centralized model would overload a single computer.
D) Big Data systems have to match the geographical spread of social media.
Answer: C
Diff: 3 Page Ref: 28

41) The desire by a customer to customize a product falls under the ________ category of
business environment factors.
Answer: consumer demand
Diff: 2 Page Ref: 6

42) An older and more diverse workforce falls under the ________ category of business
environment factors.
Answer: societal
Diff: 2 Page Ref: 6

43) Organizations using BI systems are typically seeking to ________ the gap between the
organization's current and desired performance.
Answer: close
Diff: 2 Page Ref: 7

44) Mintzberg defines the ________ as a managerial role that involves searching the
environment for new opportunities.
Answer: entrepreneur
Diff: 2 Page Ref: 8

45) Group communication and ________ involves decision makers who are likely to be in
different locations.
Answer: collaboration
Diff: 2 Page Ref: 10

46) ________ technology enables managers to access and analyze information anytime and from
anyplace.
Answer: Wireless
Diff: 2 Page Ref: 10

47) A(n) ________ problem such as setting budgets for products is one that has some structured
elements and some unstructured elements also.
Answer: semistructured
Diff: 2 Page Ref: 12

48) A(n) ________ problem such as new technology development is one that has very few
structured elements.
Answer: unstructured
Diff: 2 Page Ref: 12

49) ________ is an umbrella term that combines architectures, tools, databases, analytical tools,
applications, and methodologies.
Answer: Business intelligence (BI)
Diff: 2 Page Ref: 14

50) A(n) ________ is a major component of a Business Intelligence (BI) system that holds
source data.
Answer: data warehouse
Diff: 2 Page Ref: 15

51) A(n) ________ is a major component of a Business Intelligence (BI) system that is usually
browser based and often presents a portal or dashboard.
Answer: user interface
Diff: 2 Page Ref: 15

52) ________ cycle times are now extremely compressed, faster, and more informed across
industries.
Answer: Business
Diff: 2 Page Ref: 16

53) The fraud ________ analytic application helps determine fraudulent events and take action.
Answer: detection
Diff: 2 Page Ref: 17

54) Sabre used executive ________ to present performance metrics in a concise way to its
executives.
Answer: dashboards
Diff: 2 Page Ref: 17

55) ________ analytics help managers understand current events in the organization including
causes, trends, and patterns.
Answer: Descriptive
Diff: 2 Page Ref: 20

56) ________ analytics help managers understand probable future outcomes.
Answer: Predictive
Diff: 2 Page Ref: 22

57) ________ analytics help managers make decisions to achieve the best performance in the
future.
Answer: Prescriptive
Diff: 2 Page Ref: 24

58) The Google search engine is an example of Big Data in that it has to search and index
billions of ________ in fractions of a second for each search.
Answer: web pages
Diff: 2 Page Ref: 27-28

59) The filing system developed by Google to handle Big Data storage challenges is known as
the ________ Distributed File System.
Answer: Hadoop
Diff: 2 Page Ref: 28

60) The programming algorithm developed by Google to handle Big Data computational
challenges is known as ________.
Answer: MapReduce
Diff: 2 Page Ref: 28

61) The environment in which organizations operate today is becoming more and more complex.
Business environment factors can be divided into four major categories. What are these
categories?
Answer:
• Markets
• Consumer demands
• Technology
• Societal
Diff: 2 Page Ref: 6

62) List four of Mintzberg's Decisional roles of managers.
Answer:
• Entrepreneur: Searches the organization and its environment for opportunities and initiates
improvement projects to bring about change; supervises design of certain projects
• Disturbance handler: Is responsible for corrective action when the organization faces
important, unexpected disturbances
• Resource allocator: Is responsible for the allocation of organizational resources of all kinds;
in effect, is responsible for the making or approval of all significant organizational decisions
• Negotiator: Is responsible for representing the organization at major negotiations
Diff: 2 Page Ref: 8

63) Managers usually make decisions by following a four-step process. What are the steps?
Answer:
1. Define the problem (i.e., a decision situation that may deal with some difficulty or with an
opportunity).
2. Construct a model that describes the real-world problem.
3. Identify possible solutions to the modeled problem and evaluate the solutions.
4. Compare, choose, and recommend a potential solution to the problem.
Diff: 2 Page Ref: 9

64) List three developments that have contributed to facilitating growth of decision support and
analytics.
Answer:
• Group communication and collaboration
• Improved data management
• Managing giant data warehouses and Big Data
• Analytical support
• Overcoming cognitive limits in processing and storing information
• Knowledge management
• Anywhere, anytime support
Diff: 2 Page Ref: 10

65) Describe the types of computer support that can be used for structured, semistructured, and
unstructured decisions.
Answer:
• Structured Decisions: Structured problems, which are encountered repeatedly, have a high
level of structure. It is therefore possible to abstract, analyze, and classify them into specific
categories and use a scientific approach for automating portions of this type of managerial
decision making.
• Semistructured Decisions: Semistructured problems may involve a combination of standard
solution procedures and human judgment. Management science can provide models for the
portion of a decision-making problem that is structured. For the unstructured portion, a DSS can
improve the quality of the information on which the decision is based by providing, for example,
not only a single solution but also a range of alternative solutions, along with their potential
impacts.
• Unstructured Decisions: These can be only partially supported by standard computerized
quantitative methods. It is usually necessary to develop customized solutions. However, such
solutions may benefit from data and information generated from corporate or external data
sources.
Diff: 2 Page Ref: 12-13

66) What are the four major components of a Business Intelligence (BI) system?
Answer:
1. A data warehouse, with its source data;
2. Business analytics, a collection of tools for manipulating, mining, and analyzing the data in
the data warehouse;
3. Business performance management (BPM) for monitoring and analyzing performance; and
4. A user interface (e.g., a dashboard).
Diff: 3 Page Ref: 15

67) List and describe three levels or categories of analytics that are most often viewed as
sequential and independent, but also occasionally seen as overlapping.
Answer:
• Descriptive or reporting analytics refers to knowing what is happening in the organization
and understanding some underlying trends and causes of such occurrences.
• Predictive analytics aims to determine what is likely to happen in the future. This analysis is
based on statistical techniques as well as other more recently developed techniques that fall
under the general category of data mining.
• Prescriptive analytics recognizes what is going on as well as the likely forecast and makes
decisions to achieve the best performance possible.
Diff: 3 Page Ref: 20-24
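
Study aid (not part of the original answer): the three levels in question 67 can be illustrated with a minimal Python sketch. The sales figures, unit cost, unit price, and the function names forecast_next and best_order are all hypothetical, chosen only to show how each level builds on the one before it.

```python
from statistics import mean

# Hypothetical units sold in four past periods (illustrative data only).
sales = [100, 110, 120, 130]

# Descriptive: summarize what has been happening.
average_sales = mean(sales)

def forecast_next(history):
    """Predictive: fit a least-squares trend line and extrapolate one period."""
    n = len(history)
    xs = list(range(n))
    x_bar, y_bar = mean(xs), mean(history)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, history))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return intercept + slope * n

def best_order(predicted_demand, options, unit_cost=6, unit_price=10):
    """Prescriptive: pick the order quantity with the highest expected profit."""
    def profit(quantity):
        units_sold = min(quantity, predicted_demand)
        return unit_price * units_sold - unit_cost * quantity
    return max(options, key=profit)

predicted = forecast_next(sales)
recommended = best_order(predicted, options=[120, 140, 160])
print(average_sales, predicted, recommended)
```

The descriptive step only reports the past, the predictive step estimates the next period, and the prescriptive step uses that estimate to choose an action.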

68) How does Amazon.com use predictive analytics to respond to product searches by the
customer?
Answer: Amazon uses clustering algorithms to segment customers into different clusters to be
able to target specific promotions to them. The company also uses association mining techniques
to estimate relationships between different purchasing behaviors. That is, if a customer buys one
product, what else is the customer likely to purchase? That helps Amazon recommend or
promote related products. For example, any product search on Amazon.com results in the retailer
also suggesting other similar products that may interest a customer.
Diff: 3 Page Ref: 22-23
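
Study aid (not part of the original answer): the association mining idea in question 68 ("if a customer buys one product, what else is the customer likely to purchase?") can be sketched in Python as follows. The baskets and the function names co_purchase_counts and recommend are hypothetical, and this is not a description of Amazon's actual system. It only ranks products by how often they share a basket with a given product.

```python
from collections import Counter
from itertools import combinations

def co_purchase_counts(baskets):
    """Count how often each unordered pair of products shares a basket."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts

def recommend(product, baskets, top_n=2):
    """Rank other products by confidence: P(other in basket | product in basket)."""
    containing = sum(1 for basket in baskets if product in basket)
    if containing == 0:
        return []
    scores = {}
    for (first, second), n in co_purchase_counts(baskets).items():
        if product == first:
            scores[second] = n / containing
        elif product == second:
            scores[first] = n / containing
    # Highest confidence first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

# Hypothetical purchase history (illustrative data only).
baskets = [
    ["tent", "sleeping bag", "lantern"],
    ["tent", "sleeping bag"],
    ["tent", "lantern"],
    ["sleeping bag", "pillow"],
    ["tent", "sleeping bag", "stove"],
]

print(recommend("tent", baskets))
```

Here three of the four baskets containing a tent also contain a sleeping bag, so the sleeping bag ranks first with a confidence of 0.75.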

69) Describe and define Big Data. Why is a search engine a Big Data application?
Answer:
• Big Data is data that cannot be stored in a single storage unit. Big Data typically refers to data
that is arriving in many different forms, be they structured, unstructured, or in a stream. Major
sources of such data are clickstreams from Web sites, postings on social media sites such as
Facebook, or data from traffic, sensors, or weather.
• A Web search engine such as Google needs to search and index billions of Web pages in
order to give you relevant search results in a fraction of a second. Although this is not done in
real time, generating an index of all the Web pages on the Internet is not an easy task.
Diff: 3 Page Ref: 27-28

70) What storage system and processing algorithm were developed by Google for Big Data?
Answer:
• Google developed and released as an Apache project the Hadoop Distributed File System
(HDFS) for storing large amounts of data in a distributed way.
• Google developed and released as an Apache project the MapReduce algorithm for pushing
computation to the data, instead of pushing data to a computing node.
Diff: 3 Page Ref: 28
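
Study aid (not part of the original answer): the MapReduce idea in question 70 can be sketched as a single-process Python simulation. This is not the Hadoop API. The function names map_phase, shuffle, and reduce_phase and the sample chunks are hypothetical, and in a real cluster each chunk would be processed on the node that stores it.

```python
from collections import defaultdict

def map_phase(chunk):
    """Map: turn one stored chunk of text into (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reduce: combine the values for each key into a single count."""
    return {key: sum(values) for key, values in grouped.items()}

def word_count(chunks):
    """Run map over every chunk independently, then shuffle and reduce."""
    mapped = []
    for chunk in chunks:  # in a cluster, each chunk is mapped where it is stored
        mapped.extend(map_phase(chunk))
    return reduce_phase(shuffle(mapped))

# Hypothetical data chunks, standing in for blocks spread across machines.
chunks = [
    "big data big storage",
    "data moves to compute",
    "compute moves to data",
]

counts = word_count(chunks)
print(counts)
```

Because each chunk is mapped independently, the map step can run in parallel on the machines that hold the data, which is the sense in which computation is pushed to the data.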
