Chapter 3
2) When we refer to data, we are only referring to information that is well structured.
Answer: FALSE
Diff: 2 Page Ref: 121
7) The OLAP branch of descriptive analytics has also been called business intelligence.
Answer: TRUE
Diff: 2 Page Ref: 139
9) A time series is a sequence of data points of the variable of interest, measured and represented
at successive points in time spaced at uniform time intervals.
Answer: TRUE
Diff: 2 Page Ref: 156
10) The linearity assumption in a regression model states that the errors of the response variable
are uncorrelated with each other.
Answer: FALSE
Diff: 2 Page Ref: 154
11) A data dashboard is any communication artifact prepared with the specific intention of
conveying information in a digestible form to whoever needs it whenever and wherever.
Answer: FALSE
Diff: 2 Page Ref: 163
12) There has been an increase in the use of computing power to produce unified reports.
Answer: TRUE
Diff: 2 Page Ref: 164
14) Data visualization is closely related to the fields of DW management and MIS application
development.
Answer: FALSE
Diff: 2 Page Ref: 166
15) The Gantt chart (also called a network diagram) is developed primarily to simplify the
planning and scheduling of large and complex projects.
Answer: FALSE
Diff: 3 Page Ref: 173
16) Some charts or graphs are better at answering certain types of questions.
Answer: TRUE
Diff: 1 Page Ref: 171
17) There is a growing number of data visualization techniques being used to better portray
business results.
Answer: TRUE
Diff: 2 Page Ref: 176
18) According to Eckerson, the most distinctive feature of a dashboard is its three layers of
information.
Answer: TRUE
Diff: 1 Page Ref: 185
19) Dashboards are not a new concept and their roots can be traced at least to the executive
information system of the 1980s.
Answer: TRUE
Diff: 2 Page Ref: 184
22) To satisfy this requirement, this data has variables that are defined at the lowest (or as low as
required) level of detail for the intended use.
A) data source reliability
B) data accessibility
C) data granularity
D) data currency
Answer: C
Diff: 2 Page Ref: 124
23) These contain codes assigned to objects or events as labels that also represent the rank order
among them.
A) numeric data
B) ordinal data
C) interval data
D) nominal data
Answer: B
Diff: 2 Page Ref: 126
24) These contain measurements of simple codes assigned to objects as labels, which are not
measurements.
A) numeric data
B) ordinal data
C) interval data
D) nominal data
Answer: D
Diff: 2 Page Ref: 125
25) Eliminating duplicate data is typically a part of which data preprocessing step?
A) Data Consolidation
B) Data Cleaning
C) Data Transformation
D) Data Reduction
Answer: B
Diff: 2 Page Ref: 130
26) Reducing a data set's volume can be a portion of which data preprocessing step?
A) Data Consolidation
B) Data Cleaning
C) Data Transformation
D) Data Reduction
Answer: D
Diff: 2 Page Ref: 130
27) What measure is used to characterize the peak/tall/skinny nature of the distribution?
A) skewness
B) standard deviation
C) whisker plot
D) kurtosis
Answer: D
Diff: 3 Page Ref: 146
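As an illustration of the measure in question 27, the following minimal sketch computes kurtosis from the second and fourth central moments. It assumes the moment-based "excess" convention (3 is subtracted so that a normal distribution scores 0); the textbook may use a different convention. The function name and both data sets are hypothetical.

```python
def excess_kurtosis(values):
    """Population excess kurtosis: m4 / m2**2 - 3 (assumed convention)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m4 / (m2 ** 2) - 3

# Hypothetical data: one flat distribution, one peaked around its center.
flat = [1, 2, 3, 4, 5]
peaked = [3, 3, 3, 3, 3, 3, 1, 5]

flat_kurtosis = excess_kurtosis(flat)
peaked_kurtosis = excess_kurtosis(peaked)
```

Under this convention, the peaked data set yields the larger value, which matches the "peak/tall/skinny" description in the question.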
29) This assumption in a regression analysis states that the explanatory variables are not
correlated.
A) linearity
B) multicollinearity
C) independence
D) normality
Answer: B
Diff: 2 Page Ref: 155
30) This assumption in a regression analysis states that the relationship between the response
variable and the explanatory variables is linear.
A) linearity
B) multicollinearity
C) independence
D) normality
Answer: A
Diff: 2 Page Ref: 154
31) This type of report presents an integrated view of success in an organization.
A) metric management report
B) dashboard report
C) balanced scorecard report
D) none of these
Answer: C
Diff: 2 Page Ref: 165
32) This type of report may include color-coded traffic lights for different performance levels.
A) metric management report
B) dashboard report
C) balanced scorecard report
D) none of these
Answer: B
Diff: 2 Page Ref: 165
34) What made the digital distribution of both data and visualization more accessible to a
broader audience?
A) the printing press
B) the pony express
C) personal computers
D) the Internet
Answer: D
Diff: 3 Page Ref: 168
35) This figure is often used to explore the relationship between two or three variables (in 2D or
3D visuals).
A) line chart
B) bar chart
C) pie chart
D) scatter plot
Answer: D
Diff: 2 Page Ref: 172
36) This figure portrays project timelines, project tasks/activity durations, and overlap among the
tasks/activities.
A) PERT chart
B) Gantt chart
C) histogram
D) bubble chart
Answer: B
Diff: 2 Page Ref: 173
39) This layer of dashboard information allows for ________ of key performance metrics.
A) monitoring
B) analysis
C) management
D) export
Answer: A
Diff: 2 Page Ref: 185
41) Automated data collection systems are not only enabling businesses to collect more volumes
of data but also enhancing the data quality and ________.
Answer: integrity
Diff: 2 Page Ref: 121
42) Making data ________ for prediction means that data sets must be transformed into a flat-
file format and made ready for ingestion into those predictive algorithms.
Answer: analytics ready
Diff: 2 Page Ref: 122
43) The ________ data can be subdivided into nominal or ordinal data.
Answer: categorical
Diff: 2 Page Ref: 125
44) ________ data include measurement variables commonly found in the physical sciences and
engineering.
Answer: Ratio
Diff: 3 Page Ref: 126
45) In data reduction, reducing the number of variables is referred to as ________ reduction.
Answer: dimensional
Diff: 2 Page Ref: 131
46) Data are ________ between a certain minimum and maximum for all variables to mitigate
the potential bias.
Answer: normalized
Diff: 3 Page Ref: 130
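The normalization described in question 46 can be sketched as min-max scaling. This is one common way to rescale variables between a minimum and a maximum, not necessarily the only method the text has in mind; the function name, the default range of 0 to 1, and the sample variables are assumptions made for illustration.

```python
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly so the smallest maps to new_min and the largest to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant variable has no spread; map every value to new_min.
        return [new_min for _ in values]
    return [new_min + (v - lo) / (hi - lo) * (new_max - new_min) for v in values]

# Hypothetical variables measured on very different scales.
income = [30000, 45000, 60000, 120000]
age = [25, 35, 45, 65]

income_scaled = min_max_normalize(income)
age_scaled = min_max_normalize(age)
```

After scaling, both variables lie in the same range, so a variable with large raw values (income) no longer dominates one with small raw values (age).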
47) Measures of ________ are the mathematical methods used to estimate or describe the degree
of variation in a given variable of interest.
Answer: dispersion
Diff: 2 Page Ref: 142
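A few common measures of dispersion (range, variance, and standard deviation) can be computed with Python's standard `statistics` module, as in the sketch below. The data set is hypothetical, and the population forms (`pvariance`, `pstdev`) are used here as an assumption; the sample forms (`variance`, `stdev`) divide by n - 1 instead.

```python
import statistics

# Hypothetical observations of a single variable.
data = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(data) - min(data)     # spread between the extremes
variance = statistics.pvariance(data)  # mean squared deviation from the mean
std_dev = statistics.pstdev(data)      # square root of the variance
```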
50) ________ makes no a priori assumption of whether one variable is dependent on the other(s)
and is not concerned with the relationship between variables.
Answer: Correlation
Diff: 2 Page Ref: 151
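The point that correlation makes no a priori assumption about which variable depends on the other can be illustrated with the Pearson correlation coefficient, which gives the same value whichever variable is listed first. The sketch below is a hand-written implementation with hypothetical variable names and data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical paired observations.
ad_spend = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]

r_xy = pearson_r(ad_spend, sales)
r_yx = pearson_r(sales, ad_spend)  # same value: neither variable is treated as dependent
```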
51) ________ are typically enterprise-wide agreed upon targets to be tracked against over a
period of time.
Answer: Key performance indicators or KPIs
Diff: 3 Page Ref: 165
52) Key to any successful report are ________, brevity, completeness, and correctness.
Answer: clarity
Diff: 2 Page Ref: 164
53) ________ has also single-handedly democratized both the interface conventions and the
technology for displaying interactive geography online.
Answer: Google Maps
Diff: 2 Page Ref: 168
56) ________ are used to show the frequency distribution of one variable or several variables.
Answer: Histograms
Diff: 2 Page Ref: 172
57) When presenting your data analysis, it is often helpful to view your analysis as a data-rich
________.
Answer: story
Diff: 2 Page Ref: 179
58) An ideal dashboard would provide ________ to underlying data sources or reports, providing
more detail about the underlying comparative and evaluative context.
Answer: drill-down or drill-through
Diff: 2 Page Ref: 187
59) Specialized display ________ allow easy visual comparison of information with a minimum
of set up when using a dashboard.
Answer: widgets
Diff: 2 Page Ref: 185
60) An ideal dashboard requires little, if any, customized ________ to implement, deploy, and
maintain.
Answer: coding
Diff: 2 Page Ref: 187
61) Select and discuss one of the best practices in dashboard design.
Answer: Student selections will vary, but they will discuss one of the following best practices:
• Benchmark Key Performance Indicators with Industry Standards
• Wrap the Dashboard Metrics with Contextual Metadata
• Validate the Dashboard Design by a Usability Specialist
• Prioritize and Rank Alerts/Exceptions Streamed to the Dashboard
• Enrich the Dashboard with Business-User Comments
• Present Information in Three Different Levels
• Pick the Right Visual Construct Using Dashboard Design Principles
• Provide for Guided Analytics
Diff: 2 Page Ref: 187-8
66) List and briefly define the central tendency measures of descriptive statistics.
Answer:
• The arithmetic mean (or simply mean or average) is the sum of all the values/observations
divided by the number of observations in the data set.
• The median is the middle value of a given data set once its values are sorted in order of magnitude.
• The mode is the observation that occurs most frequently.
Diff: 2 Page Ref: 140-141
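The three measures in the answer to question 66 can be checked on a small example with Python's standard `statistics` module. The data sets below are hypothetical.

```python
import statistics

# Hypothetical observations.
data = [3, 5, 5, 7, 10]

mean_value = statistics.mean(data)      # sum of values divided by the count
median_value = statistics.median(data)  # middle value of the sorted data
mode_value = statistics.mode(data)      # most frequently occurring value

# With an even number of observations, the median is the average
# of the two middle values.
even_median = statistics.median([1, 2, 3, 4])
```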
69) List the five assumptions made in linear regressions and select one to discuss in depth.
Answer:
1. Linearity. This assumption states that the relationship between the response variable and the
explanatory variables is linear. That is, the expected value of the response variable is a straight-
line function of each explanatory variable while holding all other explanatory variables fixed.
Also, the slope of the line does not depend on the values of the other variables. It also implies
that the effects of different explanatory variables on the expected value of the response variable
are additive in nature.
2. Independence (of errors). This assumption states that the errors of the response variable are
uncorrelated with each other. Uncorrelated errors are a weaker condition than full statistical
independence, which is often not needed for linear regression analysis.
3. Normality (of errors). This assumption states that the errors of the response variable are
normally distributed. That is, they are supposed to be totally random and should not represent
any nonrandom patterns.
4. Constant variance (of errors). This assumption, also called homoscedasticity, states that the
errors of the response variable have the same variance regardless of the values of the
explanatory variables. In practice, this assumption is invalid if the response variable varies over a
wide enough range/scale.
5. Multicollinearity. This assumption states that the explanatory variables are not correlated with
one another (i.e., they do not replicate the same information but each provide a different
perspective on the information needed for the model). Multicollinearity can be triggered by having
two or more perfectly correlated explanatory variables presented to the model (e.g., if the same
explanatory variable is mistakenly included in the model twice, once with a slight transformation
of the same variable). A correlation-based data assessment usually catches this error.
Diff: 2 Page Ref: 154-5
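The correlation-based data assessment mentioned under the multicollinearity assumption can be sketched as a pairwise check of the explanatory variables. The helper names, the 0.9 cutoff, and the housing data are all assumptions made for illustration; the second variable is deliberately a slight transformation (a unit conversion) of the first.

```python
import math
from itertools import combinations

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def highly_correlated_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for every pair with |r| at or above the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson_r(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged

# Hypothetical explanatory variables for a house-price model.
square_feet = [850, 1200, 1500, 2100, 2600]
explanatory = {
    "square_feet": square_feet,
    "square_meters": [v * 0.0929 for v in square_feet],  # same variable, rescaled
    "age_years": [30, 5, 42, 12, 25],
}

flagged = highly_correlated_pairs(explanatory)
```

Only the `square_feet` and `square_meters` pair is flagged, which signals that one of the two should be dropped before fitting the regression.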
70) Describe the three main types of business reports.
Answer: Metric Management Reports - In many organizations, business performance is
managed through outcome-oriented metrics. For external groups, these are service-level
agreements. For internal management, they are key performance indicators (KPIs). Typically,
there are enterprise-wide agreed upon targets to be tracked against over a period of time. They
can be used as part of other management strategies such as Six Sigma or total quality
management.
Dashboard-Type Reports - A popular idea in business reporting in recent years has been to
present a range of different performance indicators on one page like a dashboard in a car.
Typically, dashboard vendors would provide a set of predefined reports with static elements and
fixed structure but also allow for customization of the dashboard widgets, views, and set targets
for various metrics. It is common to have color-coded traffic lights defined for performance (red,
orange, green) to draw management's attention to particular areas. A more detailed description of
dashboards can be found in a later section of this chapter.
Balanced Scorecard Reports - This is a method developed by Kaplan and Norton that attempts to
present an integrated view of success in an organization. In addition to financial performance,
balanced scorecard-type reports also include customer, business process, and learning and
growth perspectives. More details on balanced scorecards are provided in a later section in this
chapter.
Diff: 2 Page Ref: 165
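The color-coded traffic lights described for dashboard-type reports can be sketched as a simple mapping from a KPI's actual value and target to red, orange, or green. The function name, the 90 percent warning band, and the KPI names and values are hypothetical, and the sketch assumes that higher values are better.

```python
def traffic_light(actual, target, warning_ratio=0.9):
    """Return 'green', 'orange', or 'red' for a higher-is-better KPI."""
    if actual >= target:
        return "green"   # target met or exceeded
    if actual >= target * warning_ratio:
        return "orange"  # within the assumed warning band below target
    return "red"         # well below target

# Hypothetical KPIs as (actual, target) pairs.
kpis = {
    "on_time_delivery_pct": (96.0, 95.0),
    "customer_satisfaction": (4.1, 4.5),
    "first_call_resolution_pct": (60.0, 80.0),
}

status = {name: traffic_light(actual, target) for name, (actual, target) in kpis.items()}
```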