
Data Visualization

The document presents a series of data interpretation practice problems, including bar charts, box plots, pie charts, and frequency distributions. Each problem includes a scenario, questions, and detailed solutions, ranging from basic calculations to more complex analyses of data distributions. The exercises aim to enhance understanding of data summary and interpretation skills.

Uploaded by

genadds2000

Data Interpretation Practice

Practice Problem 1: Bar Chart Analysis

Scenario:
A bar chart shows the annual revenue (in millions) for three companies over four years:

[Figure: bar chart "Annual Revenue for Companies A, B, and C". x-axis: year (2019, 2020, 2021, 2022); y-axis: Revenue (millions), 0 to 60; series: Company A, Company B, Company C.]

Questions:

(a) Which company had the highest revenue in 2021?

(b) Calculate the absolute revenue growth for Company A from 2019 to 2022.

Solution:
(a) In 2021, the revenues are: A = 35, B = 30, C = 18. Therefore, Company A had
the highest revenue.
(b) Company A grew from 20 (2019) to 40 (2022), an increase of 40 − 20 = 20 million.
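
The two answers above can be reproduced with a short Python sketch. It uses only the values quoted in the solution (the chart itself is not reproduced here), and the variable names are illustrative assumptions.

```python
# Revenues in millions, as quoted in the solution text.
revenue_2021 = {"A": 35, "B": 30, "C": 18}
company_a = {2019: 20, 2022: 40}

# (a) Company with the highest 2021 revenue.
top_company = max(revenue_2021, key=revenue_2021.get)

# (b) Absolute growth is the difference between the end and start values.
growth_a = company_a[2022] - company_a[2019]

print(f"Highest 2021 revenue: Company {top_company}")
print(f"Company A absolute growth, 2019 to 2022: {growth_a} million")
```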

Practice Problem 2: Box Plot Analysis

Scenario:
A box plot displays daily sales (in units) for two shops over a month.

[Figure: box plot "Daily Sales Distribution". y-axis: Daily Sales (units); one box each for Shop X and Shop Y.]

Questions:

(a) Which shop has a higher median, and by how many units?

(b) What is the IQR (Interquartile Range) for each shop?

(c) Which shop has a larger overall range?

Solution:
(a) Shop X median = 45; Shop Y median = 50; the difference is 50 − 45 = 5 units.
(b) IQR for Shop X = 50 − 40 = 10; IQR for Shop Y = 55 − 45 = 10.
(c) Range for Shop X = 55 − 35 = 20; Range for Shop Y = 65 − 40 = 25. So, Shop Y
has a larger overall range.
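
As a rough sketch, the same comparisons can be computed from each shop's five-number summary. The summary values below are the ones used in the solution; the class and field names are hypothetical, chosen only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class FiveNumberSummary:
    """Minimum, quartiles, and maximum as read off a box plot."""
    minimum: float
    q1: float
    median: float
    q3: float
    maximum: float

    @property
    def iqr(self) -> float:
        # Interquartile range: width of the box.
        return self.q3 - self.q1

    @property
    def value_range(self) -> float:
        # Overall range: distance between the whisker ends.
        return self.maximum - self.minimum


# Values taken from the solution text above.
shop_x = FiveNumberSummary(35, 40, 45, 50, 55)
shop_y = FiveNumberSummary(40, 45, 50, 55, 65)

print("Median difference (Y - X):", shop_y.median - shop_x.median)
print("IQR X, Y:", shop_x.iqr, shop_y.iqr)
print("Range X, Y:", shop_x.value_range, shop_y.value_range)
```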

Practice Problem 3: Pie Chart Analysis

Scenario:
A pie chart shows the market share of five companies:

[Figure: pie chart of market share. A-Plus 25%, Cube Inc 25%, Tech-4 20%, SoftPro 15%, Others 15%.]

Questions:

(a) Which companies have the largest market share?

(b) If the total market is $200 million, what is the dollar value of Tech-4's share?

(c) What percentage of the market is held by companies other than A-Plus and Cube Inc?

Solution:
(a) A-Plus and Cube Inc both hold 25%.
(b) Tech-4 holds 20% of $200 million, i.e., $200 million × 0.20 = $40 million.
(c) Combined share of A-Plus and Cube Inc = 25% + 25% = 50%. Thus, the remaining
market share is 100% - 50% = 50%.
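
The pie chart arithmetic can be sketched in Python as follows. The 25% and 20% shares come from the solution; assigning the two remaining 15% slices to SoftPro and Others is an assumption based on the chart labels. Variable names are illustrative.

```python
# Market shares in percent. The SoftPro and Others values are assumed
# from the two remaining 15% slices on the chart.
shares = {"A-Plus": 25, "Cube Inc": 25, "Tech-4": 20, "SoftPro": 15, "Others": 15}
total_market = 200  # millions of dollars

# A pie chart's slices must account for the whole market.
assert sum(shares.values()) == 100

# (a) All companies tied for the largest share.
largest = max(shares.values())
leaders = [name for name, pct in shares.items() if pct == largest]

# (b) Dollar value of a slice = total market x share.
tech4_dollars = total_market * shares["Tech-4"] / 100

# (c) Share held by everyone except A-Plus and Cube Inc.
remaining_pct = 100 - shares["A-Plus"] - shares["Cube Inc"]

print(leaders, tech4_dollars, remaining_pct)
```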

Practice Problem 4: Outlier Impact on Central Tendency

Scenario:
Consider the data set: {10, 12, 13, 14, 15}.

(a) Compute the mean and median.

(b) Now, add an outlier 50 to form {10, 12, 13, 14, 15, 50}. Compute the new mean and
median.

(c) Explain how the outlier affects the mean and median.

Solution:
(a) Original Data: {10, 12, 13, 14, 15}:

• Mean = (10 + 12 + 13 + 14 + 15)/5 = 64/5 = 12.8.

• Median = 13.

(b) New Data: {10, 12, 13, 14, 15, 50}:

• Mean = (10 + 12 + 13 + 14 + 15 + 50)/6 = 114/6 = 19.

• Median = Average of 3rd and 4th values = (13 + 14)/2 = 13.5.

(c) Explanation: The mean increased from 12.8 to 19 because it is sensitive to extreme
values, while the median changed only slightly (from 13 to 13.5), showing that the
median is less affected by outliers.
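
A minimal check of these figures, using Python's standard statistics module on the two data sets from the scenario (the variable names are illustrative):

```python
from statistics import mean, median

original = [10, 12, 13, 14, 15]
with_outlier = original + [50]  # same data plus the outlier

for label, data in (("original", original), ("with outlier", with_outlier)):
    print(f"{label}: mean = {mean(data)}, median = {median(data)}")

# How far each measure moved once the outlier was added.
mean_shift = mean(with_outlier) - mean(original)
median_shift = median(with_outlier) - median(original)
print(f"mean shift = {mean_shift:.1f}, median shift = {median_shift:.1f}")
```

The mean moves by about 6.2 while the median moves by 0.5, which is the contrast described in part (c).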

Practice Problem 5: Frequency Distribution Analysis (Moderate)

Scenario:
The following table shows the frequency of test scores:
Score       50   60   70   80   90
Frequency    2    3    5    4    1
Questions:

(a) Compute the mean score.

(b) Identify the mode.

Solution:
(a) Mean: Multiply each score by its frequency and sum:

Total Sum = 50(2) + 60(3) + 70(5) + 80(4) + 90(1) = 100 + 180 + 350 + 320 + 90 = 1040.

Total frequency = 2 + 3 + 5 + 4 + 1 = 15. So,

Mean = 1040/15 ≈ 69.33.
(b) Mode: The score 70 appears 5 times, which is the highest frequency.
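
The frequency-table calculation can be sketched in Python by storing the table as a score-to-frequency mapping. The dictionary name and layout are illustrative assumptions.

```python
# Frequency table from the scenario: score -> number of students.
frequencies = {50: 2, 60: 3, 70: 5, 80: 4, 90: 1}

# Weighted sum of scores and total number of observations.
total_sum = sum(score * count for score, count in frequencies.items())
total_count = sum(frequencies.values())

mean_score = total_sum / total_count

# The mode is the score with the highest frequency.
mode_score = max(frequencies, key=frequencies.get)

print(f"sum = {total_sum}, n = {total_count}, mean = {mean_score:.2f}")
print(f"mode = {mode_score}")
```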

Practice Problem 6: Advanced Box Plot Analysis (Moderate)

Scenario:
A box plot for monthly sales (in $1000s) of a company shows:

• Median = 45,

• Lower quartile (Q1) = 40,

• Upper quartile (Q3) = 55,

• Lower whisker = 35,

• Upper whisker = 70.

Questions:

(a) What is the interquartile range (IQR)?

(b) Using the 1.5 × IQR rule, calculate the lower and upper fences.

Solution:
(a) IQR = Q3 − Q1 = 55 − 40 = 15.
(b) Lower fence = Q1 − 1.5 × IQR = 40 − 22.5 = 17.5.
Upper fence = Q3 + 1.5 × IQR = 55 + 22.5 = 77.5.
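
The fence calculation can be wrapped in a small helper, sketched below. The function name iqr_fences and its k parameter are hypothetical, introduced only for this illustration; the quartile and whisker values come from the scenario.

```python
def iqr_fences(q1: float, q3: float, k: float = 1.5) -> tuple[float, float]:
    """Return (lower fence, upper fence) under the k x IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


lower_fence, upper_fence = iqr_fences(q1=40, q3=55)
print(f"fences: {lower_fence} to {upper_fence}")

# Both whisker ends from the scenario lie inside the fences, which is
# consistent with a plot that flags no points as outliers.
lower_whisker, upper_whisker = 35, 70
whiskers_inside = lower_fence <= lower_whisker and upper_whisker <= upper_fence
print("whiskers inside fences:", whiskers_inside)
```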

Practice Problem 7: Interpreting Skewness from a Histogram (Moderate)

Scenario:
A histogram of annual incomes (in $1000s) for a sample shows most incomes are between
$30k and $50k, with a long tail toward higher incomes.

(a) Is the distribution left-skewed, right-skewed, or symmetric? Explain.

(b) What does the skewness imply about the relationship between the mean and the median?

Solution:
(a) The long tail toward higher incomes indicates the distribution is right-skewed.
(b) In a right-skewed distribution, the mean is typically greater than the median,
because the high values pull the average upward.

Practice Problem 8: Comparing Two Datasets (Moderate)

Scenario:
Dataset A: {40, 45, 50, 55, 60}
Dataset B: {35, 45, 50, 55, 75}
Questions:

(a) Compute the mean of each dataset.

(b) Based on the means and data spread, which dataset appears to be more right-skewed?
Explain.

Solution:
(a) Dataset A: Mean = (40 + 45 + 50 + 55 + 60)/5 = 250/5 = 50.
Dataset B: Mean = (35 + 45 + 50 + 55 + 75)/5 = 260/5 = 52.
(b) In Dataset A, the values are evenly spaced around 50, so the mean and median
are both 50. In Dataset B, the highest value (75) is much larger than the rest,
pulling the mean upward (mean = 52, while the median is 50). Thus, Dataset B
appears to be more right-skewed.
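
The comparison can be checked with a short Python sketch. Using the gap between mean and median as a skew indicator is a rough heuristic assumed here for illustration, not a formal skewness statistic.

```python
from statistics import mean, median

dataset_a = [40, 45, 50, 55, 60]
dataset_b = [35, 45, 50, 55, 75]

# Mean minus median: a positive gap hints at a longer right tail.
gap_a = mean(dataset_a) - median(dataset_a)
gap_b = mean(dataset_b) - median(dataset_b)

print(f"A: mean = {mean(dataset_a)}, median = {median(dataset_a)}, gap = {gap_a}")
print(f"B: mean = {mean(dataset_b)}, median = {median(dataset_b)}, gap = {gap_b}")
```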

Practice Problem 9: Grouped Frequency Distribution (Tough)

Scenario:
Below is a grouped frequency distribution of exam scores:
Score Range   Frequency   Midpoint
40 – 50           4          45
50 – 60           6          55
60 – 70           8          65
70 – 80           5          75
80 – 90           3          85
Questions:

(a) Estimate the overall mean exam score.

(b) Estimate the variance using the midpoints.


Solution:
(a) Estimated Mean: Multiply each midpoint by its frequency and sum:

Total Sum = (45 × 4) + (55 × 6) + (65 × 8) + (75 × 5) + (85 × 3)

= 180 + 330 + 520 + 375 + 255 = 1660.

Total frequency = 4 + 6 + 8 + 5 + 3 = 26. So,

Mean = 1660/26 ≈ 63.85.
(b) Estimated Variance:
First, compute the squared differences from the mean for each midpoint:

(45 − 63.85)² ≈ 355.32, (55 − 63.85)² ≈ 78.32,

(65 − 63.85)² ≈ 1.32, (75 − 63.85)² ≈ 124.32,

(85 − 63.85)² ≈ 447.32.

Multiply each squared difference by its frequency and sum:

Total Squared Sum ≈ (355.32 × 4) + (78.32 × 6) + (1.32 × 8) + (124.32 × 5) + (447.32 × 3)

≈ 1421.28 + 469.92 + 10.56 + 621.60 + 1341.96 ≈ 3865.32.

Divide by total frequency:

Variance ≈ 3865.32/26 ≈ 148.67.
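
The grouped-data estimates can be recomputed with a short Python sketch. It treats every score in a class as sitting at the class midpoint and divides by the total frequency, as the solution does; the variable names are illustrative assumptions.

```python
# (midpoint, frequency) pairs from the grouped frequency table.
groups = [(45, 4), (55, 6), (65, 8), (75, 5), (85, 3)]

total_count = sum(freq for _, freq in groups)

# Frequency-weighted mean of the midpoints.
mean_est = sum(mid * freq for mid, freq in groups) / total_count

# Frequency-weighted squared deviations from the estimated mean,
# divided by the total frequency.
squared_sum = sum(freq * (mid - mean_est) ** 2 for mid, freq in groups)
variance_est = squared_sum / total_count

print(f"n = {total_count}")
print(f"estimated mean = {mean_est:.2f}")
print(f"estimated variance = {variance_est:.2f}")
```

Because the code keeps the unrounded mean, its intermediate values can differ slightly from a hand calculation that rounds the mean to 63.85 first. Dividing by the total frequency minus one instead would give the sample variance.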

Practice Problem 10: Bimodal Histogram Analysis (Tough)

Scenario:
A histogram of exam scores shows two peaks. The first peak is around 55 and the second
peak is around 75. Most students score between 45 and 65 or between 65 and 85, with a dip
in frequency around 65.

(a) What does it mean for the data to be bimodal?

(b) How might the mean and median compare in a bimodal distribution?

(c) Suggest one possible explanation for a bimodal exam score distribution.

Solution:
(a) A bimodal distribution has two distinct peaks, indicating there are two groups
within the data with different common scores.
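(b) In a bimodal distribution, both the mean and the median usually fall between the
two modes, often near the dip where few scores lie, so neither describes a typical
score well. If the two groups are about the same size, the mean and median tend to
be close; if one group is larger, the median tends to sit closer to that group's
peak than the mean does, so the two need not be equal.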
(c) One possible explanation is that the exam was taken by two distinct groups of
students (e.g., one group that prepared well and one that did not), resulting in two
clusters of scores.

Summary
These problems cover a range of data interpretation tasks, from reading bar charts, box
plots, and pie charts to calculating means and variances from grouped data, and interpreting
distribution shapes and skewness. The questions move from basic to moderately tough,
helping you develop a deeper understanding of data summary and interpretation.
