
3.3 Assignment: One Variable Statistics: A) Histogram

sociology

Uploaded by

Young Stars


3.3 Assignment: One variable statistics

a) Histogram

A histogram is a graphical representation of a data set's distribution: the data are
organized into bins, and the frequency of each bin is displayed as a bar. For example,
a histogram of test results could show the number (or proportion) of students falling
into each score range.

b) Measure of central tendency

This refers to statistics, such as mean, median, and mode, that characterize a dataset's
middle or typical value. An example is the average test score for a class.

c) IQR

The IQR is found by subtracting the first quartile (Q1) from the third quartile (Q3). It
measures the spread of the middle 50% of a dataset. For instance, the IQR for a
dataset having Q1 = 25 and Q3 = 75 would be 75 − 25 = 50.

d) Percentile

A percentile indicates the value below which a given percentage of data points in a
dataset fall. Example: A score in the 90th percentile means that 90% of scores are
below that value.

e) Outlier

An outlier is a data point that deviates noticeably from the other observations and is
usually located far from the core values. Example: a score of 30% might be seen as an
outlier in a class whose test results are typically between 80 and 90 percent.
2) Part 1: Creating the Data Set

Here is a set of 20 data points constructed for the given conditions:

Data set: 10, 10, 15, 18, 19, 20, 20, 22, 24, 25, 25, 25, 26, 27, 28, 30, 30, 32, 33, 35

Target conditions:

● Mean: 20
● Median: 25
● IQR: 10
● Sample standard deviation: Between 9 and 12
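
As a rough check, the summary statistics of the listed values can be recomputed and compared with the targets. The sketch below assumes the median-of-halves convention for quartiles; the helper name `quartiles` is hypothetical and other quartile conventions give slightly different results. For the 20 values exactly as printed above, it reports a mean of 23.7, a median of 25, an IQR of 9.5 and a sample standard deviation of about 7.03, so the median target is met while the mean and standard deviation targets are not, and the list would need further adjustment to satisfy all four conditions.

```python
import statistics

# The 20 values listed above.
data = [10, 10, 15, 18, 19, 20, 20, 22, 24, 25,
        25, 25, 26, 27, 28, 30, 30, 32, 33, 35]

def quartiles(values):
    """Q1 and Q3 as the medians of the lower and upper halves (assumed convention)."""
    s = sorted(values)
    half = len(s) // 2
    lower = s[:half]
    # For an odd count, the middle value is left out of both halves.
    upper = s[half:] if len(s) % 2 == 0 else s[half + 1:]
    return statistics.median(lower), statistics.median(upper)

mean = statistics.mean(data)
median = statistics.median(data)
q1, q3 = quartiles(data)
iqr = q3 - q1
sd = statistics.stdev(data)  # sample standard deviation (divides by n - 1)

print(f"mean={mean:.2f} median={median} IQR={iqr} sd={sd:.2f}")
```

Running a check like this after each adjustment makes it easier to see which target moves when a value is changed.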

Plan to Create the Data Set

To generate a dataset that satisfies the given statistical properties, I followed these
steps:

1. Target Mean (20): To get a mean of 20, I aimed to create a data set where the
sum of all data points equals 20 times the number of data points (i.e.,
20×20=400). I distributed the values around 20, with some slightly higher and
some lower.
2. Median (25): I ensured that the middle value, or the average of the two middle
values, was exactly 25. Since there are 20 data points, the 10th and 11th values
must average exactly 25 to make the median correct.
3. IQR (10): The IQR is the difference between the third quartile (Q3) and the first
quartile (Q1). I aimed for Q3 to be around 30 and Q1 to be around 20, so the IQR
would be approximately 30 − 20 = 10.
4. Sample Standard Deviation (between 9 and 12): The standard deviation should
show a moderate spread around the mean. I varied the values to be not too
tightly clustered but not too far apart, ensuring the standard deviation stayed
within the target range.

Challenges and Adjustments

● Balancing the Mean and Standard Deviation: Keeping the standard deviation
between 9 and 12 while maintaining a mean of exactly 20 was tricky. Initially,
some data points were too far from the mean, resulting in a higher standard
deviation. I adjusted by bringing values closer to the mean without changing the
overall sum of the data points.
● Ensuring the Median and IQR: After setting the mean, I had to tweak the data to
ensure the correct median and IQR. This required careful positioning of the
middle values and some minor adjustments to maintain the desired spread of the
data.
3) Create a histogram to display the data

4) Create a relative frequency polygon for the data.

5) Create a box-and-whisker plot for the data. If you use technology copy and
paste the graph into your solutions document. If you prepare it by hand use a
ruler and take a clear, well-lit photograph or scan and insert it as an image into
your document.

6) What is the percentile of the data point 40 in the data set?

To find the percentile of the data point 40, we first organise the data by
listing all the values in the data set in ascending order.

Here is the data provided:

26, 45, 27, 50, 52, 12, 28, 48, 52, 14, 20, 32, 9, 36, 51, 42, 1, 30, 21, 40, 37, 35, 57, 47, 42, 53

The next step is to arrange these values in ascending order:

1, 9, 12, 14, 20, 21, 26, 27, 28, 30, 32, 35, 36, 37, 40, 42, 42, 45, 47, 48, 50, 51, 52, 52, 53, 57

We must now determine the position of the data point 40.

The data point 40 is in the 15th position in this ordered list.

Then we use the percentile formula:

Percentile = (number of values below the data point / total number of values) × 100

There are 14 values below 40 (the values before 40 in the ordered list) and a
total of 26 data points, so the percentile is

Percentile = (14 / 26) × 100 ≈ 53.85

The data point 40 is at approximately the 54th percentile.
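
The percentile calculation above can be sketched in a few lines. This is a minimal illustration of the "values strictly below" definition used in this solution; the function name `percentile_rank` is hypothetical, and other textbooks use slightly different percentile definitions.

```python
# The 26 values provided in the question.
data = [26, 45, 27, 50, 52, 12, 28, 48, 52, 14, 20, 32, 9,
        36, 51, 42, 1, 30, 21, 40, 37, 35, 57, 47, 42, 53]

def percentile_rank(values, x):
    """Percentage of values strictly below x."""
    below = sum(1 for v in values if v < x)
    return below / len(values) * 100

below_40 = sum(1 for v in data if v < 40)   # count of values under 40
rank_40 = percentile_rank(data, 40)

print(below_40, round(rank_40, 2))
```

Sorting is not needed for the count itself; it only makes the position of 40 easy to read off by hand.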

7) Find the mean, median, and mode of the data set. Explain the method you
used.

Mean

The mean is the average of the data points. The values are added together and
the total is divided by the number of data points.

Sum of values = 26 + 45 + 27 + 50 + 52 + 12 + 28 + 48 + 52 + 14 + 20 + 32 + 9
+ 36 + 51 + 42 + 1 + 30 + 21 + 40 + 37 + 35 + 57 + 47 + 42 + 53 = 907

Number of values = 26

Mean = 907 / 26 ≈ 34.88

Median

The median is the middle value when the data set is arranged in ascending
order. If there is an even number of data points, the median is the average of
the two middle values.

Steps:

• Arrange the data in ascending order:

1, 9, 12, 14, 20, 21, 26, 27, 28, 30, 32, 35, 36, 37, 40, 42, 42, 45, 47, 48, 50,
51, 52, 52, 53, 57

• Since there are 26 values (an even number), the median is the average of the
13th and 14th values in the ordered list, which are 36 and 37.

Median = (36 + 37) / 2 = 36.5

Mode

The mode is the value that appears most frequently in the data set.

Steps:

• Look for any values that repeat.

Observation:
The value 52 appears twice, and so does 42. These are the most frequent
values.

Mode= 42, 52 (bimodal, as both appear twice)

Summary:

• Mean: 34.88 (907 / 26)
• Median: 36.5
• Mode: 42, 52 (bimodal)
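
The three measures can be cross-checked with Python's standard `statistics` module. This is only a sketch for verification, not the method asked for in the assignment; `multimode` is used because the data set has more than one mode.

```python
import statistics

# The 26 values provided in the question.
data = [26, 45, 27, 50, 52, 12, 28, 48, 52, 14, 20, 32, 9,
        36, 51, 42, 1, 30, 21, 40, 37, 35, 57, 47, 42, 53]

total = sum(data)                            # sum of all values
mean = statistics.mean(data)                 # total divided by the count
median = statistics.median(data)             # average of 13th and 14th sorted values
modes = sorted(statistics.multimode(data))   # every value with the highest count

print(total, round(mean, 2), median, modes)
```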

8) What is the sample standard deviation of the data set?

To calculate the sample standard deviation of the data set, we follow these
steps:

Data Set:

26, 45, 27, 50, 52, 12, 28, 48, 52, 14, 20, 32, 9, 36, 51, 42, 1, 30, 21, 40, 37, 35, 57, 47, 42, 53

Steps to Calculate Sample Standard Deviation:

1. Find the mean (average):

The mean is 907 / 26 ≈ 34.88.
2. Subtract the mean from each data point and square the result:
For each data point x_i, calculate (x_i − μ)^2, where μ is the mean.
• For 26, (26 − 34.88)^2 = (−8.88)^2 ≈ 78.85
• For 45, (45 − 34.88)^2 = (10.12)^2 ≈ 102.41
• For 27, (27 − 34.88)^2 = (−7.88)^2 ≈ 62.09
• For 50, (50 − 34.88)^2 = (15.12)^2 ≈ 228.61
• For 52, (52 − 34.88)^2 = (17.12)^2 ≈ 293.09
• Continue this for each value…
3. Sum all the squared differences:
The sum of the squared differences over all 26 values is:

∑(x_i − μ)^2 = 78.85 + 102.41 + 62.09 + 228.61 + 293.09 + … ≈ 5858.65

4. Divide by n - 1:
Since this is a sample, we divide by n - 1 (where n is the number of data
points). There are 26 data points, so n - 1 = 25.

Variance = 5858.65 / 25 ≈ 234.35

5. Take the square root of the variance to get the standard deviation:

Sample Standard Deviation = √234.35 ≈ 15.31

Final Answer:

The sample standard deviation of the data set is approximately 15.31.
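
The five steps can be traced in code as a check on the hand arithmetic. This is a minimal sketch that follows the same n − 1 formula and uses the unrounded mean, so intermediate values may differ from hand-rounded ones in the second decimal place.

```python
import math

# The 26 values provided in the question.
data = [26, 45, 27, 50, 52, 12, 28, 48, 52, 14, 20, 32, 9,
        36, 51, 42, 1, 30, 21, 40, 37, 35, 57, 47, 42, 53]

n = len(data)
mean = sum(data) / n                              # step 1: mean
squared_diffs = [(x - mean) ** 2 for x in data]   # step 2: squared deviations
sum_sq = sum(squared_diffs)                       # step 3: sum of squares
variance = sum_sq / (n - 1)                       # step 4: divide by n - 1
sd = math.sqrt(variance)                          # step 5: square root

print(round(sum_sq, 2), round(variance, 2), round(sd, 2))
```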

9) 1. Using the IQR Method:

Step 1: Sort the Data

First, we need to sort the data in ascending order:

1, 9, 12, 14, 20, 21, 26, 27, 28, 30, 32, 35, 36, 37, 40, 42, 42, 45, 47, 48, 50,
51, 52, 52, 53, 57

Step 2: Find Quartiles

● Q1 (1st Quartile): This is the median of the lower half of the data set. With
26 values, the overall median falls between the 13th and 14th values, so the
lower half is the first 13 values.
○ Lower half: 1, 9, 12, 14, 20, 21, 26, 27, 28, 30, 32, 35, 36
○ Q1 is the median of this group, the 7th value: 26
● Q3 (3rd Quartile): This is the median of the upper half of the data set,
the last 13 values.
○ Upper half: 37, 40, 42, 42, 45, 47, 48, 50, 51, 52, 52, 53, 57
○ Q3 is the median of this group, the 7th value: 48

Step 3: Calculate the IQR

The Interquartile Range (IQR) is:

IQR = Q3 − Q1 = 48 − 26 = 22

Step 4: Determine the Outlier Boundaries

Using 1.5 times the IQR, we calculate the boundaries for potential outliers:

Lower Boundary = Q1 − 1.5 × IQR = 26 − 1.5 × 22 = 26 − 33 = −7
Upper Boundary = Q3 + 1.5 × IQR = 48 + 1.5 × 22 = 48 + 33 = 81

Any data point outside the range [−7, 81] is considered an outlier. Since all data
points are between 1 and 57, there are no outliers based on the IQR method.
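
The IQR method can be sketched as follows, assuming the median-of-halves convention for quartiles described in Step 2. The helper name `quartiles` is hypothetical, and software that interpolates between values may report slightly different quartiles for the same data.

```python
import statistics

# The 26 values provided in the question.
data = [26, 45, 27, 50, 52, 12, 28, 48, 52, 14, 20, 32, 9,
        36, 51, 42, 1, 30, 21, 40, 37, 35, 57, 47, 42, 53]

def quartiles(values):
    """Q1 and Q3 as the medians of the lower and upper halves (assumed convention)."""
    s = sorted(values)
    half = len(s) // 2
    lower = s[:half]
    # For an odd count, the middle value is left out of both halves.
    upper = s[half:] if len(s) % 2 == 0 else s[half + 1:]
    return statistics.median(lower), statistics.median(upper)

q1, q3 = quartiles(data)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Values outside the fences are flagged as potential outliers.
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, q3, iqr, lower_fence, upper_fence, outliers)
```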

2. Using the Standard Deviation Method:

For this data set, the mean is μ ≈ 34.88 (907 / 26) and the sample standard
deviation is σ ≈ 15.31.

Step 1: Calculate the Z-Scores

The z-score for each data point tells us how many standard deviations away
the point is from the mean:

Z-score = (x_i − μ) / σ

A common rule of thumb is that any data point with a z-score greater than 3 or
less than -3 is considered an outlier (more than 3 standard deviations away
from the mean).

Step 2: Check for Outliers

We can calculate the boundaries for potential outliers based on the mean and
standard deviation:

Lower Boundary = μ − 3σ = 34.88 − 3(15.31) = 34.88 − 45.93 = −11.05

Upper Boundary = μ + 3σ = 34.88 + 3(15.31) = 34.88 + 45.93 = 80.81

Any value outside the range [−11.05, 80.81] is considered an outlier. Since all
values in the data set are between 1 and 57, there are no outliers based on
the standard deviation method.

Conclusion:

Based on both the IQR method and the standard deviation method, there are
no outliers in this data set. All values fall within the expected range for each
method.
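
The z-score rule can be checked in the same way. The sketch below assumes the 3-standard-deviation threshold stated above and uses the unrounded mean and sample standard deviation, so its boundaries may differ from hand-rounded ones in the second decimal place.

```python
import statistics

# The 26 values provided in the question.
data = [26, 45, 27, 50, 52, 12, 28, 48, 52, 14, 20, 32, 9,
        36, 51, 42, 1, 30, 21, 40, 37, 35, 57, 47, 42, 53]

mean = statistics.mean(data)
sd = statistics.stdev(data)   # sample standard deviation (divides by n - 1)

# z-score: distance from the mean measured in standard deviations.
z_scores = [(x - mean) / sd for x in data]
largest_abs_z = max(abs(z) for z in z_scores)

# Flag any value more than 3 standard deviations from the mean.
outliers = [x for x, z in zip(data, z_scores) if abs(z) > 3]

print(round(largest_abs_z, 2), outliers)
```

The most extreme value, 1, has a z-score of about −2.21, which is inside the ±3 threshold.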
10) In what ways might the graph be considered misleading or biased? Explain why
the producer of the graph may have prepared it in this way

• Y-Axis Range: The graph may overstate the variations across episodes if the Y-axis
does not start at zero.
• Stacked Bars: The stacked bar representation may make it more difficult to compare
the bars for live and 7-day viewers separately for each program.
• Bar Widths: Perception can be distorted by uneven or irregular bar widths.
• Colour Usage: The graph's colour scheme may make it challenging to distinguish
between viewers who are live and those who are delayed.

11. Prepare one grid with separate box-and-whisker plots of both live viewers and
7-day viewers per episode
