8409 Statistics
MARIAM BUKHARI
the performance of mutual funds they are considering as potential investment vehicles. A
recent survey of funds whose stated investment goal was growth and income produced the
following data on total annual rate of return over the past five years:
Class Boundaries    Frequency
11.0-11.9           2
12.0-12.9           2
13.0-13.9           8
14.0-14.9           10
15.0-15.9           11
16.0-16.9           8
17.0-17.9           3
18.0-18.9           1
a) Calculate the mean, variance, and standard deviation of the annual rate of return for this
sample of 45 funds.
b) According to Chebyshev's theorem, between what values should at least 75 percent of the
sample observations fall? What percentage of the observations do fall in that interval?
c) Because the distribution is bell-shaped, between what values would you expect to find 68
percent of the observations? What percentage of the observations actually do fall in that
interval?
Ans:
a) Calculating the mean, variance, and standard deviation for the given grouped data:
Class Boundaries Frequency
11.0-11.9 2
12.0-12.9 2
13.0-13.9 8
14.0-14.9 10
15.0-15.9 11
16.0-16.9 8
17.0-17.9 3
18.0-18.9 1
Where:
x = class midpoint
f = frequency
n = number of observations (Σf)
C.B          x (midpoint)   f     fx       x²        fx²
11.0-11.9    11.5           2     23.0     132.25    264.50
12.0-12.9    12.5           2     25.0     156.25    312.50
13.0-13.9    13.5           8     108.0    182.25    1458.00
14.0-14.9    14.5           10    145.0    210.25    2102.50
15.0-15.9    15.5           11    170.5    240.25    2642.75
16.0-16.9    16.5           8     132.0    272.25    2178.00
17.0-17.9    17.5           3     52.5     306.25    918.75
18.0-18.9    18.5           1     18.5     342.25    342.25
Σf = 45      Σfx = 674.5    Σfx² = 10219.25
Mean = X̄ = Σfx / Σ f
= 674.5 / 45
= 14.9889
Mean (X̄) = 14.9889%
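Variance = s² = (Σfx² - (Σfx)² / n) / (n - 1)
= (10219.25 - (674.5)² / 45) / 44
= 109.2444 / 44
= 2.4828
Standard deviation = s = √2.4828 = 1.5757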
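The grouped-data calculation can be checked with a short script. This is a minimal sketch that uses the class midpoints and frequencies from the table and the sample (n - 1) form of the variance; the variable names are illustrative only.

```python
import math

# Class midpoints and frequencies from the grouped-data table.
midpoints = [11.5, 12.5, 13.5, 14.5, 15.5, 16.5, 17.5, 18.5]
freqs = [2, 2, 8, 10, 11, 8, 3, 1]

n = sum(freqs)                                               # Σf
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))        # Σfx
sum_fx2 = sum(f * x * x for f, x in zip(freqs, midpoints))   # Σfx²

mean = sum_fx / n
# Sample variance for grouped data, with the n - 1 divisor.
variance = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)
std_dev = math.sqrt(variance)

print(f"n = {n}, mean = {mean:.4f}")
print(f"variance = {variance:.4f}, std dev = {std_dev:.4f}")
```

Because every observation in a class is represented by its midpoint, these figures are approximations to the values that the raw data would give.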
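b) By Chebyshev's theorem, at least 75 percent of the observations fall within two standard
deviations of the mean:
X̄ ± 2s = 14.9889 ± 2 (1.5757)
= 14.9889 ± 3.1514
= (11.8375, 18.1403)
All the observations in the 2nd through 7th classes fall in that interval, and some of those in the
first and last classes may also fall in it. Hence, at least 42/45 or 93.33% of the observations fall
in that interval.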
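c) For a bell-shaped distribution, about 68 percent of the observations are expected to fall
within one standard deviation of the mean:
X̄ ± s = 14.9889 ± 1.5757
= (13.4132, 16.5646)
All the observations in the 14.0-14.9 and 15.0-15.9 classes fall in that interval, and about half of
those in the 13.0-13.9 and 16.0-16.9 classes are also in that interval. Hence, about 29/45 or 64.44% of the
observations fall in that interval.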
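The interval counts for parts (b) and (c) can be reproduced as follows. This is a sketch under stated assumptions: it uses the rounded mean and standard deviation from part (a), and the helper `coverage` is a hypothetical name. It counts a class as "inside" only when both class limits lie within the interval, and reports partially covered classes separately.

```python
# Rounded values from part (a).
mean, s = 14.9889, 1.5757

# (lower class limit, upper class limit, frequency)
classes = [
    (11.0, 11.9, 2), (12.0, 12.9, 2), (13.0, 13.9, 8), (14.0, 14.9, 10),
    (15.0, 15.9, 11), (16.0, 16.9, 8), (17.0, 17.9, 3), (18.0, 18.9, 1),
]
n = sum(f for _, _, f in classes)

def coverage(k):
    """Return (lower, upper, fully_inside, partly_inside) for mean ± k·s."""
    lower, upper = mean - k * s, mean + k * s
    inside = partial = 0
    for lo, hi, f in classes:
        if lo >= lower and hi <= upper:
            inside += f            # whole class lies in the interval
        elif hi > lower and lo < upper:
            partial += f           # class straddles an interval endpoint
    return lower, upper, inside, partial

# Chebyshev's lower bound for k = 2 standard deviations: 1 - 1/k².
chebyshev_bound = 1 - 1 / 2 ** 2

lo2, hi2, in2, part2 = coverage(2)
lo1, hi1, in1, part1 = coverage(1)
# Part (c) estimate: whole classes plus half of the straddling classes.
estimate_1s = in1 + part1 / 2

print(f"k=2: ({lo2:.4f}, {hi2:.4f}), at least {in2}/{n} inside")
print(f"k=1: ({lo1:.4f}, {hi1:.4f}), about {estimate_1s:.0f}/{n} inside")
```

Counting half of each straddling class is the same rough approximation used in the worked answer; it assumes observations are spread evenly within a class.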
i. Descriptive Statistics
iv. Hypothesis
Descriptive Statistics
Introduction
Descriptive statistics serve as the bedrock of statistical analysis, providing researchers, analysts,
their datasets. By summarizing key features such as central tendency, variability, and
distribution, descriptive statistics offer valuable insights that underpin informed decision-
CENTRAL TENDENCY
At the core of descriptive statistics lies the concept of central tendency, which seeks to identify
the typical or central value around which the data revolve. The mean, median, and mode are
primary measures used to ascertain central tendency. The mean, or average, represents the
arithmetic sum of all observations divided by the total number of data points, providing a
balanced representation of the dataset. Meanwhile, the median identifies the middle value
when the data are arranged in ascending or descending order, offering robustness against
outliers. The mode highlights the most frequently occurring value in the dataset, particularly
variability. Variance measures the average squared deviation from the mean, providing a
comprehensive assessment of data dispersion. Standard deviation, the square root of variance,
offers a more interpretable metric, indicating the typical deviation of data points from the
mean. Together, variance and standard deviation facilitate nuanced interpretations of data
The range, defined as the difference between the maximum and minimum values in a dataset,
provides a simplistic yet informative measure of data spread. However, its susceptibility to
range (IQR). The IQR, delineated by the range between the first and third quartiles, offers a
robust measure of data spread, mitigating the influence of outliers and providing insights into
Descriptive statistics extend beyond mere measures of central tendency and dispersion,
encompassing assessments of data distribution and symmetry. Skewness and kurtosis serve as
pivotal indicators of distribution shape and symmetry. Skewness measures the degree of
delineates the peakedness or flatness of the distribution, with higher values denoting sharper
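The measures discussed in this note can be illustrated with Python's standard `statistics` module. This is a minimal sketch on a small hypothetical sample (the data values are invented for illustration); the skewness line uses the simple moment-based definition, which is one of several in common use.

```python
import statistics

# Hypothetical sample, chosen only to illustrate the measures.
data = [2, 4, 4, 5, 7, 9, 30]

# Central tendency
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Dispersion
variance = statistics.variance(data)   # sample variance (n - 1 divisor)
std_dev = statistics.stdev(data)       # square root of the sample variance
data_range = max(data) - min(data)
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1                          # interquartile range

# Shape: moment-based skewness (mean cubed standardized deviation).
pop_sd = statistics.pstdev(data)
skewness = sum(((x - mean) / pop_sd) ** 3 for x in data) / len(data)

print(f"mean={mean:.2f}, median={median}, mode={mode}")
print(f"variance={variance:.2f}, sd={std_dev:.2f}, range={data_range}, IQR={iqr}")
print(f"skewness={skewness:.2f}")
```

The single large value (30) pulls the mean above the median and inflates the range, while the median and IQR are barely affected, which is the robustness to outliers described above.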
Conclusion:
In conclusion, descriptive statistics constitute an indispensable toolkit for comprehensively
variability, distribution, and symmetry, descriptive statistics empower stakeholders with the
analytical prowess necessary for data-driven decision-making. From exploratory data analysis to
Infantile Statistics
Introduction
Statistics plays a crucial role in understanding and interpreting data in various fields, from
scientific research to business analytics. For beginners, grasping the foundational concepts of
statistics is essential for building a solid understanding of data analysis. This note serves as an
introduction to basic statistics, aimed at beginners or those unfamiliar with statistical concepts.
UNDERSTANDING DATA
At the heart of statistics lies data, which can be qualitative (categorical) or quantitative
(numerical). Qualitative data consists of categories or labels, while quantitative data represents
measurements or quantities. Understanding the nature of data is fundamental for selecting
MEASURES OF VARIABILITY
Variability measures the spread or dispersion of data points around the central tendency.
Variance and standard deviation are primary measures of variability. Variance quantifies the
average squared deviation from the mean, while standard deviation represents the average
PROBABILITY BASICS
Probability is the likelihood of an event occurring and forms the foundation of statistical
inference. It ranges from 0 (impossible event) to 1 (certain event). Basic probability concepts
GRAPHICAL REPRESENTATION
Graphical representation of data enhances understanding and visualization. Common types of
graphs include bar graphs, histograms, pie charts, and scatterplots. These graphical tools help
Conclusion
In conclusion, basic statistics provide a framework for understanding and analyzing data,
making informed decisions, and drawing meaningful conclusions. By mastering fundamental
concepts such as measures of central tendency, variability, probability, and graphical
representation, beginners can develop a solid foundation in statistics that serves as a
stepping stone for more advanced analyses. Continual practice and exploration of statistical
concepts are key to building proficiency and harnessing the power of statistics in various
domains.
Confidence level
Introduction
Confidence levels play a critical role in statistical analysis, providing a measure of certainty or
uncertainty associated with estimates of population parameters. This detailed note aims to
elucidate the concept of confidence levels, their interpretation, and their practical significance
in statistical inference.
Conclusion
Confidence levels serve as essential tools in statistical inference, offering insights into the
reliability and precision of estimates derived from sample data. By understanding the concept
of confidence levels and their implications, researchers can make informed decisions, draw
meaningful conclusions, and communicate the uncertainty associated with their findings
effectively.
Hypothesis
Introduction
Hypothesis testing is a fundamental concept in statistics used to make inferences about
population parameters based on sample data. This detailed note aims to provide a
comprehensive understanding of hypothesis testing, including its definition, components,
procedures, and practical applications.
DEFINITION OF HYPOTHESIS
A hypothesis is a statement or proposition about a population parameter that is subject to
empirical testing. It typically consists of a null hypothesis (H0) and an alternative hypothesis
(H1). The null hypothesis represents the status quo or a statement of no effect, while the
alternative hypothesis proposes a specific effect, relationship, or difference.
- Null Hypothesis (H0): A statement that there is no significant difference, effect, or relationship
between variables.
- Alternative Hypothesis (H1): A statement that contradicts the null hypothesis, proposing a
specific difference, effect, or relationship.
- Test Statistic: A numerical value calculated from sample data used to assess the evidence
against the null hypothesis.
- Level of Significance (α): The predetermined threshold for rejecting the null hypothesis,
typically set at 0.05 or 0.01.
- P-Value: The probability of obtaining test results as extreme as or more extreme than the
observed data, assuming the null hypothesis is true.
- Decision Rule: Criteria for rejecting or failing to reject the null hypothesis based on the test
statistic and the level of significance.
2. Select Test Statistic: Choose an appropriate test statistic based on the study design and
assumptions.
3. Determine Level of Significance: Set the significance level (α) to determine the threshold for
rejecting the null hypothesis.
4. Collect Data: Collect sample data relevant to the research question or hypothesis.
5. Calculate Test Statistic: Compute the test statistic using the sample data and the chosen test
method.
6. Determine P-Value: Calculate the probability of obtaining the observed test results or more
extreme results under the null hypothesis.
7. Decide: Compare the P-value to the significance level (α) and decide whether to reject or fail
to reject the null hypothesis.
8. Draw Conclusion: Interpret the results in the context of the research question and
communicate the findings accordingly.
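The procedure above can be sketched as a two-sided one-sample z-test. All numbers here (hypothesized mean, known population standard deviation, sample mean, and sample size) are hypothetical values chosen for illustration, and the z-test is only one possible choice of test statistic.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical inputs (assumptions for illustration only).
mu0 = 100.0          # mean under H0
sigma = 15.0         # population standard deviation, assumed known
n = 36               # sample size
sample_mean = 104.0  # observed sample mean
alpha = 0.05         # level of significance

# Test statistic: standardized distance of the sample mean from mu0.
standard_error = sigma / sqrt(n)
z = (sample_mean - mu0) / standard_error

# Two-sided p-value under H0, using the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Decision: reject H0 only if the p-value is below alpha.
reject_h0 = p_value < alpha

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

With these hypothetical inputs the p-value exceeds α, so the null hypothesis is not rejected; that is not the same as showing the null hypothesis is true.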
PRACTICAL APPLICATIONS
Hypothesis testing is widely used in various fields, including science, medicine, social sciences,
business, and engineering, to:
Conclusion
Hypothesis testing is a powerful statistical tool for making evidence-based inferences and
drawing conclusions from data. By understanding its components, procedures, and practical
applications, researchers can conduct rigorous analyses, test hypotheses, and contribute to the
advancement of knowledge in their respective fields.
Q. 3 The Gilbert Machinery Company has received a big order to produce electric motors for a
manufacturing company. To fit in its bearing, the drive shaft of the motor must have a
diameter of 5.1 ± 0.05 inches. The company's purchasing agent realizes that there is a large
stock of steel rods in inventory with a mean diameter of 5.07" and a standard deviation of
0.07". What is the probability of a steel rod from inventory fitting the bearing?
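Ans: Given µ = 5.07 and σ = 0.07, a rod fits the bearing if 5.05 ≤ X ≤ 5.15.
P = Probability
Z = (X - µ) / σ
For X = 5.05: Z = (5.05 - 5.07) / 0.07 = -0.29
For X = 5.15: Z = (5.15 - 5.07) / 0.07 = 1.14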
Now,
P (5.05 ≤ X ≤ 5.15)
= P (-0.29 ≤ Z ≤ 1.14)
= 0.1141 + 0.3729
= 0.4870.
Therefore, the probability of a steel rod from inventory fitting the bearing is 0.4870
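The table lookup can be cross-checked numerically. This sketch assumes, as the worked answer does implicitly, that rod diameters are normally distributed; it uses `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

# Inventory rods: mean 5.07 in, standard deviation 0.07 in (normality assumed).
rods = NormalDist(mu=5.07, sigma=0.07)

# A rod fits if its diameter lies within 5.1 ± 0.05 inches.
lower, upper = 5.05, 5.15
z_lower = (lower - rods.mean) / rods.stdev
z_upper = (upper - rods.mean) / rods.stdev

p_fit = rods.cdf(upper) - rods.cdf(lower)

print(f"z range: {z_lower:.2f} to {z_upper:.2f}")
print(f"P(rod fits) = {p_fit:.4f}")
```

The computed probability differs slightly from 0.4870 in the later decimal places because the table lookup rounds each z-score to two decimals first.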
Q. 4 Robertson Employment Service customarily gives standard intelligence and aptitude
tests to all people who seek employment through the firm. The firm has collected data for
several years and has found that the distribution of scores is not normal but is skewed to the
left with a mean of 86 and a standard deviation of 16. What is the probability that in a sample
of 75 applicants who take the test, the mean score will be less than 84 or greater than 90?
Ans: Although the distribution of scores is skewed to the left, the sample size of 75 is large
enough to use the Central Limit Theorem to approximate the sampling distribution of the
sample mean with a normal distribution.
Given:
X̄ = 84, 90
µ = 86, σ = 16, n = 75
First, we calculate the standard error (SE) of the mean using the formula:
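σ X̄ = σ / √n
= 16 / √75
= 1.8475
Now we find the z-scores for the two given values: 84 and 90.
Z = (X̄ - µ) / σ X̄
For X̄ = 84: Z = (84 - 86) / 1.8475 = -1.08
For X̄ = 90: Z = (90 - 86) / 1.8475 = 2.165 (2.16 is used for the table lookup)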
P (X̄ < 84) = P (z < -1.08)
= 0.5 - 0.3599
= 0.1401
P (X̄ > 90) = P (z > 2.16)
= 0.5 - 0.4846
= 0.0154
Therefore, P (X̄ < 84 or X̄ > 90) = 0.1401 + 0.0154 = 0.1555
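The two tail probabilities can be verified with the normal approximation that the Central Limit Theorem justifies for a sample of 75. This is a sketch only; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 86, 16, 75

# Standard error of the sample mean.
standard_error = sigma / sqrt(n)

# Approximate sampling distribution of the sample mean (CLT).
sample_mean_dist = NormalDist(mu=mu, sigma=standard_error)

p_below_84 = sample_mean_dist.cdf(84)
p_above_90 = 1 - sample_mean_dist.cdf(90)
p_total = p_below_84 + p_above_90   # the two events cannot both occur

print(f"SE = {standard_error:.4f}")
print(f"P(mean < 84) = {p_below_84:.4f}")
print(f"P(mean > 90) = {p_above_90:.4f}")
print(f"P(mean < 84 or mean > 90) = {p_total:.4f}")
```

The results agree with the table-based figures to about three decimal places; the small differences come from rounding the z-scores before the lookup.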
(b) If the pilots strike, what is the probability that the drivers will strike in sympathy?
Ans:
Let us suppose:
A = Pilots' strike
D = Drivers' strike
Given information (as used in the calculation below):
P (D) = 0.65, P (A) = 0.75, P (A | D) = 0.90
To find:
P (D | A)
P (A and D) = P (A | D) P (D)
= (0.90) (0.65)
= 0.585
P (D | A) = P (A and D) / P (A)
= 0.585 / 0.75
= 0.78
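The conditional-probability step is a direct application of the multiplication rule followed by the definition of conditional probability. A minimal sketch, using the three probabilities that appear in the calculation above:

```python
# Probabilities used in the worked answer.
p_a_given_d = 0.90   # P(A | D): pilots strike given that drivers strike
p_d = 0.65           # P(D): drivers strike
p_a = 0.75           # P(A): pilots strike

# Multiplication rule: P(A and D) = P(A | D) * P(D)
p_a_and_d = p_a_given_d * p_d

# Definition of conditional probability: P(D | A) = P(A and D) / P(A)
p_d_given_a = p_a_and_d / p_a

print(f"P(A and D) = {p_a_and_d:.3f}")
print(f"P(D | A)   = {p_d_given_a:.2f}")
```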