
Assignment# 1

MARIAM BUKHARI

ALLAMA IQBAL OPEN UNIVERSITY

3rd SEMESTER: AUTUMN 2023

COURSE CODE: 8409

COURSE NAME: Statistics for Management

INSTRUCTOR NAME: Mr. Khalid Abdul Ghafoor

DUE DATE: 15th February 2024

Total Marks: 100

Pass Marks: 50


Q.1 Fund info Corporation provides information to its subscribers to enable them to evaluate the performance of mutual funds they are considering as potential investment vehicles. A recent survey of funds whose stated investment goal was growth and income produced the following data on total annual rate of return over the past five years:

Class Boundaries: 11.0-11.9 12.0-12.9 13.0-13.9 14.0-14.9 15.0-15.9 16.0-16.9 17.0-17.9 18.0-18.9

Frequency: 2 2 8 10 11 8 3 1

a) Calculate the mean, variance, and standard deviation of the annual rate of return for this

sample of 45 funds.

b) According to Chebyshev's theorem, between what values should at least 75 percent of the

sample observations fall? What percentage of the observations do fall in that interval?

c) Because the distribution is bell-shaped, between what values would you expect to find 68

percent of the observations? What percentage of the observations actually do fall in that

interval?

Ans:

a) Calculating the mean, variance, and standard deviation for the grouped data given:
Class Boundaries Frequency
11.0-11.9 2
12.0-12.9 2
13.0-13.9 8
14.0-14.9 10
15.0-15.9 11
16.0-16.9 8
17.0-17.9 3
18.0-18.9 1

Where:

x = midpoint of each class interval

Midpoint (x) = (lower limit + upper limit) / 2

f = frequency

n = number of observations

C.B x (midpoint) f fX (f × x) x² f× x²
11.0-11.9 11.5 2 23.0 132.25 264.50
12.0-12.9 12.5 2 25.0 156.25 312.50
13.0-13.9 13.5 8 108.0 182.25 1458.00
14.0-14.9 14.5 10 145.0 210.25 2102.50
15.0-15.9 15.5 11 170.5 240.25 2642.75
16.0-16.9 16.5 8 132.0 272.25 2178.00
17.0-17.9 17.5 3 52.5 306.25 918.75
18.0-18.9 18.5 1 18.5 342.25 342.25
Σf= 45 Σfx= 674.5 Σ f× x²= 10219.25

 Mean = X̄ = Σfx / Σ f

= 674.5 / 45

= 14.9889
Mean (X̄) = 14.9889%

 Variance: s² = (Σf × x² − nX̄²) / (n − 1)

= (10219.25 − 45 × (14.9889)²) / 44 = 2.4828

Variance (s²) = 2.4828

 Standard Deviation (s) = √variance

Taking the square root of the variance:

= √2.4828 = 1.5757

Standard Deviation (s) = 1.5757%

b) According to Chebyshev's theorem, we can expect at least 75% of the observations to

fall in the interval:

X̄ ± 2s = 14.9889 ± 2 (1.5757)

= 14.9889 ± 3.1514

14.9889 + 3.1514 = 18.1403

14.9889 - 3.1514 = 11.8375

All the observations in the 2nd through 7th classes fall in that interval, and some of those in the first and last classes may also fall in it. Hence, at least 42/45, or 93.33%, of the observations fall between 11.8375 and 18.1403.


c) Because the distribution is bell-shaped, we can expect roughly 68% of the observations to fall in the interval:

X̄ ± s = 14.9889 ± 1.5757

14.9889 + 1.5757 = 16.5646

14.9889 - 1.5757 = 13.4132

All the observations in the 14.0-14.9 and 15.0-15.9 classes fall in that interval, and half of those in the 13.0-13.9 and 16.0-16.9 classes are also in it. Hence, about 29/45, or 64.44%, of the observations fall between 13.4132 and 16.5646.
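Both interval calculations (parts b and c) can be reproduced from the rounded mean and standard deviation found in part a:

```python
mean, s = 14.9889, 1.5757

# Chebyshev's theorem: at least 75% of observations lie within 2 standard deviations
cheb_lo, cheb_hi = mean - 2 * s, mean + 2 * s
# Empirical (bell-shaped) rule: about 68% lie within 1 standard deviation
emp_lo, emp_hi = mean - s, mean + s

print(round(cheb_lo, 4), round(cheb_hi, 4))  # -> 11.8375 18.1403
print(round(emp_lo, 4), round(emp_hi, 4))    # -> 13.4132 16.5646
```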


Q. 2 Discuss the following:

i. Descriptive Statistics

ii. Basic Statistics

iii. Confidence Level

iv. Hypothesis

Descriptive Statistics
Introduction
Descriptive statistics serve as the bedrock of statistical analysis, providing researchers, analysts,

and decision-makers with a comprehensive understanding of the inherent characteristics of

their datasets. By summarizing key features such as central tendency, variability, and

distribution, descriptive statistics offer valuable insights that underpin informed decision-making across various domains.

CENTRAL TENDENCY
At the core of descriptive statistics lies the concept of central tendency, which seeks to identify

the typical or central value around which the data revolve. The mean, median, and mode are

primary measures used to ascertain central tendency. The mean, or average, represents the

arithmetic sum of all observations divided by the total number of data points, providing a

balanced representation of the dataset. Meanwhile, the median identifies the middle value

when the data are arranged in ascending or descending order, offering robustness against

outliers. The mode highlights the most frequently occurring value in the dataset, particularly

useful for categorical data analysis.
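A minimal illustration using Python's standard library (the sample values are made up for demonstration):

```python
from statistics import mean, median, mode

returns = [11, 12, 13, 13, 14, 14, 14, 15, 18]  # hypothetical data

avg = mean(returns)     # arithmetic average: sum of values / count
mid = median(returns)   # middle value of the sorted data
top = mode(returns)     # most frequently occurring value

print(avg, mid, top)    # mid and top are both 14 here
```

Note how the outlier 18 pulls the mean above the median, while the median and mode stay put; that is the robustness against outliers mentioned above.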


VARIABILITY AND DISPERSION
Understanding the spread or dispersion of data is crucial for assessing the reliability and

consistency of observations. Variance and standard deviation are paramount in quantifying

variability. Variance measures the average squared deviation from the mean, providing a

comprehensive assessment of data dispersion. Standard deviation, the square root of variance,

offers a more interpretable metric, indicating the average deviation of data points from the

mean. Together, variance and standard deviation facilitate nuanced interpretations of data

consistency and reliability.
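A quick sketch with the standard library's sample-based functions (hypothetical data):

```python
from statistics import variance, stdev
from math import isclose, sqrt

data = [4, 8, 6, 5, 3, 7]  # hypothetical observations

s2 = variance(data)  # sample variance: sum of squared deviations / (n - 1)
s = stdev(data)      # sample standard deviation: square root of the variance

print(s2)                    # -> 3.5
print(isclose(s, sqrt(s2)))  # -> True
```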

RANGE AND INTERQUARTILE RANGE (IQR)

The range, defined as the difference between the maximum and minimum values in a dataset,

provides a simplistic yet informative measure of data spread. However, its susceptibility to

outliers underscores the importance of complementary measures such as the interquartile

range (IQR). The IQR, delineated by the range between the first and third quartiles, offers a

robust measure of data spread, mitigating the influence of outliers and providing insights into

the central 50% of observations.
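A short example (hypothetical data; `method="inclusive"` is one of several quartile conventions, so other tools may report slightly different quartiles):

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical, already sorted

q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1                        # spread of the central 50% of the data
data_range = max(data) - min(data)   # full spread, sensitive to outliers

print(q1, q3, iqr, data_range)  # -> 3.0 7.0 4.0 8
```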

SHAPE AND SYMMETRY

Descriptive statistics extend beyond mere measures of central tendency and dispersion,

encompassing assessments of data distribution and symmetry. Skewness and kurtosis serve as

pivotal indicators of distribution shape and symmetry. Skewness measures the degree of

asymmetry in the data distribution, with positive skewness indicating a right-skewed


distribution and negative skewness signifying a left-skewed distribution. Meanwhile, kurtosis

delineates the peakedness or flatness of the distribution, with higher values denoting sharper

peaks and lower values indicating flatter distributions.
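These shape measures can be estimated from the moment definitions; the sketch below uses plain Python with a made-up right-skewed sample (libraries such as SciPy provide ready-made `skew` and `kurtosis` functions):

```python
data = [2, 3, 3, 4, 4, 4, 5, 5, 9]  # hypothetical right-skewed sample
n = len(data)
mean = sum(data) / n

m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment

skewness = m3 / m2 ** 1.5           # > 0 here: the tail stretches to the right
excess_kurtosis = m4 / m2 ** 2 - 3  # > 0 here: sharper peak than a normal curve

print(skewness > 0, excess_kurtosis > 0)  # -> True True
```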

Conclusion:
In conclusion, descriptive statistics constitute an indispensable toolkit for comprehensively

summarizing and interpreting dataset characteristics. By illuminating central tendency,

variability, distribution, and symmetry, descriptive statistics empower stakeholders with the

analytical prowess necessary for data-driven decision-making. From exploratory data analysis to

hypothesis testing, a profound understanding of descriptive statistics catalyzes informed

insights and fosters a deeper appreciation of underlying data dynamics.

Basic Statistics
Introduction
Statistics plays a crucial role in understanding and interpreting data in various fields, from

scientific research to business analytics. For beginners, grasping the foundational concepts of

statistics is essential for building a solid understanding of data analysis. This note serves as an

introduction to basic statistics, aimed at beginners or those unfamiliar with statistical concepts.

UNDERSTANDING DATA

At the heart of statistics lies data, which can be qualitative (categorical) or quantitative

(numerical). Qualitative data consists of categories or labels, while quantitative data represents
measurements or quantities. Understanding the nature of data is fundamental for selecting

appropriate statistical techniques.

MEASURES OF CENTRAL TENDENCY


Measures of central tendency provide insights into the typical or central value of a dataset. The
mean, median, and mode are common measures used for this purpose. The mean is the
arithmetic average of all values, the median is the middle value when data are arranged in
ascending or descending order, and the mode is the value that appears most frequently.

MEASURES OF VARIABILITY
Variability measures the spread or dispersion of data points around the central tendency.

Variance and standard deviation are primary measures of variability. Variance quantifies the

average squared deviation from the mean, while standard deviation represents the average

deviation of data points from the mean.

PROBABILITY BASICS
Probability is the likelihood of an event occurring and forms the foundation of statistical

inference. It ranges from 0 (impossible event) to 1 (certain event). Basic probability concepts

include the addition rule, multiplication rule, and probability distributions.
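The two basic rules can be illustrated with simple arithmetic (the probabilities are hypothetical):

```python
# Addition rule for events that may overlap:
#   P(A or B) = P(A) + P(B) - P(A and B)
p_a, p_b, p_a_and_b = 0.5, 0.4, 0.2
p_a_or_b = p_a + p_b - p_a_and_b
print(round(p_a_or_b, 2))  # -> 0.7

# Multiplication rule for independent events:
#   P(A and B) = P(A) * P(B), e.g. two fair coin flips both landing heads
p_heads_twice = 0.5 * 0.5
print(p_heads_twice)       # -> 0.25
```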

GRAPHICAL REPRESENTATION
Graphical representation of data enhances understanding and visualization. Common types of

graphs include bar graphs, histograms, pie charts, and scatterplots. These graphical tools help

convey patterns, trends, and relationships within the data.

Conclusion
In conclusion, basic statistics provide a framework for understanding and analyzing data,
making informed decisions, and drawing meaningful conclusions. By mastering fundamental
concepts such as measures of central tendency, variability, probability, and graphical
representation, beginners can develop a solid foundation in statistics that serves as a
steppingstone for more advanced analyses. Continual practice and exploration of statistical
concepts are key to building proficiency and harnessing the power of statistics in various
domains.

Confidence level
Introduction
Confidence levels play a critical role in statistical analysis, providing a measure of certainty or
uncertainty associated with estimates of population parameters. This detailed note aims to
elucidate the concept of confidence levels, their interpretation, and their practical significance
in statistical inference.

DEFINITION OF CONFIDENCE LEVEL


A confidence level represents the probability that a calculated interval (such as a confidence
interval) contains the true value of a population parameter. Expressed as a percentage,
common confidence levels include 90%, 95%, and 99%. For instance, a 95% confidence level
implies that if the sampling process were repeated numerous times, approximately 95% of the
resulting confidence intervals would capture the true population parameter.

CALCULATION OF CONFIDENCE INTERVALS


Confidence intervals are constructed based on the sampling distribution of the estimator and
the desired confidence level. For example, when estimating the population mean, a confidence
interval is calculated using the sample mean, standard deviation, sample size, and a critical
value from the appropriate distribution (e.g., t-distribution for small sample sizes or z-distribution for large sample sizes).

INTERPRETATION OF CONFIDENCE LEVELS


Interpreting confidence levels involves understanding the level of certainty associated with the
estimate. A higher confidence level (e.g., 95% or 99%) indicates greater confidence in the
accuracy of the estimate but results in wider confidence intervals. Conversely, a lower
confidence level (e.g., 90%) yields narrower intervals but with reduced confidence in capturing
the true parameter value.
SIGNIFICANCE IN STATISTICAL INFERENCE
Confidence levels are integral to hypothesis testing and parameter estimation. In hypothesis
testing, researchers set a predetermined confidence level (e.g., 95%) to determine the
threshold for rejecting or failing to reject a null hypothesis. Similarly, in parameter estimation,
confidence intervals provide a range of plausible values for the population parameter,
facilitating informed decision-making and drawing reliable conclusions from data.

CONSIDERATIONS AND APPLICATIONS


The choice of confidence level depends on numerous factors, including the desired level of
precision, the risk of Type I and Type II errors, and the context of the study. While a 95%
confidence level is commonly used in many scientific studies, researchers may opt for higher or
lower confidence levels based on the specific requirements and constraints of the analysis.

Conclusion
Confidence levels serve as essential tools in statistical inference, offering insights into the
reliability and precision of estimates derived from sample data. By understanding the concept
of confidence levels and their implications, researchers can make informed decisions, draw
meaningful conclusions, and communicate the uncertainty associated with their findings
effectively.

References:

- Agresti, A., & Finlay, B. (2009). Statistical Methods for the Social Sciences (4th ed.). Pearson.

- Montgomery, D. C., & Runger, G. C. (2018). Applied Statistics and Probability for Engineers
(7th ed.). Wiley.

Hypothesis
Introduction
Hypothesis testing is a fundamental concept in statistics used to make inferences about
population parameters based on sample data. This detailed note aims to provide a
comprehensive understanding of hypothesis testing, including its definition, components,
procedures, and practical applications.
DEFINITION OF HYPOTHESIS
A hypothesis is a statement or proposition about a population parameter that is subject to
empirical testing. It typically consists of a null hypothesis (H0) and an alternative hypothesis
(H1). The null hypothesis represents the status quo or a statement of no effect, while the
alternative hypothesis proposes a specific effect, relationship, or difference.

COMPONENTS OF HYPOTHESIS TESTING


Hypothesis testing involves several key components:

- Null Hypothesis (H0): A statement that there is no significant difference, effect, or relationship
between variables.

- Alternative Hypothesis (H1): A statement that contradicts the null hypothesis, proposing a
specific difference, effect, or relationship.

- Test Statistic: A numerical value calculated from sample data used to assess the evidence
against the null hypothesis.

- Level of Significance (α): The predetermined threshold for rejecting the null hypothesis,
typically set at 0.05 or 0.01.

- P-Value: The probability of obtaining test results as extreme as or more extreme than the
observed data, assuming the null hypothesis is true.

- Decision Rule: Criteria for rejecting or failing to reject the null hypothesis based on the test statistic and the level of significance.

STEPS IN HYPOTHESIS TESTING


1. Formulate Hypotheses: Define the null and alternative hypotheses based on the research
question or problem.

2. Select Test Statistic: Choose an appropriate test statistic based on the study design and
assumptions.

3. Determine Level of Significance: Set the significance level (α) to determine the threshold for
rejecting the null hypothesis.

4. Collect Data: Collect sample data relevant to the research question or hypothesis.

5. Calculate Test Statistic: Compute the test statistic using the sample data and the chosen test
method.
6. Determine P-Value: Calculate the probability of obtaining the observed test results or more
extreme results under the null hypothesis.

7. Decide: Compare the P-value to the significance level (α) and decide whether to reject or fail
to reject the null hypothesis.

8. Draw Conclusion: Interpret the results in the context of the research question and
communicate the findings accordingly.
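The steps above can be sketched as a one-sample z-test (illustrative numbers; a known population standard deviation is assumed so the z-distribution applies):

```python
from statistics import NormalDist
from math import sqrt

# Step 1: H0: mu = 86 vs H1: mu != 86 (two-sided)
mu0, sigma = 86, 16  # hypothesized mean and known population sigma
alpha = 0.05         # Step 3: level of significance
n, x_bar = 75, 84    # Step 4: sample summary (illustrative)

# Steps 5-6: test statistic and two-sided p-value
z = (x_bar - mu0) / (sigma / sqrt(n))
p_value = 2 * NormalDist().cdf(-abs(z))

# Step 7: compare the p-value to alpha
reject_h0 = p_value < alpha

print(round(z, 2), round(p_value, 3), reject_h0)  # -> -1.08 0.279 False
```

Step 8 here would read: at the 5% level, the sample does not provide enough evidence to conclude that the mean differs from 86.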

PRACTICAL APPLICATIONS
Hypothesis testing is widely used in various fields, including science, medicine, social sciences,
business, and engineering, to:

- Evaluate the effectiveness of interventions or treatments.

- Investigate relationships between variables.

- Compare groups or populations.

- Validate theories or hypotheses.

- Make data-driven decisions and recommendations.

Conclusion
Hypothesis testing is a powerful statistical tool for making evidence-based inferences and
drawing conclusions from data. By understanding its components, procedures, and practical
applications, researchers can conduct rigorous analyses, test hypotheses, and contribute to the
advancement of knowledge in their respective fields.

References:

- Field, A. (2013). Discovering Statistics Using IBM SPSS Statistics. Sage Publications.

- Rosenthal, R., & Rosnow, R. L. (2008). Essentials of Behavioral Research: Methods and Data
Analysis (3rd ed.). McGraw-Hill.
Q. 3 The Gilbert Machinery Company has received a big order to produce electric motors for a manufacturing company. To fit in its bearing, the drive shaft of the motor must have a diameter of 5.1 ± 0.05 inches. The company's purchasing agent realizes that there is a large stock of steel rods in inventory with a mean diameter of 5.07 inches and a standard deviation of 0.07 inches. What is the probability of a steel rod from inventory fitting the bearing?

Ans: Let us denote:

P = Probability

X = The diameter of the steel rod.

µ = 5.07 inches as the mean diameter of the steel rods in inventory.

σ = 0.07 inches as the standard deviation of the diameter.

Z = Number of standard deviations from “x” to the “mean” of this distribution.

To find P (5.05 ≤ X ≤ 5.15), we need to standardize the values using the formula:

Z = (x - µ) / σ

 For X = 5.05:

Z1 = (5.05 - 5.07) / 0.07 = -0.29

 For X = 5.15:

Z2 = (5.15 - 5.07) / 0.07 = 1.14

Now,

P (5.05 ≤ X ≤ 5.15)

= P (-0.29 ≤ Z ≤ 1.14)

= 0.1141 + 0.3729

= 0.4870.

Therefore, the probability of a steel rod from inventory fitting the bearing is 0.4870
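The same probability can be computed without a z-table using `statistics.NormalDist`; the small difference from 0.4870 comes from rounding the z-scores to two decimal places in the table-based solution:

```python
from statistics import NormalDist

rod = NormalDist(mu=5.07, sigma=0.07)  # diameter distribution of stock rods
p_fit = rod.cdf(5.15) - rod.cdf(5.05)  # P(5.05 <= X <= 5.15)

print(round(p_fit, 4))  # about 0.486
```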
Q. 4 Robertson Employment Service customarily gives standard intelligence and aptitude
tests to all people who seek employment through the firm. The firm has collected data for
several years and has found that the distribution of scores is not normal but is skewed to the
left with a mean of 86 and a standard deviation of 16. What is the probability that in a sample
of 75 applicants who take the test, the mean score will be less than 84 or greater than 90?

Ans: Given that the distribution of scores is skewed to the left, the sample size of 75 is large
enough to use the Central Limit Theorem to approximate the sampling distribution of the
sample mean.

Given:

X̄ = 84, 90

µ = 86, σ = 16, n = 75

First, we calculate the standard error (SE) of the mean using the formula:

σX̄ = σ / √n = 16 / √75 = 1.848

Standard error of the mean = 1.848

Now we find the z-scores for the two given values: 84 and 90.

Standardizing the sample mean:

Z = (X̄ - µ) / σX̄

P (X̄ < 84) = P ((X̄ - µ) / σX̄ < (84 - 86) / 1.848)

= P (z < -1.08)

= 0.5 - 0.3599

= 0.1401

P (X̄ > 90) = P ((X̄ - µ) / σX̄ > (90 - 86) / 1.848)

= P (z > 2.16)

= 0.5 - 0.4846

= 0.0154

Thus, P (X̄ < 84 or X̄ > 90) = 0.1401 + 0.0154 = 0.1555
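The calculation can be verified with `statistics.NormalDist`; the small difference from 0.1555 reflects the two-decimal z-score rounding in the table-based solution:

```python
from statistics import NormalDist
from math import sqrt

mu, sigma, n = 86, 16, 75
se = sigma / sqrt(n)                      # standard error, about 1.848
x_bar_dist = NormalDist(mu=mu, sigma=se)  # CLT: X̄ is approximately normal

# P(X̄ < 84 or X̄ > 90) = left tail + right tail
p = x_bar_dist.cdf(84) + (1 - x_bar_dist.cdf(90))

print(round(p, 4))  # about 0.155
```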


Q. 5 The southeast regional manager of General Express, a private parcel-delivery firm, is worried about the likelihood of strikes by some of his employees. He has learned that the probability of a strike by his pilots is 0.75 and the probability of a strike by his drivers is 0.65. Further, he knows that if the drivers strike, there is a 90 percent chance that the pilots will strike in sympathy.

(a) What is the probability of both groups striking?

(b) If the pilots strike, what is the probability that the drivers will strike in sympathy?

Ans:

Given information:

 The probability of a strike by the pilots = 0.75

 The probability of a strike by the drivers = 0.65

 The chance that the pilots strike in sympathy, given that the drivers strike = 90% = 0.90

To find:

 The probability of both groups striking.

 The probability of the drivers striking in sympathy if the pilots strike.

Let us suppose:

A = Pilot’s Strike

D = Driver’s Strike

a) The probability of both groups striking:

P (A and D) = P (A | D) P (D)

= (0.90) (0.65)

= 0.585

b) The probability of the drivers striking in sympathy if the pilots strike:

P (D | A) = P (A and D) / P (A)

= 0.585 / 0.75

= 0.78
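The two conditional-probability steps reduce to a few lines:

```python
p_pilots = 0.75                # P(A): pilots strike
p_drivers = 0.65               # P(D): drivers strike
p_pilots_given_drivers = 0.90  # P(A | D): pilots strike in sympathy

# a) P(A and D) = P(A | D) * P(D)
p_both = p_pilots_given_drivers * p_drivers

# b) P(D | A) = P(A and D) / P(A)
p_drivers_given_pilots = p_both / p_pilots

print(round(p_both, 3), round(p_drivers_given_pilots, 2))  # -> 0.585 0.78
```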
