Inferential Statistics: Rashid Msba


Inferential Statistics

Rashid
MSBA
Introduction
• The word statistics conveys a variety of
meanings to people in different walks of life.
• The word statistics comes from the Italian
word statista (statesman).
• It is also related to the German word Statistik (the study of the political state).
• The word Statistics today refers to either
quantitative information or a method of
dealing with quantitative or qualitative
information.
Definition
“Statistics is defined as the collection, presentation,
analysis and interpretation of numerical data.”

“Statistics is the science and art of dealing with
figures and facts.”

“Statistics is the science of conducting studies to
collect, organize, summarize, analyze and draw
conclusions from data.”
USE & APPLICATION OF STATISTICS
• It facilitates comparisons
• It simplifies the message of figures
• It helps in formulating and testing hypotheses
• It helps in prediction
Statistics in Business
• Accounting — auditing and cost estimation
• Economics — regional, national, and international
economic performance
• Finance — investments and portfolio management
• Management — human resources, compensation,
and quality management
• Management Information Systems — performance of
systems which gather, summarize, and disseminate
information to various managerial levels
• Marketing — market analysis and consumer research
• International Business — market and demographic
analysis
Statistics in Business
• Data consists of information coming from
observations, counts, measurements, or responses
• Population — the whole
– a collection of persons, objects, or items under
study
• Sample — a portion of the whole
– a subset of the population
• A parameter is a numerical description of a population
characteristic
• A statistic is a numerical description of a sample
characteristic
Statistics in Business
• Example:
• Decide whether the numerical value describes a
population parameter or a sample statistic.
• A recent survey of a sample of 450 college
students reported that the average weekly
income for students is $325.
– Because the average of $325 is based on a sample,
this is a sample statistic.
• The average weekly income for all students is
$405.
– Because the average of $405 is based on a
population, this is a population parameter.
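The distinction between a parameter and a statistic can be sketched in a few lines of Python. The population below is simulated, and its size, mean and spread are assumptions chosen only for illustration; they are not data from the survey in the example.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: weekly incomes (in dollars) of 10,000 students.
population = [random.gauss(405, 60) for _ in range(10_000)]

# Parameter: a number computed from every member of the population.
population_mean = statistics.mean(population)

# Statistic: the same number computed from a random sample of 450 students.
sample = random.sample(population, 450)
sample_mean = statistics.mean(sample)

print(f"population mean (parameter): {population_mean:.2f}")
print(f"sample mean (statistic):     {sample_mean:.2f}")
```

The two printed values are close but not identical, which is why a sample statistic is treated as an estimate of the parameter rather than as the parameter itself.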
SCALE OF MEASUREMENT
• Measurement is the process of assigning numbers or labels to objects, persons,
states, or events in accordance with specific rules to represent quantities or qualities
of attributes.

• We do not measure specific objects, persons, etc.; we measure the attributes or features
that define them.

1. Nominal

2. Ordinal

3. Interval

4. Ratio

The goal is to be able to identify the type of measurement scale, and to understand
proper use and interpretation of the scale.
SCALE OF MEASUREMENT
• Nominal - Categorical variables with no inherent order or ranking
sequence, such as names or classes (e.g., gender). Values may be written as
numerals, but they carry no numerical meaning (e.g., I, II, III). The only
operation that can be applied to nominal variables is enumeration (counting
the members of each category).
• Ordinal - Variables with an inherent rank or order, e.g. mild,
moderate, severe. Values can be compared for equality, or as greater or less,
but not by how much greater or less.
• Interval - Values of the variable are ordered as in Ordinal, and
additionally, differences between values are meaningful; however,
the zero point of the scale is arbitrary. Calendar dates and
temperatures on the Fahrenheit scale are examples. Addition
and subtraction, but not multiplication and division, are meaningful
operations.
• Ratio - Variables with all properties of Interval plus an absolute,
non-arbitrary zero point, e.g. age, weight, temperature (Kelvin).
Addition, subtraction, multiplication, and division are all meaningful
operations.
Nominal
• There must be distinct classes but these classes have no
quantitative properties. Therefore, no comparison can be
made in terms of one category being higher than the other.
• Example: Gender
1. Male
2. Female

• Example: Geographic location
1. Punjab
2. Sindh
3. KPK
4. Baluchistan
Ordinal Level Data
• There are distinct classes but these classes have a
natural ordering or ranking. The classes can
be ordered on the basis of magnitude (greatness,
scale, size etc.).
For example - the final position of horses in a
thoroughbred race is an ordinal variable. The
horses finish first, second, third, fourth, and so
on. The difference between first and second is not
necessarily equivalent to the difference between
second and third, or between third and fourth.
Interval Level Data
• It is possible to compare differences in magnitude, but importantly the zero
point does not have a natural meaning. An interval scale also has the properties of
nominal and ordinal scales, and it is used by most psychological tests.
• The same difference exists between 10 °C (50 °F) and 20 °C (68 °F)
as between 25 °C (77 °F) and 35 °C (95 °F).
• But we cannot say that 20 °C is twice as hot as 10 °C.
Example - Celsius temperature is an interval variable. It is meaningful
to say that 25 degrees Celsius is 3 degrees hotter than 22 degrees Celsius, and
that 17 degrees Celsius is the same amount hotter (3 degrees) than 14 degrees
Celsius. Notice, however, that 0 degrees Celsius does not have a natural
meaning. That is, 0 degrees Celsius does not mean the absence of heat!
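The temperature argument above can be checked numerically. The short Python sketch below uses two helper functions, c_to_f and c_to_k, which are names introduced here for illustration. It shows that equal gaps stay equal when the unit changes, while the "twice as hot" ratio does not survive.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

def c_to_k(celsius):
    """Convert degrees Celsius to kelvins (a ratio scale with a true zero)."""
    return celsius + 273.15

# Differences are meaningful on an interval scale:
# equal gaps in Celsius are still equal gaps in Fahrenheit.
gap_low_f = c_to_f(20) - c_to_f(10)    # 68 - 50
gap_high_f = c_to_f(35) - c_to_f(25)   # 95 - 77

# Ratios are not meaningful: the "2x" depends on where zero was placed.
ratio_c = 20 / 10
ratio_f = c_to_f(20) / c_to_f(10)
ratio_k = c_to_k(20) / c_to_k(10)

print(gap_low_f, gap_high_f)
print(ratio_c, round(ratio_f, 2), round(ratio_k, 3))
```

Only the Kelvin ratio describes a physical relationship, because the Kelvin scale starts at an absolute zero.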
Ratio Level Data
• It is the highest level of measurement
• This level has all three attributes:
– Magnitude
– Equal intervals
– Absolute zero point
• It represents continuous values
• Example: biophysical parameters such as weight, height, volume, blood pressure
– 30 kg is three times 10 kg
– 20 cm is twice 10 cm
– 8 hours is four times 2 hours
Descriptive statistics
• Descriptive statistics:
– “are procedures used to
summarize, organize, and make
sense of a set of scores or
observations.”
OR
• Descriptive statistics consists of the
collection, organization, summarization and
presentation of data.
• In descriptive statistics the statistician tries
to describe a situation.
Descriptive statistics
• Measures to condense data
• Measures of central tendency
• Measures of dispersion
• Measures of relationship
Descriptive statistics
• Measures to condense data
1. Table
2. Graphs and diagrams
3. Percentages

• Table
– Frequency distribution table
• The data may be qualitative or quantitative
– Contingency table
– Multiple Response table
– Miscellaneous table
Descriptive statistics
• Graphs and diagrams
Presentation of quantitative, continuous or measured
data is through graphs. The common graphs in use are
Histogram, Frequency polygon, Frequency curve, Line
chart or graph.
The common diagrams in use are Bar diagram, Pie
diagram, Pictogram diagram, Map diagram or spot map.

• Percentages
Frequency and percentage distribution through
tabulation and graphic presentation.
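A frequency and percentage distribution can be built in a few lines. The sketch below uses Python's collections.Counter; the ten survey responses are invented for illustration.

```python
from collections import Counter

# Hypothetical responses: province of residence for 10 survey respondents.
responses = ["Punjab", "Sindh", "Punjab", "KPK", "Punjab",
             "Sindh", "Baluchistan", "Punjab", "KPK", "Sindh"]

frequency = Counter(responses)      # category -> count
total = sum(frequency.values())     # number of respondents

# One row per category: (name, frequency, percentage of the total).
table = [(category, count, 100 * count / total)
         for category, count in frequency.most_common()]

print(f"{'Category':<12}{'Frequency':>10}{'Percent':>10}")
for category, count, percent in table:
    print(f"{category:<12}{count:>10}{percent:>9.1f}%")
```

The same counts could then feed a bar diagram or pie diagram for a nominal variable such as this one.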
Descriptive statistics
– Measures of central tendency
(Grouped & Ungrouped Data)
1. Arithmetic mean
2. Median
3. Mode
4. Geometric mean
5. Harmonic mean
– Measures of dispersion
1. Range
2. Variance
3. Standard deviation
4. Quartiles
– Measures of relationship
Correlation coefficient & Regression
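For ungrouped data, most of the measures listed above are available in Python's standard statistics module. The sketch below is a minimal illustration on invented scores. It uses the sample (n - 1) versions of variance and standard deviation, and it leaves out the measures of relationship.

```python
import statistics

scores = [2, 4, 4, 5, 7, 8, 12]  # hypothetical ungrouped data

# Measures of central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)
geometric = statistics.geometric_mean(scores)
harmonic = statistics.harmonic_mean(scores)

# Measures of dispersion
data_range = max(scores) - min(scores)
variance = statistics.variance(scores)        # sample variance (divides by n - 1)
std_dev = statistics.stdev(scores)            # sample standard deviation
quartiles = statistics.quantiles(scores, n=4) # three cut points: Q1, Q2, Q3

print("mean:", mean, "median:", median, "mode:", mode)
print("geometric mean:", round(geometric, 3), "harmonic mean:", round(harmonic, 3))
print("range:", data_range, "variance:", variance, "std dev:", round(std_dev, 3))
print("quartiles:", quartiles)
```

For positive data the harmonic mean never exceeds the geometric mean, which never exceeds the arithmetic mean, so comparing the three is a quick sanity check.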
Inferential statistics:
– “are procedures used that allow researchers to infer
or generalize observations made with samples to the
larger population from which they were selected.”
OR
• Inferential statistics consists of generalizing from
samples to populations, performing estimations and
hypothesis tests, determining relationships among
variables, and making predictions.
• In inferential statistics, the statistician tries to make
inferences from sample to population.
Inferential statistics (cont…)
• Inferential statistics help managers draw conclusions based
on limited data. When predicting the future, we don't have
a magic crystal ball, but we do have statistical strategies,
such as sampling, probability, and models.
• Marketing departments often use inferential statistics. A
company might issue a survey and ask questions about
their products. However, it's impossible to survey every
individual customer. The marketing department will
determine the appropriate sample size, or the number of
people to ask. Based on the results, statisticians can infer
the responses are representative of the larger group of
customers. Finance departments use statistical modeling for
predicting budgets and capital expenditures.
Inferential statistics (cont…)
• Descriptive statistics describes data (for example, a
chart or graph) and inferential statistics allows you to
make predictions (“inferences”) from that data. With
inferential statistics, you take data from samples and
make generalizations about a population.
• For example, you might stand in a mall and ask a
sample of 100 people if they like shopping at Sears. You
could make a bar chart of yes or no answers (that
would be descriptive statistics) or you could use your
research (and inferential statistics) to reason that
around 75-80% of the population (all shoppers in all
malls) like shopping at Sears.
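To see why the conclusion is stated as a range rather than a single number, one can simulate many mall surveys. In the Python sketch below the true share of 78% is an assumption made for the simulation (the example does not state it), and every shopper is assumed to be chosen at random.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

TRUE_SHARE = 0.78   # hypothetical share of all shoppers who like the store
SAMPLE_SIZE = 100   # people asked in each survey, as in the example
SURVEYS = 1000      # number of simulated surveys

def yes_count(n):
    """Count simulated 'yes' answers among n randomly chosen shoppers."""
    return sum(random.random() < TRUE_SHARE for _ in range(n))

counts = [yes_count(SAMPLE_SIZE) for _ in range(SURVEYS)]
shares = [c / SAMPLE_SIZE for c in counts]

average_share = sum(shares) / len(shares)
# Fraction of surveys landing within 5 percentage points of the true share.
close = sum(73 <= c <= 83 for c in counts) / SURVEYS

print(f"lowest and highest survey result: {min(shares):.2f}, {max(shares):.2f}")
print(f"average over all surveys:         {average_share:.3f}")
print(f"share of surveys within 5 points: {close:.2f}")
```

Individual surveys of 100 people scatter around the true value, so a single survey supports a statement about a plausible range, not an exact percentage.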
Inferential statistics (cont…)
• There are two main areas of inferential statistics:
1. Estimating parameters. This means taking
a statistic from your sample data (for example
the sample mean) and using it to say something
about a population parameter (i.e. the population
mean).
2. Hypothesis tests. This is where you can use
sample data to answer research questions.
For example, you might be interested in knowing if
a new cancer drug is effective. Or if breakfast helps
children perform better in schools.
Inferential statistics (cont…)
• Let’s say you have some sample data about a
potential new cancer drug. You could use
descriptive statistics to describe your sample,
including:
• Sample mean
• Sample standard deviation
• Making a bar chart or boxplot
• Describing the shape of the sample probability
distribution
Inferential statistics (cont…)
• With inferential statistics you take that sample
data from a small number of people and try to
determine if the data can predict whether the
drug will work for everyone (i.e. the
population). There are various ways you can
do this, from calculating a z-score (z-scores are
a way to show where your data would lie in
a normal distribution) to post-hoc (advanced)
testing.
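A z-score is the distance of a value from the mean, measured in standard deviations. The Python sketch below computes it for invented measurements and then reads off the corresponding position in a standard normal distribution. Treating the data as roughly normal is an assumption of this sketch.

```python
import statistics
from statistics import NormalDist

sample = [4.1, 5.0, 5.6, 6.2, 6.8, 7.3, 9.0]  # hypothetical measurements

mean = statistics.mean(sample)
sd = statistics.stdev(sample)  # sample standard deviation

# z-score of every observation: (value - mean) / standard deviation
z_scores = [(x - mean) / sd for x in sample]

# Position of the largest observation in a standard normal distribution.
z_max = z_scores[-1]
share_below = NormalDist().cdf(z_max)

print(f"z-score of 9.0: {z_max:.2f}")
print(f"share of a normal distribution below it: {share_below:.3f}")
```

By construction the z-scores of a data set have mean 0 and standard deviation 1, which makes values from different scales comparable.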
Inferential statistics (cont…)
• Inferential statistics use statistical models to
help you compare your sample data to other
samples or to previous research. Most
research uses statistical models from the family called the
Generalized Linear Model, which
includes Student’s t-tests, ANOVA (Analysis of
Variance), regression analysis and various
other models that describe straight-line
(“linear”) relationships.
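As one concrete instance of the tests named above, the Python sketch below computes a one-sample Student's t statistic by hand. The ten measurements and the hypothesized mean of 50 are invented, and the critical value 2.262 is the standard two-sided 5% table value for 9 degrees of freedom.

```python
import math
import statistics

sample = [52, 55, 49, 58, 53, 51, 56, 54, 50, 57]  # hypothetical measurements
hypothesized_mean = 50                              # value claimed for the population

n = len(sample)
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)        # sample standard deviation (n - 1)
standard_error = sample_sd / math.sqrt(n)   # estimated spread of the sample mean

# t statistic: how many standard errors the sample mean lies from the claim.
t_stat = (sample_mean - hypothesized_mean) / standard_error

# Two-sided critical value at the 5% level for n - 1 = 9 degrees of freedom.
T_CRITICAL = 2.262
reject = abs(t_stat) > T_CRITICAL

print(f"t = {t_stat:.3f}, reject the hypothesized mean: {reject}")
```

A statistics package would also report a p-value; the comparison with a table value is used here only to keep the sketch within the standard library.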
Key Notes
• A population is any specific collection of objects of interest. A sample is any subset or sub collection of the population,
including the case that the sample consists of the whole population, in which case it is termed a census.

• A measurement is a number or attribute computed for each member of a population or of a sample. The
measurements of sample elements are collectively called the sample data.

• A parameter is a number that summarizes some aspect of the population as a whole.

• A statistic is a number computed from the sample data.

• Statistics is a collection of methods for collecting, displaying, analyzing, and drawing conclusions from data.

• Descriptive statistics is the branch of statistics that involves organizing, displaying, and describing data.

• Inferential statistics is the branch of statistics that involves drawing conclusions
about a population based on information contained in a sample taken from that
population.

• Qualitative data are measurements for which there is no natural numerical scale, but which consist of attributes, labels,
or other non-numerical characteristics.

• Quantitative data are numerical measurements that arise from a natural numerical scale.
Homework
• 1. Explain what is meant by the term population.
2. Explain what is meant by the term sample.
3. Explain how a sample differs from a population.
4. Explain what is meant by the term sample data.
5. Explain what a parameter is.
6. Explain what a statistic is.

• 7. Give an example of a population and two different characteristics that may be
of interest.
8. Describe the difference between descriptive statistics and inferential statistics.
Illustrate with an example.

• 9. Identify each of the following data sets as either a population or a sample:
a. The grade point averages (GPAs) of all students at a college.
b. The GPAs of a randomly selected group of students on a college campus.
c. The ages of the nine Supreme Court Justices of the United States on
January 1, 1842.
d. The gender of every second customer who enters a movie theater.
e. The lengths of Atlantic croakers caught on a fishing trip to the beach.

• 10. Identify the following measures as either quantitative or qualitative:
a. The 30 high-temperature readings of the last 30 days.
b. The scores of 40 students on an English test.
c. The blood types of 120 teachers in a middle school.
d. The last four digits of social security numbers of all students in a class.
e. The numbers on the jerseys of 53 football players on a team.
• 11. Identify the following measures as either quantitative or qualitative:
a. The genders of the first 40 newborns in a hospital one year.
b. The natural hair color of 20 randomly selected fashion models.
c. The ages of 20 randomly selected fashion models.
d. The fuel economy in miles per gallon of 20 new cars purchased last month.
e. The political affiliation of 500 randomly selected voters.

• 12. A researcher wishes to estimate the average amount spent per person by visitors to a theme park. He
takes a random sample of forty visitors and obtains an average of $28 per person.
a. What is the population of interest?
b. What is the parameter of interest?
c. Based on this sample, do we know the average amount spent per person by
visitors to the park? Explain fully.

• 13. A researcher wishes to estimate the average weight of newborns in South America in the last five years.
He takes a random sample of 235 newborns and obtains an average of 3.27 kilograms.
a. What is the population of interest?
b. What is the parameter of interest?
c. Based on this sample, do we know the average weight of newborns in
South America? Explain fully.

• 14. A sociologist wishes to estimate the proportion of all adults in a certain region who have never married.
In a random sample of 1,320 adults, 145 have never married, hence 145∕1320 ≈ .11 or about 11% have never
married.
a. What is the population of interest?
b. What is the parameter of interest?
c. What is the statistic involved?
d. Based on this sample, do we know the proportion of all adults who have
never married? Explain fully.
Answers
• 1. A population is the total collection of objects that are of interest in a statistical
study.
3. A sample, being a subset, is typically smaller than the population. In a
statistical study, all elements of a sample are available for observation, which
is not typically the case for a population.
5. A parameter is a value describing a characteristic of a population. In a
statistical study the value of a parameter is typically unknown.
7. All currently registered students at a particular college form a population. Two
population characteristics of interest could be the average GPA and the
proportion of students over 23 years.
9. a. Population.
b. Sample.
c. Population.
d. Sample.
e. Sample.
11. a. Qualitative.
b. Qualitative.
c. Quantitative.
d. Quantitative.
e. Qualitative.
13. a. All newborn babies in South America in the last five years.
b. The average birth weight of all newborn babies in South America in the
last five years.
c. No, not exactly, but we know the approximate value of the average.
14. a. All adults in the region.
b. The proportion of the adults in the region who have never married.
c. The proportion computed from the sample, 0.11.
d. No, not exactly, but we know the approximate value of the proportion.
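
The phrase "approximate value" in answer 14 can be made more precise with an interval estimate. The Python sketch below applies the usual normal-approximation confidence interval to the figures from exercise 14. The 95% level and its multiplier 1.96 are choices made here, not part of the exercise.

```python
import math

n = 1320              # adults in the random sample (exercise 14)
never_married = 145   # adults in the sample who have never married

# Statistic: the sample proportion.
p_hat = never_married / n

# Normal-approximation confidence interval for the population proportion.
standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
Z_95 = 1.96  # multiplier for an approximate 95% confidence level
lower = p_hat - Z_95 * standard_error
upper = p_hat + Z_95 * standard_error

print(f"sample proportion: {p_hat:.3f}")
print(f"approximate 95% interval: {lower:.3f} to {upper:.3f}")
```

The interval expresses the same idea as answer 14d: the sample does not reveal the exact population proportion, but it narrows it to a small range around 11%.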
