Chapter 1 Sta404

This document provides an introduction to statistics, covering key concepts such as: 1) Statistics involves collecting, organizing, analyzing, and interpreting quantitative data to draw conclusions under uncertainty. 2) There is a difference between populations, samples, parameters, and statistics. Parameters describe populations while statistics describe samples. 3) Descriptive statistics simply describes or summarizes data, while inferential statistics allows generalizing beyond the sample to the population. 4) Variables can be quantitative (discrete or continuous) or qualitative, and data can come from primary or secondary sources. Levels of measurement include nominal, ordinal, interval, and ratio.

STA 404 : STATISTICS FOR BUSINESS AND SOCIAL SCIENCES CHAPTER 1

CHAPTER 1: INTRODUCTION TO STATISTICS

1.1 WHAT IS STATISTICS?

Numbers can’t “talk” but they can tell you as much as your human
sources can.
But just like human sources, you have to ask them!

Source: www.robertniles.com/stats/

1. Statistics is a field of study that deals with methods to
a) collect,
b) organize,
c) present,
d) analyze, and
e) interpret data.

2. It provides ways to make intelligent judgments and informed decisions in the
presence of uncertainty and variation.

Example 1: How can statistical techniques be used to gather information and
draw conclusions?

Suppose that a materials engineer has developed a coating for
retarding corrosion in metal pipe under specified circumstances.
If this coating is applied to different segments of pipe, variation in
environmental conditions and in the segments themselves will
result in more substantial corrosion on some segments than on
others.

Methods of statistical analysis could be used on data from such
an experiment to decide whether the average amount of
corrosion exceeds an upper specification limit of some sort or to
predict how much corrosion will occur on a single piece of pipe.

LECTURER: U. H LAU

1.2 BASIC CONCEPTS: POPULATION AND SAMPLE

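[Figure: a sample drawn from a population. A parameter describes the
population (e.g. mean μ, variance σ²); a statistic describes the sample
(e.g. mean x̄, variance s²). The figure also shows a proportion symbol p.]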

1. A population is the complete collection of measurements, objects, or
individuals under study.
2. A sample is a portion or subset taken from a population.
3. A parameter is a number that describes a population characteristic.
4. A statistic is a number that describes a sample characteristic.

Example 2: You are about to carry out a survey on the income of the
families in Malaysia. In this case:

Population: All the families in Malaysia
Sample: Families in Sarawak only
Parameter: Average income for all the families in Malaysia
Statistic: Average income for the families in Sarawak only
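
The difference between a parameter and a statistic can be sketched in Python. This is only an illustration: the income figures are simulated, not real survey data, and the sample here is drawn at random rather than from one state as in Example 2. The names `population` and `sample` are hypothetical.

```python
import random
import statistics

random.seed(404)  # fixed seed so the sketch is repeatable

# Hypothetical population: simulated monthly incomes (RM) of 10,000 families.
population = [round(random.lognormvariate(8.5, 0.5), 2) for _ in range(10_000)]

# A sample is a subset of the population; here, 200 families chosen at random.
sample = random.sample(population, 200)

parameter = statistics.mean(population)  # describes the whole population
statistic = statistics.mean(sample)      # describes the sample only

print(f"Parameter (population mean): {parameter:.2f}")
print(f"Statistic (sample mean):     {statistic:.2f}")
```

The two printed values are usually close but not equal, which is why a statistic is treated as an estimate of the parameter.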

1.3 TYPES OF STATISTICS

1. Statistics comprises two important areas, descriptive statistics and
inferential statistics. The two areas have different objectives in analysis.

Applied Statistics

Descriptive Statistics Inferential Statistics

2. Descriptive statistics deals with the methods employed when we only
intend to describe or summarize our data at surface level. It involves
calculating various descriptive measures such as the mean, variance and
standard deviation from a sample. The results interpreted from these
descriptive measures are applicable only to the sample from which the
measures are calculated.

3. Inferential statistics serves as an analysis technique for making
generalizations that lie outside the scope of descriptive statistics.
Hypothesis testing and estimation procedures are usually employed here
to verify whether the measures obtained from the sample really reflect the
actual population characteristics. In short, inferential statistics makes
inferences about a larger population based on sample characteristics.
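
As a rough sketch of the two areas, the Python snippet below first summarizes a small sample (descriptive) and then uses it to estimate a population mean (inferential). The data are made up, and the 95% interval uses the normal critical value 1.96, which is an assumption made for simplicity; a t critical value would suit a sample this small better.

```python
import math
import statistics

# Hypothetical sample: number of cars owned by 10 surveyed households.
cars = [1, 2, 0, 1, 3, 2, 1, 1, 2, 2]

# Descriptive statistics: summarize the sample itself.
mean = statistics.mean(cars)          # sample mean
variance = statistics.variance(cars)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(cars)      # sample standard deviation

# Inferential statistics: estimate the population mean from the sample.
# Approximate 95% interval, assuming the normal critical value 1.96.
std_error = std_dev / math.sqrt(len(cars))
lower = mean - 1.96 * std_error
upper = mean + 1.96 * std_error

print(f"mean={mean}, variance={variance:.4f}, std_dev={std_dev:.4f}")
print(f"approximate 95% interval for the population mean: "
      f"({lower:.2f}, {upper:.2f})")
```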

Example 3:

For each of the situations below, decide whether the indicated study is
descriptive or inferential. Give a reason for each of your answers.
a) A faculty determines the percentage of all its staff who hold a PhD
qualification.
b) An officer estimates the average number of cars owned by each
household for all residents in Kuching City.
c) A sports news writer lists the winning times for all the track sprinting events
in the 2008 Beijing Olympics.
d) A scientist estimates the percentage of sunflower seeds that will
germinate when exposed to temperatures below 0°C.

1.4 TYPES OF VARIABLE

Variable

Quantitative Qualitative
(numerical) (categorical)

Discrete Continuous

1. A variable is a characteristic under study that takes different values.

2. Quantitative variable (numerical)

- is a variable that can be measured numerically.
Examples: number of children, monthly income

- may be classified as either a discrete variable or a continuous variable.

(a) Discrete variable
- has a limited number of possible values, which are obtained by counting.
- can assume only certain values, with no intermediate values.
Example: number of cars sold

(b) Continuous variable
- can assume any numerical value over a certain interval or intervals.
Example: time taken to finish a test

3. Qualitative variable (categorical)

- is a variable that cannot be measured numerically but can be divided into
different categories.
Examples: Gender - measured as ‘male’ or ‘female’.
Opinion - measured as ‘strongly disagree’, ‘disagree’,
‘neutral’, ‘agree’ or ‘strongly agree’.

1.5 TYPES OF DATA

1. Data is the list of measurements (or information) on the variable of
interest, gathered from a sample (or population). Data can be categorized into
two types based on its source.

2. Primary data is first-hand data gathered by the researchers themselves from
primary sources.

Example: Data obtained by researchers who interview respondents at a
shopping complex.

Advantage: The data will meet the needs of the researchers without
much modification, since it is gathered by the researchers
according to their needs.

Disadvantage: Needs a lot of time and effort.
May incur a lot of cost.

3. Secondary data is existing data obtained from sources such as databases,
published annual reports or journals, where the data has already been
gathered by other people beforehand.

Example: Data obtained from the Malaysian Palm Oil Board Annual Report,
the Annual Economic Report and Thomson Gale databases.

Advantage: Cheaper.
Less time and effort is needed.

Disadvantage: May not provide the exact information needed.
May be inaccurate because of technical errors in printing.

1.6 LEVELS OF MEASUREMENT

1. Measurement is the quantification of observations or responses obtained by
assigning numbers or symbols to them according to a given set of rules.

2. The quantified responses fall into one of four levels of measurement,
each with different properties: nominal, ordinal, interval or ratio.

Ratio      strongest / highest level of measurement
Interval
Ordinal
Nominal    weakest / lowest level of measurement

a) Nominal

- Responses are grouped into categories.
- Categories have no order of importance.

- Example:

i) Gender: Male / Female

ii) Do you stay together with your parents? Yes / No

iii) Religion: Muslim / Buddhist / Christian / Others

b) Ordinal

- Responses are grouped into categories.
- Categories have an order of importance.

- Example:

i) Grade obtained in QMT181: A / B / C / D / E

ii) How long have you stayed at your current address?
Less than 1 year / 1 – 5 years / 6 – 10 years / More than 10 years

iii) Please indicate how often you drink soda:
1. never 2. sometimes 3. often 4. always

c) Interval

- Equal distances between values have the same meaning anywhere on the scale.
- Has an arbitrary zero point.

- Example:

i) The after-sales service offered by this company is:

good : 1 : 2 : 3 : 4 : 5 : 6 : 7 : bad
(1 = extremely good, 2 = quite good, 3 = slightly good, 4 = neither,
5 = slightly bad, 6 = quite bad, 7 = extremely bad)

ii) The patient’s body temperature: _____ °C

iii) Year born: ______

d) Ratio

- Indicates the number of times a value is more or less than another
value.
- Has an absolute zero point.

- Example:

i) Please indicate your age: _______ years

ii) The number of family members staying in the same house: ______

iii) Your current body weight: ________ kg
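
The practical difference between the interval and ratio levels can be checked numerically. The sketch below is illustrative only: the temperatures and weights are made-up values, and the Kelvin conversion is brought in (it is not part of these notes) because Kelvin has an absolute zero while Celsius does not.

```python
def celsius_to_kelvin(c):
    """Shift the zero point from the arbitrary 0 °C to absolute zero."""
    return c + 273.15

# Interval scale: temperature in °C (arbitrary zero point).
a_c, b_c = 10.0, 20.0
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

diff_c = b_c - a_c   # differences are meaningful on an interval scale
diff_k = b_k - a_k   # and stay the same when the zero point moves
ratio_c = b_c / a_c  # 2.0, but "twice as hot" is not a valid conclusion
ratio_k = b_k / a_k  # the ratio changes once the zero point moves

# Ratio scale: body weight (absolute zero point).
w1_kg, w2_kg = 40.0, 80.0
ratio_kg = w2_kg / w1_kg                   # 80 kg is twice 40 kg
ratio_g = (w2_kg * 1000) / (w1_kg * 1000)  # still twice after changing units

print(diff_c, round(diff_k, 2), ratio_c, round(ratio_k, 3))
print(ratio_kg, ratio_g)
```

Because the Celsius ratio is not preserved under a change of zero point, statements such as "twice as much" are reserved for ratio-level data.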

Example 4:

A company conducted a survey on its customers’ satisfaction with the
company’s products. Some questions asked in the survey were:

a) What is your gender?
b) In which year did you first buy our product?
c) How many products have you bought from this company?
d) How long, on average, do you need to wait for products ordered from
this company to reach you?
e) Rank the after-sales service provided by this company:
Poor / Average / Good

For each question asked in the survey, identify the corresponding level of
measurement and justify your answer.

1.7 SAMPLING TECHNIQUES

1. Sampling is a process of selecting a sample from the population.

Types of Sampling

Probability:      Simple random, Systematic, Stratified, Cluster, Multistage
Non-probability:  Convenience, Judgmental, Quota, Snowball

2. Probability sampling requires a sampling frame. Because every unit has a
known chance of being selected, the sample is expected to be unbiased.
Probability sampling techniques include:

a) Simple random sampling
- is used when the population is homogeneous.

b) Stratified sampling
- is used when the population is heterogeneous.

c) Systematic sampling
- is used when the population is heterogeneous and expanding fast.

d) Cluster sampling
- is used when the population is heterogeneous and involves a wide
geographical area.

e) Multistage sampling
- is used when the population is heterogeneous and involves a huge
geographical area.

3. Non-probability sampling is usually used when a sampling frame is not available.
The sample selected is usually biased. Non-probability sampling techniques include:

a) Convenience sampling
- the sample is selected based on how easy it is to obtain.

b) Judgmental sampling
- the sample is selected based on the expertise of the researcher.

c) Quota sampling
- the sample is chosen to fill a certain quota for each group in the population.

d) Snowball sampling
- the sample is chosen starting with a particular individual and continues in a
chain-like manner.

1.8 DATA COLLECTION METHODS

1. After deciding on a sampling technique, a researcher has to determine the
appropriate techniques to obtain information from the sample selected. These
techniques are referred to as data collection methods. They include personal
interview, telephone interview, mailed questionnaire, direct observation and
others.

a) Personal Interview

One of the common ways of collecting information from respondents is by
approaching them and asking them relevant questions. This is
known as a personal interview or face-to-face interview.

Advantages:

i) Respondents will give answers spontaneously.
ii) The response rate is high.
iii) Interviewers can clarify respondents’ doubts.

Disadvantages:

i) Interviewers’ gestures may influence respondents’ answers.
ii) Interviewers need to be properly trained, and this may incur more cost.

b) Telephone Interview

When interviews are conducted by telephone, the data collection
method is called a telephone interview. Telephone interviews are becoming
more common and popular as face-to-face interviews become
more costly.

Advantages:

Telephone interviews share the advantages of face-to-face interviews and
have the following additional advantages:

i) It is less costly, in the sense that the interviewers do not have to travel to
the respondents’ places.
ii) Interviewers can have the interview materials in front of them as the
interview is taking place.

Disadvantages:

i) This method can only be used for respondents who have telephone
connections.
ii) Respondents may not be comfortable talking too long on the telephone.

c) Mailed Questionnaire

In a mailed questionnaire, a written questionnaire is sent to the respondents.
The respondents are required to answer the questionnaire and send it
back to an indicated address by the date stated in the questionnaire. The
advantages and disadvantages of the mailed questionnaire are listed below:

Advantages:

i) It has a wide coverage.
ii) It is less expensive compared to interviews.
iii) Respondents can answer the questionnaire at any time convenient to
them.

Disadvantages:

i) The response rate is low.
ii) The questionnaire may not be answered by the target group.

d) Direct Observation

Direct observation is a technique used when observations of the subjects are to be
taken in a natural setting. In this technique, observers must not
interfere with the normal behavior of the subjects under study.
Respondents are observed without their knowledge to avoid the “Hawthorne
effect”, that is, the tendency of people to perform better under observation
because of the attention paid to them. The advantages and disadvantages of
direct observation are listed below:

Advantages:

i) It has high face validity.
ii) It provides flexibility in changing the approach of observation according to
need.

Disadvantages:

i) Different observers may interpret the same observation differently.
ii) Observational data is not usually generalizable.

APPENDIX: PROBABILITY SAMPLING TECHNIQUES

(a) Simple Random Sampling

Method 1:
i) Number all the units in the population.
ii) Put all the units in the population into a box.
iii) Randomly select n units from the box, for the sample required.

Method 2:
i) Number all the units in the population.
ii) Randomly select n units using a table of random digits, for the sample
required.
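
Both methods amount to drawing n units without replacement, with every unit equally likely to be chosen. A minimal Python sketch, in which a pseudo-random generator stands in for the box or the random digits table (the function name and the numbered population are hypothetical):

```python
import random

def simple_random_sample(units, n, seed=None):
    """Draw n units without replacement; every unit is equally likely."""
    rng = random.Random(seed)  # seed is optional, for repeatable draws
    return rng.sample(units, n)

# Step i): number all the units in the population (here 1 to 100).
population = list(range(1, 101))

# Steps ii)-iii): randomly select n = 10 units.
sample = simple_random_sample(population, 10, seed=1)
print(sorted(sample))
```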

(b) Systematic Sampling

i) Randomly number the list of all the units in the population.
ii) Divide the list into portions of K units, where
K = N / n
(n is the sample size, N is the population size).
iii) Randomly select the first unit from the first portion.
iv) Select every Kth unit from the previously selected unit until n units are selected, for the
sample required.
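
The steps above can be sketched in Python as follows. The sketch assumes the list is already in the numbered order from step i), and it rounds K down when N is not an exact multiple of n; both choices are assumptions of this illustration, and the function name is hypothetical.

```python
import random

def systematic_sample(units, n, seed=None):
    """Pick a random start in the first K units, then every Kth unit."""
    N = len(units)
    k = N // n  # sampling interval K = N / n, rounded down
    if k < 1:
        raise ValueError("sample size cannot exceed population size")
    rng = random.Random(seed)
    start = rng.randrange(k)  # random position within the first portion
    return [units[start + i * k] for i in range(n)]

population = list(range(1, 101))  # N = 100 numbered units
sample = systematic_sample(population, 10, seed=3)  # n = 10, so K = 10
print(sample)
```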

(c) Stratified Sampling

i) Divide the population into strata.
ii) Calculate the number of units to be taken from each stratum (proportionate to the size of
the stratum) according to the formula:
n_i = (N_i / N) × n
where n_i is the sample size for the i-th stratum,
n is the total sample size,
N_i is the population size for the i-th stratum, and
N is the total population size.
iii) Randomly select n_i units from each stratum, based on the number calculated, to get n units for
the sample required.
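
A short Python sketch of proportionate allocation follows. The strata names and sizes are invented for illustration, and rounding each n_i to a whole number is an assumption of this sketch; with other sizes the rounded values may not add up to exactly n and would need adjusting by hand.

```python
import random

def proportional_allocation(stratum_sizes, n):
    """Return n_i = (N_i / N) * n for each stratum, rounded to whole units."""
    N = sum(stratum_sizes.values())
    return {name: round(N_i * n / N) for name, N_i in stratum_sizes.items()}

def stratified_sample(strata, n, seed=None):
    """Randomly draw n_i units from each stratum."""
    rng = random.Random(seed)
    sizes = {name: len(units) for name, units in strata.items()}
    allocation = proportional_allocation(sizes, n)
    return {name: rng.sample(units, allocation[name])
            for name, units in strata.items()}

# Hypothetical population of N = 1000 units in three strata.
strata = {
    "A": list(range(0, 500)),     # N_A = 500
    "B": list(range(500, 800)),   # N_B = 300
    "C": list(range(800, 1000)),  # N_C = 200
}
sizes = {name: len(units) for name, units in strata.items()}
allocation = proportional_allocation(sizes, 50)
sample = stratified_sample(strata, 50, seed=2)
print(allocation)
```

With these sizes and n = 50, the allocation works out to 25, 15 and 10 units for strata A, B and C.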

(d) Cluster Sampling

i) Divide the population into clusters.
ii) Randomly choose the clusters.
iii) All of the units in the chosen clusters will be taken as the sample.
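
As an illustration, cluster sampling can be sketched in Python as below. The cluster names and their contents are hypothetical, and the number of clusters to choose is left to the researcher.

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """Randomly choose whole clusters and take every unit in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)
    units = [u for name in chosen for u in clusters[name]]
    return chosen, units

# Hypothetical population of 20 units divided into 4 clusters.
clusters = {
    "Cluster 1": [1, 2, 3, 4, 5],
    "Cluster 2": [6, 7, 8, 9, 10],
    "Cluster 3": [11, 12, 13, 14, 15],
    "Cluster 4": [16, 17, 18, 19, 20],
}
chosen, sample = cluster_sample(clusters, 2, seed=7)
print(chosen, sample)
```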

(e) Multistage Sampling

i) Divide the population into clusters.
ii) Randomly choose the clusters.
iii) Divide the chosen clusters into subclusters. *
iv) Randomly choose the subclusters. *
v) All of the units in the chosen subclusters will be taken as the sample.

* Steps iii) and iv) are repeated if the subclusters are still large.
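
A two-stage version of this procedure might look like the Python sketch below. It performs steps iii) and iv) only once, and the state and district names, as well as the household numbers, are hypothetical.

```python
import random

def multistage_sample(clusters, num_clusters, num_subclusters, seed=None):
    """Choose clusters, then subclusters within them, and take all units."""
    rng = random.Random(seed)
    sample = []
    for c in rng.sample(sorted(clusters), num_clusters):
        subclusters = clusters[c]  # dict: subcluster name -> list of units
        for s in rng.sample(sorted(subclusters), num_subclusters):
            sample.extend(subclusters[s])
    return sample

# Hypothetical clusters (states), subclusters (districts), units (households).
clusters = {
    "State 1": {"District A": [1, 2], "District B": [3, 4], "District C": [5, 6]},
    "State 2": {"District D": [7, 8], "District E": [9, 10], "District F": [11, 12]},
    "State 3": {"District G": [13, 14], "District H": [15, 16], "District I": [17, 18]},
}
sample = multistage_sample(clusters, num_clusters=2, num_subclusters=2, seed=5)
print(sample)
```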
