Mse1 Stat Class
Francis J. Majawa
February 7, 2024
Important Statistical Terms.
Observation: A single member of a collection of items
that we want to study, e.g.
Employee
Age
Height, etc.
Variable: A characteristic/attribute that can assume
different values, e.g.
If x + 5 = 16, then x is a variable: it can take any value
that gives 16 when added to 5.
For the age of students at UNIMA in 2020, "Age" is a variable:
it can take many different values.
Data: The values (measurements/observations) that
the variables can assume, e.g.
$x = 11$, $x = \frac{22}{2}$, $x = \sqrt{121}$, ... so $11$, $\frac{22}{2}$, $\sqrt{121}$, ... are values of
a variable, hence data.
Probable ages of students at UNIMA would be
23, 26, 34, 56, 78, 40, etc.; all are values of the variable "Age".
Terms Cont...
Data Set      Variables  Example              Typical Task
Univariate    1          Income               Histogram, Basic St...
Bivariate     2          Income, Age          Scatter plot, Correlation
Multivariate  3          Income, Age, Gender  Regression Modelling
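The univariate and bivariate tasks in the table can be sketched in Python. This is a minimal illustration only: the income and age values are invented, and `pearson_r` is a hypothetical helper name, not something from the slides.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical data, invented for illustration only.
income = [120, 150, 180, 210, 260]  # one value per person
age = [22, 27, 31, 38, 45]          # age in years, same people

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Univariate task: basic summary of a single variable.
income_mean = mean(income)
income_sd = stdev(income)

# Bivariate task: linear association between two variables.
r_income_age = pearson_r(income, age)
```

A multivariate task would add a third variable such as Gender and fit a regression model, which is beyond this short sketch.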
Branches of Statistics
Examples
Solutions
Exercise
Read the following passage and answer questions that follows:
Exercise Questions
Variables.
1. Categorical Data.
2. Quantitative Variables.
Question.
Solutions
Types of Data
Nominal
Ordinal
Data categories are represented by labels
or names (high, medium, low) that have relative values.
Because of the relative values, the classified data can be
ranked or ordered.
Though the data can be categorized and ranked/ordered,
precise differences between the ranks do not exist.
The ranks lack the properties that are required to compute
many statistics, such as the average.
Examples: grades (A, B, C, D, ...), rating scales (poor, good,
excellent, ...), satisfaction level, happiness, ...
Specifically, there is no clear meaning to the distance
between A and B; if ranks are coded with numbers, the
difference between 1 and 2 is meaningless.
For example, what would be the distance between "Rarely" and "Never"?
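As a rough sketch of what can still be done with ordinal data, the Python below sorts responses and picks the middle category using only their order. The scale and the responses are invented for illustration; `SCALE` and `RANK` are hypothetical names.

```python
from statistics import median_low

# Hypothetical rating scale, listed from lowest to highest.
SCALE = ["Poor", "Good", "Excellent"]
RANK = {label: i for i, label in enumerate(SCALE)}

# Hypothetical survey responses, invented for illustration.
responses = ["Good", "Poor", "Excellent", "Good", "Poor"]

# Order is meaningful for ordinal data, so sorting by rank is valid.
ordered = sorted(responses, key=RANK.get)

# The median category depends only on the ordering of the ranks,
# not on any distance between them.
median_label = SCALE[median_low(RANK[r] for r in responses)]
```

The numeric codes 0, 1, 2 serve only as positions in the ordering; subtracting or averaging them would assume distances that the scale does not define.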
Interval
Interval data includes all the characteristics of the ordinal
level.
In addition, precise differences between units of measure exist
and are of constant size; however, there is no meaningful zero.
Equal differences in the characteristic are represented by
equal differences in the measurements.
Examples include temperature, scores, IQ, ...
The interval between 60 °C and 70 °C is the same as the
interval between 20 °C and 30 °C.
Since intervals between numbers represent distances, we
can do mathematical operations such as taking an average.
But because there is no meaningful zero, we cannot say that 60 °C
is twice as warm as 30 °C, nor that a temperature
of 0 °C means there is no temperature.
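One way to check the temperature claims is to re-express the same readings in Fahrenheit. The sketch below uses the standard Celsius-to-Fahrenheit conversion; the function name `c_to_f` is just an illustrative choice.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

# Equal intervals stay equal after changing the unit.
diff_high = c_to_f(70) - c_to_f(60)
diff_low = c_to_f(30) - c_to_f(20)

# Ratios do not survive the change of unit, because zero is arbitrary.
ratio_celsius = 60 / 30
ratio_fahrenheit = c_to_f(60) / c_to_f(30)  # 140 / 86
```

Both differences come out the same, while the ratio is 2 in Celsius but about 1.63 in Fahrenheit, so "twice as warm" has no fixed meaning on an interval scale.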
Ratio
Time Series
Cross-section Data
Data Collection
Rules for Collecting Data.
Data Collection Tools
Participatory Methods
Records and Secondary Data
Observation
Surveys and Interviews
Focus Groups
Diaries, Journals, Self-reported Checklists
Other Tools
Participatory Methods
Community Meetings
Records and Secondary data
Examples of sources:
files/records
computer databases
industry or government reports
other reports or prior evaluations
census data and household survey data
electronic mailing lists and discussion groups
documents (budgets, organizational charts, policies and
procedures, maps, monitoring reports)
newspapers and television reports
Using Existing Data Set
Advantages/Disadvantages
Observation
Observation is helpful when:
Ways to Record Information from Observations:
Guidelines for Planning Observations
Advantages/Disadvantages
Surveys and Interviews
Modes of Survey
Telephone surveys
Self-administered questionnaires distributed by mail,
e-mail, or websites
Administered questionnaires, common in the development
context
In the development context, there are often issues of language and
translation
Advantage/Disadvantage
Interviews.
Often semi-structured
Used to explore complex issues in depth
Forgiving of mistakes: unclear questions can be clarified
during the interview and changed for subsequent interviews
Can provide evaluators with an intuitive sense of the
situation
Challenges of Interviews.
Focus Group
Focus Groups are Inappropriate when:
Advantage/Disadvantage
The Population
The Sample
Sampling Methods
Non-Probability Sampling
Types of Non-probability Sampling
Purposive Sampling:
Read about Purposive Sampling and write down what it
is, when to use it, and its advantages and disadvantages.
Quota Sampling Example & Steps
Snowball Sampling
Advantages & Disadvantages of Non-probability
Sampling
Advantages:
Non-probability sampling techniques are a more convenient
and practical method for researchers deploying surveys in
the real world.
Getting responses using non-probability sampling is
faster (time-effective) and more cost-effective than
probability sampling because the sample is known to the
researcher. Respondents reply quickly compared
to randomly selected people because they are highly
motivated to participate.
Effective when it is unfeasible or impractical to conduct
probability sampling.
Disadvantages:
Lower level of generalization of research findings compared
to probability sampling
Difficulties in estimating sampling variability and
identifying possible bias
Probability Sampling
Example Questions
Some Important terms
Some Important terms Cont...
Some notation
Population size: N
Sample size: n
Sampling fraction: $f = \frac{n}{N}$
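A small numeric illustration of the notation, written in Python. The population and sample sizes are invented, and `sampling_fraction` is a hypothetical helper name.

```python
def sampling_fraction(n, N):
    """Return f = n / N for a sample of size n from a population of size N."""
    if not 0 < n <= N:
        raise ValueError("need 0 < n <= N")
    return n / N

# Hypothetical sizes: 50 units sampled from a population of 500.
f = sampling_fraction(50, 500)
```

Here f is 0.1, meaning one in every ten population units is in the sample.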
Probability Sampling Methods
Simple Random sampling (SRS)
Replacement
Simple Random Sample (WITHOUT Replacement)
Selecting n random numbers using Excel
Stratified random sampling
Stratified random sampling cont...
Allocating the Sample among the Strata
Advantages
Disadvantages
Systematic Random Sampling
Example
Cluster Random Sampling
Cluster sampling
Used when members of the study population are naturally in
groups, called clusters,
e.g. villages for residence,
schools for education,
health center catchment areas for health care, etc.
Obtain a simple random sample of clusters
Sample members from the selected clusters only
May select only a subset of them
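The steps above can be sketched in Python for the one-stage case, where every member of a selected cluster is included. The school names, student IDs and the function name `cluster_sample` are all invented for illustration.

```python
import random

# Hypothetical study population: cluster (school) -> its members (students).
clusters = {
    "School A": ["A1", "A2", "A3"],
    "School B": ["B1", "B2"],
    "School C": ["C1", "C2", "C3", "C4"],
    "School D": ["D1", "D2"],
}

def cluster_sample(clusters, n_clusters, rng):
    """Draw a simple random sample of clusters, then take every
    member of the selected clusters (one-stage cluster sampling)."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members

rng = random.Random(2024)  # fixed seed so the sketch is repeatable
chosen, members = cluster_sample(clusters, 2, rng)
```

Selecting only a subset of members within each chosen cluster would add a second sampling step inside the loop over `chosen`.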
Cluster Sampling Example
Do all members of the study population have a known probability
of being included in the sample?
If yes:
probability a school is selected $= \frac{7}{54} \approx 0.13$
Since all students in the selected schools are selected, this is also
the probability that a student is selected.
Sometimes the sampling of clusters uses sampling in proportion
to size.
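The arithmetic can be checked with a short Python sketch. Reading 7 as the number of schools sampled and 54 as the total number of schools is an assumption, since the example slide itself is not reproduced in this text.

```python
from fractions import Fraction

# Assumed reading of the figures: 7 schools sampled out of 54.
schools_sampled = 7
schools_total = 54

# Probability that any given school is selected.
p_school = Fraction(schools_sampled, schools_total)

# Every student in a selected school is taken, so the probability of
# inclusion within a selected school is 1.
p_student = p_school * 1

p_rounded = round(float(p_student), 2)
```

The exact value is 7/54, about 0.1296, which rounds to the 0.13 quoted above.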
What are the sampling units?
In cluster sampling the primary sampling units are the
clusters
Individuals that make up the clusters are secondary
sampling units
E.g. for the Standard 1 students:
primary sampling units - schools
secondary sampling units - students
Multistage cluster sampling
The End.