Mse1 Stat Class

The document provides an introduction to statistics, defining it as the science of collecting, organizing, summarizing, analyzing, and drawing conclusions from data. It covers important statistical terms, branches of statistics, types of data, and methods of data collection, emphasizing the significance of statistics in research and decision-making. Additionally, it discusses the classification of variables and measurement scales, along with practical examples and exercises to reinforce understanding.

Introduction to Statistics

Francis J. Majawa

Malawi University of Business and Applied Sciences

February 7, 2024

Definition.

Statistics is the science of conducting studies to collect, organize, summarize, analyze, and draw conclusions from data. [Bluman Chapter 1]
Statistics:
Is basic to research.
Equips students to understand the various statistical studies performed in their respective fields.
Trains students to become better consumers and citizens, i.e. to make intelligent decisions.

Important Statistical Terms.
Observation: A single member of a collection of items that we want to study, e.g.
Employee
Age
Height, etc.
Variable: A characteristic/attribute that can assume different values, e.g.
In x + 5 = 16, x is a variable; here it must take the value that gives 16 when added to 5.
Age of students at UNIMA in 2020: "Age" is a variable, since it can take different values.
Data: The values (measurements/observations) that the variables can assume, e.g.
$x = 11$, $x = \frac{\sqrt{22}}{2}$, $x = 121$, ... so $11$, $\frac{\sqrt{22}}{2}$, $121$, ... are values of a variable, hence data.
Probable ages of students at UNIMA would be 23, 26, 34, 56, 78, 40, etc.; all are values of the variable "Age".

Terms (continued)

Population: All subjects (human or otherwise) that are being studied.
Random Variable: A variable whose values are determined by chance/probability.
Data Set:
A collection of data values.
Can consist of one variable or many variables.
If it consists of one variable, it is called a univariate data set.
If it consists of two variables, it is called a bivariate data set.
If it consists of three or more variables, it is called a multivariate data set.

Data Set      Variables  Example              Typical Task
Univariate    1          Income               Histogram, Basic St
Bivariate     2          Income, Age          Scatter plot, Correla
Multivariate  3          Income, Age, Gender  Regression Modellin

Sample: Is a group of subjects selected from a population.

Branches of Statistics

Statistics is divided into two branches, depending on how the data are used.
1 Descriptive statistics: involves the collection, organization, summarization, and presentation of data.
2 Inferential statistics: involves generalizing from samples to populations, performing estimations and hypothesis tests, determining relationships among variables, and making predictions.

Examples

Determine whether descriptive or inferential statistics were used.
1 The average jackpot for the top five lottery winners was K367.6 million.
2 A study done by the American Academy of Neurology suggests that older people who had a high-caloric diet more than doubled their risk of memory loss.
3 Based on a survey of 9317 consumers done by the National Retail Federation, the average amount that consumers spent on Valentine's Day in 2011 was $116.
4 Scientists at the University of Oxford in England found that a good laugh significantly raises a person's pain level tolerance.

Solutions

1 Descriptive statistics were used because this is an average, and it is based on data obtained from the top five lottery winners at this time.
2 Inferential statistics were used since this is a generalization made from a sample to a population.
3 Descriptive statistics were used since this is an average based on a sample of 9317 respondents.
4 Inferential statistics were used since an inference is made from a sample to a population.

Exercise
Read the following passage and answer the questions that follow:

A study conducted at Domasi College revealed that students who attended class 95 to 100% of the time usually received an A in the class. Students who attended class 80 to 90% of the time usually received a B or C in the class. Students who attended class less than 80% of the time usually received a D or an F or eventually withdrew from the class. Based on this information, attendance and grades are related. The more you attend class, the more likely it is you will receive a higher grade. If you improve your attendance, your grades will probably improve. Many factors affect your grade in a course. One factor that you have considerable control over is attendance. You can increase your opportunities for learning by attending class more often.

Exercise Questions

1 What are the variables under study?
2 What are the data in the study?
3 Are descriptive, inferential, or both types of statistics used?
4 What is the population under study?
5 Was a sample collected? If so, from where?
6 From the information given, comment on the relationship between the variables.

Variables.

Variables in statistics are classified into two categories.
Qualitative Variable:
Also known as a categorical variable.
A variable that can be placed into distinct or specific categories according to some characteristic/attribute.
E.g. people are classified according to gender (female or male), hence gender is a categorical variable.
E.g. people are classified according to blood group (A, B, AB, O), hence blood group is a categorical variable.
Quantitative Variable:
Also known as a numerical variable.
A variable that can be counted or measured.
E.g. age, height, number of shops in a market, etc.

1. Categorical Data.

Can be represented with labels, e.g.
Vehicle type: Car, Bus, Truck
Gender: Female, Male
Can also be coded, i.e. numbers are used to represent categories to facilitate statistical analysis, e.g.
Vehicle type: 1 = Car, 2 = Truck, 3 = Bus
Gender: 0 = Male, 1 = Female
Note: Coding a category as a number does not make the data numerical.
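
The note on coding can be illustrated with a minimal Python sketch. The vehicle codes follow the slide; the list of coded observations is made up for illustration and is not from the source. Averaging the codes runs without error but has no interpretation, whereas counting how often each category occurs is a sensible summary.

```python
from collections import Counter
from statistics import mean

# Coding scheme as given on the slide.
VEHICLE_CODES = {1: "Car", 2: "Truck", 3: "Bus"}

# Hypothetical coded observations (assumed for illustration only).
coded = [1, 3, 3, 2, 1, 3]

# Arithmetic on the codes is possible but meaningless:
# there is no vehicle type "2.17".
meaningless_average = mean(coded)

# Counting category frequencies is the appropriate summary.
labels = [VEHICLE_CODES[c] for c in coded]
frequencies = Counter(labels)

print(round(meaningless_average, 2))
print(frequencies.most_common())
```

Renumbering the categories (say, 1 = Bus, 3 = Car) would change the "average" but not the frequencies, which is one way to see that the codes are only labels.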

2. Quantitative Variables.

Quantitative variables are further grouped into two:
Discrete Variable: A quantitative variable whose values can be counted, e.g. age in completed years, number of children in a family, etc.
Continuous Variable: A quantitative variable that can assume an infinite number of values between any two specific values. Its values are obtained by measuring and often include fractions and decimals, e.g. height, weight, temperature, haemoglobin level, etc.

Question.

Classify each variable as a discrete variable or a continuous variable.
1 The highest wind speed of a hurricane
2 The weight of baggage on an airplane
3 The number of pages in a statistics book
4 The amount of money a person spends per year for online purchases

Solutions

1 Continuous: since wind speed must be measured.
2 Continuous: since weight is measured.
3 Discrete: since the number of pages is countable.
4 Discrete: since money is counted in whole units of the smallest denomination (e.g. cents).

Types of Data

In addition to being classified as qualitative or quantitative, variables can be classified by how they are categorized, counted, or measured.
For example, can the data be organized into specific categories, such as area of residence (rural, suburban, or urban)?
Can the data values be ranked, such as 1st place, 2nd place, 3rd place, etc.?
Or are the values obtained from measurement, e.g. heights, IQs, or temperatures?
This type of classification, i.e. how variables are categorized, counted, or measured, uses a measurement scale (also known as a level of measurement).

Measurement Scale / Levels of Measurement

The level of measurement of the data dictates the calculations that can be done to summarize and present the data.
It also determines the statistical tests that should be performed at the analysis stage.
The 4 common types of measurement scale are: Nominal, Ordinal, Interval, and Ratio.
These levels of measurement are best understood through examples.

Nominal

Comes from the Latin word "nomen", meaning name.
Data that can be categorized but not ordered/ranked.
The variable of interest can be divided into mutually exclusive (non-overlapping) categories or outcomes.
Examples: gender (male, female), religion (SDA, RCC, CCAP), colour (green, orange, yellow), nationality/region (Malawi, Namibia, Zambia), etc.

Ordinal
Data categories are represented by labels or names (high, medium, low) that have relative values.
Because of the relative values, the data can be ranked or ordered.
Although the data can be categorized and ranked/ordered, precise differences between the ranks do not exist.
The ranks therefore lack the properties required to compute many statistics, such as the average.
Examples: grades (A, B, C, D, ...), rating scales (poor, good, excellent, ...), satisfaction level, happiness, ...
Specifically, there is no clear meaning to the distance between A and B; if the ranks are coded with numbers, the difference between 1 and 2 is meaningless.
For example, what would be the distance between "Rarely" and "Never"?

Interval
Interval data include all the characteristics of the ordinal level.
In addition, precise differences between units of measure exist and are of constant size; however, there is no meaningful zero.
Equal differences in the characteristic are represented by equal differences in the measurements.
Examples include temperature, scores, IQ, ...
The interval between 60 °C and 70 °C is the same as the interval between 20 °C and 30 °C.
Since intervals between numbers represent distances, we can do mathematical operations such as taking an average.
But because there is no meaningful zero, we cannot say that 60 °C is twice as warm as 30 °C, nor that a temperature of 0 °C means there is no temperature.
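
The temperature example can be checked numerically with a short Python sketch. The conversion to the Kelvin scale (which has a true zero) is an addition for illustration and is not part of the original slides. Differences agree on the Celsius scale, but the "twice as warm" ratio disappears once the readings are expressed on a scale with a meaningful zero.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert to the Kelvin scale, whose zero is a true zero."""
    return celsius + 273.15

# Differences are meaningful on an interval scale.
gap_high = 70 - 60
gap_low = 30 - 20

# The naive ratio of Celsius readings suggests "twice as warm" ...
naive_ratio = 60 / 30

# ... but on a scale with a true zero the ratio is only about 1.1.
kelvin_ratio = celsius_to_kelvin(60) / celsius_to_kelvin(30)

print(gap_high == gap_low)
print(naive_ratio)
print(round(kelvin_ratio, 3))
```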

Ratio

Ratio data possess all the characteristics of interval measurement.
In addition, there exists a true zero, which represents the absence of the quantity being measured.
Zero does not have to be observable in the data.
E.g. newborn babies cannot have zero weight, but weight is still ratio data. What matters is that zero is an absolute reference point.
In addition, true ratios exist when the same variable is measured on two different members of the population.
Examples: height, weight, salary, age, time, units of production, changes in stock prices, distance between branch offices, ...

Time Series Vs Cross-sectional Data

They differ in their use and in the nature of the data.
Time-series data track the same variables over a certain period of time, whereas cross-sectional data cover different subjects at a given point in time.
This means that time-series data follow one unit through time, whereas the data used in cross-sectional analysis are spread across many units.

Time Series

Time series data are observations that are collected at specific intervals of time.
Therefore, time-series data may be categorized as hourly, daily, monthly, quarterly, half-yearly, or yearly.
The idea is to check the similarities and differences of data recorded at different time periods.
Time-series analysis determines the future pattern by considering past and present conditions and then extending the trend to future conditions.
Therefore, in the case of time-series data, the longer the period over which the data are collected, the better it generally is for predicting future outcomes.

Cross-section Data

Cross-sectional data are observations of multiple subjects at one point in time.
In other words, cross-sectional data are observations of many different individuals (subjects, objects) at a given time, where each observation belongs to a different individual.
A simple example of cross-sectional data is the gross annual income for each of 1000 randomly chosen households in Blantyre for the year 2012.
Cross-sectional data are distinguished from longitudinal data, where there are multiple observations for each unit over time.

Data Collection

Data Collection Strategy: There is no one best way; the decision depends on:
What you need to know: numbers or stories
Where the data reside: environment, files, people
Resources and time available
Complexity of the data to be collected
Frequency of data collection
Intended forms of data analysis

Rules for Collecting Data.

Use multiple data collection methods.
Use available data, but you need to know:
how the measures were defined
how the data were collected and cleaned
the extent of missing data
how the accuracy of the data was ensured

Rules for Collecting Data.

If you must collect original data:
be sensitive to the burden on others
pre-test, pre-test, pre-test
establish procedures and follow them (protocol)
maintain accurate records of definitions and coding
verify the accuracy of coding and data input

Data Collection Tools

Participatory Methods
Records and Secondary Data
Observation
Surveys and Interviews
Focus Groups
Diaries, Journals, Self-reported Checklists
Other Tools

Participatory Methods

Involve groups or communities heavily in data collection

Examples:
community meetings
mapping: this method gives participants freedom to shape
discussion on a given topic with minimal intervention from
researchers.
transect walks: where members of the community walk
through different areas of the community, interviewing
passers-by and drawing a map with observations of
characteristics, risks and existing solutions after the walk.

Community Meetings

One of the most common participatory methods

Must be well organized
agree on purpose
establish ground rules
who will speak
time allotted for speakers
format for questions and answers

Records and Secondary data

Examples of sources:
files/records
computer databases
industry or government reports
other reports or prior evaluations
census data and household survey data
electronic mailing lists and discussion groups
documents (budgets, organizational charts, policies and
procedures, maps, monitoring reports)
newspapers and television reports

Using Existing Data Set

Key issues to consider: validity, reliability, accuracy, response rates, data dictionaries, and missing data rates

Advantages/Disadvantages

Advantages: Often less expensive and faster than collecting the original data again.
Disadvantages: There may be coding errors or other problems. The data may not be exactly what is needed. You may have difficulty getting access. You have to verify the validity and reliability of the data.

Observation

One sees what is happening:

traffic patterns
land use patterns
layout of city and rural areas
quality of housing
condition of roads
conditions of buildings
who goes to a health clinic

Observation is helpful when:

you need direct information
you are trying to understand ongoing behavior
there is physical evidence, products, or outputs that can be observed
you need to provide an alternative when other data collection is unfeasible or inappropriate

Ways to Record Information from Observations:

Observation guide: printed form with space to record observations
Recording sheet or checklist: yes/no options, tallies, rating scales
Field notes: least structured, recorded in a narrative, descriptive style

Guidelines for Planning Observations

Have more than one observer, if feasible

Train observers so they observe the same things
Pilot test the observation data collection instrument
For less structured approach, have a few key questions in
mind

Advantages/Disadvantages

Advantage: Collects data on actual rather than self-reported behavior or perceptions. It is real-time rather than retrospective.
Disadvantage: Observer bias, potentially unreliable; interpretation and coding challenges; sampling can be a problem; can be labor intensive; low response rates.

Surveys and Interviews

Excellent for asking people about: perceptions, opinions, ideas
Less accurate for measuring behavior
Sample should be representative of the whole
Big problem with response rates

Modes of Survey

Telephone surveys
Self-administered questionnaires distributed by mail,
e-mail, or websites
Administered questionnaires, common in the development
context
In development context, often issues of language and
translation

Advantage/Disadvantage

Advantage: Best when you want to know what people think, believe, or perceive; only they can tell you that.
Disadvantage: People may not accurately recall their behavior or may be reluctant to reveal their behavior if it is illegal or stigmatized. What people think they do or say they do is not always the same as what they actually do.

Interviews.

Often semi-structured
Used to explore complex issues in depth
Forgiving of mistakes: unclear questions can be clarified
during the interview and changed for subsequent interviews
Can provide evaluators with an intuitive sense of the
situation

Challenges of Interviews.

Can be expensive, labor intensive, and time consuming

Selective hearing on the part of the interviewer may miss
information that does not conform to pre-existing beliefs
Cultural sensitivity: e.g., gender issues

Focus Group

Type of qualitative research where small homogeneous groups of people are brought together to informally discuss specific topics under the guidance of a moderator
Purpose: to identify issues and themes, not just interesting information, and not "counts"

Focus Groups are Inappropriate when:

there is a language barrier
the evaluator has little control over the situation
trust cannot be established
free expression cannot be ensured
confidentiality cannot be assured

Advantage/Disadvantage

Advantage: Can be conducted relatively quickly and easily; may take less staff time than in-depth, in-person interviews; allow flexibility to make changes in process and questions; can explore different perspectives; can be fun.
Disadvantage: Analysis is time consuming; participants may not be representative of the population, possibly biasing the data; the group may be influenced by the moderator or dominant group members.

The Population

There are two different types of population:
Target Population: Consists of the group of population units from whom we would like to collect data (e.g. all students at UNIMA)
Study or Survey Population: Consists of the group of population units from whom we can collect data (e.g. all students at UNIMA with laptops)

The Population

NOTE: Ideally, a sample survey should collect data from the Target Population, but in practice we collect data from the Study Population due to some constraints.

The Sample

A sample must be:
Unbiased: The chosen sample should be representative of the entire population of interest. E.g. if we are interested in the weight of primary school children, we should select a sample that includes children from a range of primary school classes and year groups.
Taken from the correct population: The sample should only contain members of the population of interest. E.g. if we are interested in the characteristics of primary school children, the sample should not contain children from secondary school.

Sampling Methods

Grouped into two categories:
Non-Probability Sampling: Involves non-random selection based on convenience or other criteria, allowing you to easily collect initial data.
Probability Sampling: Involves random selection, allowing you to make statistical inferences about the whole group.

Non-Probability Sampling

Has the following characteristics:
No sampling frame is used, therefore the chance of someone being included in the sample cannot be calculated.
Results from the survey can be produced cheaply and quickly.
Population coverage is poor since it only captures those who are available to contribute at the time and/or are interested enough in the subject under investigation.
It is difficult to make estimates of the population from the sample results, and any generalizations that are made must be treated with caution.
Performing non-probability sampling is considerably less expensive than probability sampling methods.

Types of Non-probability Sampling

Convenience Sampling: Data are collected from any willing and available respondent. Examples include:
Street corner interviews;
Magazine and newspaper questionnaires; and
Phone-in polls.
The sample is likely to be unrepresentative of the population, because only those who feel strongly about the topic are likely to respond, and interviewers may only approach one particular type of respondent, usually those they feel comfortable with. Therefore, the results of the survey may be biased.

Types of Non-probability Sampling

Purposive Sampling:
Read about purposive sampling and write down what it is, when to use it, and its advantages and disadvantages.

Types of Non-probability Sampling

Quota Sampling: The population is divided into different groups or classes according to different characteristics of the population, and the percentage (proportion) of each group in the total population is fixed.
In quota sampling, researchers create a sample of individuals that represents a population.
Researchers choose these individuals according to specific traits or qualities.
Quotas are devised to reflect the characteristics of the population, hence quota sampling attempts to obtain a more representative sample than convenience sampling, and should therefore yield more representative results.

Quota Sampling Example & Steps

A study to investigate the proportion of those who eat pizza and cake at home.
Steps
Divide the group into subgroups according to some characteristic.
Identify the proportion of these subgroups in the population, e.g. N = 10: 4 cake and 6 pizza.
Lastly, select subjects to form the sample group, e.g. 50% of the cake subgroup (n = 2) and 50% of the pizza subgroup (n = 3), hence a total sample of n = 5.
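
The quota steps can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `quota_sample`, the tuple layout of each respondent, and the "accept in arrival order" rule are hypothetical choices, used here because quota sampling fills quotas with whoever is available rather than by random selection.

```python
from collections import Counter

def quota_sample(respondents, group_of, fraction):
    """Accept respondents in arrival order until each group's quota is full."""
    group_sizes = Counter(group_of(r) for r in respondents)
    # Quota per group: the stated fraction of that group's size.
    quotas = {g: round(size * fraction) for g, size in group_sizes.items()}
    filled = Counter()
    accepted = []
    for r in respondents:
        g = group_of(r)
        if filled[g] < quotas[g]:
            accepted.append(r)
            filled[g] += 1
    return accepted, quotas

# Hypothetical population matching the slide: N = 10, 4 cake and 6 pizza.
population = [("cake", i) for i in range(4)] + [("pizza", i) for i in range(6)]
sample, quotas = quota_sample(population, group_of=lambda r: r[0], fraction=0.5)
print(quotas, len(sample))
```

With the slide's numbers this yields quotas of 2 (cake) and 3 (pizza), for a total sample of 5.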

Snowball sampling

Advantages & Disadvantages of Non-probability Sampling

Advantages:
Non-probability sampling techniques are a more convenient and practical method for researchers deploying surveys in the real world.
Getting responses using non-probability sampling is faster (time-effective) and more cost-effective than probability sampling because the sample is known to the researcher. The respondents respond quickly compared to people randomly selected, as they are highly motivated to participate.
Effective when it is unfeasible or impractical to conduct probability sampling.

Advantages & Disadvantages of Non-probability Sampling

Disadvantages:
Lower level of generalization of research findings compared
to probability sampling
Difficulties in estimating sampling variability and
identifying possible bias

Probability Sampling

All members of the study population have a known probability of being included in the sample.
Has the following characteristics:
Uses a sampling frame from which to select a sample.
Selects samples at random from the sampling frame. Therefore every item on the sampling frame has a chance of being selected, and the probability of selection can be calculated.
Selects a sample that is more representative of the population (than non-probability methods).
Researchers can calculate the accuracy of the survey estimates.

Example Questions

1 What is the distribution of household sizes in Mulanje district?
2 What proportion of children aged 6 and attending standard 1 in Mangochi sleep under a mosquito net?
3 What is the distribution of ages of University students in Malawi?

Some Important terms

Target population: Total population about which information is required, e.g. all university students at the time of study
Study population: The set of individuals from which the individuals to be studied will be selected, e.g. all those attending classes during the study period (when data collection takes place)
Often these are identical or very similar, but not always.

Some Important terms (continued)

Population characteristic: The aspect(s) of the population to be studied, e.g. mean age, proportion of babies who sleep under a net
Sampling units: The persons or groupings used to select sample members, e.g. households
Sampling frame: Set of sampling units, e.g. schools in a village
List: A real list of units in the sampling frame

Some notation

Population size: $N$
Sample size: $n$
Sampling fraction: $f = \frac{n}{N}$

Probability Sampling Methods

1 Simple Random sampling
2 Systematic sampling
3 Stratified sampling
4 Cluster sampling
5 Multi-phase Sampling
6 Multi-stage sampling

Simple Random sampling (SRS)

Each and every member of the study population has the same chance of being selected into the sample.
The chance is equal to the sampling fraction $f$, where $f = \frac{n}{N}$.
Requirements:
A list of all members of the sampling frame
Possible methods:
Pieces of paper in a hat / drum
Random digit tables
Use random digit methods in a software package

Replacement

Sample without replacement - once selected, a sampling unit cannot be drawn again
Sample with replacement - after being selected, a sampling unit can still be drawn again (same chance each time)

Simple Random Sample (WITHOUT Replacement)

Step 1: List the N subjects in the study population. This is the list of the sampling frame.
Step 2: Number the entries in the listing from 1 to N.
Step 3: Select n distinct random numbers between 1 and N.
Step 4: Use the list of the sampling frame to identify each individual corresponding to the ID numbers selected.
Step 5: Locate each individual and seek their consent to participate in the survey.
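
Steps 1 to 4 can be sketched with Python's standard `random` module. The frame contents, the function name `simple_random_sample`, the seed, and the values N = 500 and n = 30 are illustrative assumptions rather than part of the procedure itself; `random.sample` draws without replacement, so no ID can be selected twice.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the sampling frame, each with equal chance."""
    if n > len(frame):
        raise ValueError("sample size n cannot exceed population size N")
    rng = random.Random(seed)
    # Steps 2-3: number the entries 1..N and pick n distinct ID numbers.
    ids = rng.sample(range(1, len(frame) + 1), n)
    # Step 4: look up the unit behind each selected ID.
    return [(i, frame[i - 1]) for i in sorted(ids)]

# Step 1: a hypothetical sampling frame of N = 500 subjects.
frame = [f"subject_{k:03d}" for k in range(1, 501)]
selected = simple_random_sample(frame, n=30, seed=42)
sampling_fraction = len(selected) / len(frame)  # f = n / N
print(sampling_fraction)
```

Step 5 (locating each person and seeking consent) is fieldwork and is not represented in the code.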

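Selecting n random numbers using Excel

Use function: RANDBETWEEN(1, N)
Repeat at least n times, since the function can return the same number more than once and repeats must be discarded when sampling without replacement
Example
Select an SRS of 30 subjects from a population of 500
N = 500
n = 30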
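
The reason for repeating "at least" n times can be seen in a small Python emulation of the spreadsheet approach. This is a sketch, not Excel itself: `random.randint(1, N)` stands in for RANDBETWEEN(1, N) (both include the end points), and the function name `draw_distinct_ids` and the seed are hypothetical. Each draw is independent, so a number can come up again and has to be discarded.

```python
import random

def draw_distinct_ids(population_size, sample_size, seed=None):
    """Emulate repeated RANDBETWEEN(1, N) calls, discarding repeated IDs."""
    if sample_size > population_size:
        raise ValueError("cannot draw more distinct IDs than the population holds")
    rng = random.Random(seed)
    chosen = []
    seen = set()
    draws = 0
    while len(chosen) < sample_size:
        candidate = rng.randint(1, population_size)  # inclusive on both ends
        draws += 1
        if candidate not in seen:
            seen.add(candidate)
            chosen.append(candidate)
    return chosen, draws

ids, draws = draw_distinct_ids(population_size=500, sample_size=30, seed=1)
print(len(ids), draws)
```

The number of draws needed is at least 30 and grows as the sampling fraction gets larger, because repeats become more likely.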

Stratified random sampling

Stratification is the process of grouping the units within a population of interest into homogeneous sub-groups called strata.
All strata should be mutually exclusive, that is, every unit within the population of interest can only be assigned to one stratum.
Collectively the strata should also be exhaustive, so that all units are covered by one of the strata.

Stratified random sampling (continued)

A stratified random sample can be chosen by following the steps below:
Divide the population into groups called strata: The population should be split into groups according to some characteristic that is related to the subject of the survey.
A sample is selected from within each stratum using the SRS method. We determine the number of units to be selected from each stratum using an allocation method, such as equal, proportional, or optimal allocation.
The samples from each stratum are combined to form the total sample of the population. This ensures that each stratum is represented in the sample.

Allocating the Sample among the Strata

Once we have split our population into strata, we need to work out how many units to sample from each stratum.
There are three methods of allocating a sample of size n among the different strata: equal allocation, proportional allocation, and optimal allocation.
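
Two of the three allocation methods can be sketched briefly in Python. The strata and their sizes below are made up for illustration, and the function names are hypothetical. Equal allocation gives every stratum the same number of units; proportional allocation gives stratum h a share n × N_h / N. Optimal allocation also needs the variability within each stratum, which is not given here, so it is left out.

```python
def equal_allocation(stratum_sizes, n):
    """Give every stratum the same share of the sample."""
    per_stratum = n // len(stratum_sizes)
    return {name: per_stratum for name in stratum_sizes}

def proportional_allocation(stratum_sizes, n):
    """Allocate n_h = n * N_h / N, rounded to whole units.

    Rounding can make the allocations sum to slightly more or less than n.
    """
    total = sum(stratum_sizes.values())
    return {name: round(n * size / total) for name, size in stratum_sizes.items()}

# Hypothetical strata (assumed for illustration): students by year of study.
strata = {"year_1": 400, "year_2": 300, "year_3": 200, "year_4": 100}
equal = equal_allocation(strata, n=100)
proportional = proportional_allocation(strata, n=100)
print(equal)
print(proportional)
```

With these assumed sizes, equal allocation samples 25 students per year, while proportional allocation samples 40, 30, 20, and 10.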

Advantages

1 The results of stratified random samples tend to be more accurate (have lower variance) since the grouping together of similar units controls for the variation within strata.
2 The sample obtained through stratification is more representative of the population
3 Stratification also permits separate analyses on each group, which researchers may find useful

Disadvantages

1 This method is more costly and difficult to organize, since it involves splitting the population into different strata and taking a sample from each stratum
2 There is a danger of splitting the population into too many small strata. This may mean that some of the strata may not contain any sample members or the sample may not be large enough to be spread across all of the strata
3 Sometimes there may be more than one variable that the survey needs to be stratified by

Systematic Random Sampling

Use the anticipated population size and planned sample size to determine the sampling fraction $f$ to be used.
Determine a sequence in which sampling units are added to the list, e.g. entry in a register, order on a route.
Determine the sampling interval $k = \frac{1}{f}$.
Randomly select a number between 1 and $k$.
Select this sampling unit.
Then select every $k$-th sampling unit thereafter.

Example

Target population: Patients attending the Out Patient Department (OPD) at QECH
Number of patients expected in study period = 20,000
Sample size = 200
Sampling fraction $f = 1/100$; $k = 1/f = 100$
Select a random number between 1 and 100, say 42
Approach the 42nd patient, then the 142nd, 242nd, etc.
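
The OPD example can be reproduced with a few lines of Python. The function name `systematic_positions` and its parameters are hypothetical; the numbers (20,000 patients, sample of 200, start at 42) come from the example. If no start is supplied, one is drawn at random between 1 and k.

```python
import random

def systematic_positions(population_size, sample_size, start=None, seed=None):
    """Return the sampling interval k and the selected positions (1-based)."""
    k = population_size // sample_size  # k = 1 / f = N / n
    if start is None:
        # Random start between 1 and k, inclusive.
        start = random.Random(seed).randint(1, k)
    positions = list(range(start, population_size + 1, k))
    return k, positions

k, positions = systematic_positions(20_000, 200, start=42)
print(k, positions[:3], len(positions))  # 100 [42, 142, 242] 200
```

Only the starting point is random; every later position is fixed by it, which is why the order of the list matters in systematic sampling.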

Cluster Random Sampling

Used when members of the study population are naturally in groups, called clusters, e.g.
villages for residence,
schools for education,
health center catchment areas for health care, etc.
Obtain a simple random sample of clusters.
Sample members from the selected clusters only.
May select only a subset of the members of each selected cluster.

Cluster Sampling Example

What proportion of standard 1 students sleep under a mosquito net in Mangochi district?
Study population: Standard 1 students aged 6 in Mangochi district
Population size: approximately 3,000
Number of schools = 54
Randomly select 7 schools and obtain data for every standard 1 student in the chosen schools
Final sample size is approximately $3{,}000 \times \frac{7}{54} \approx 389$

Do all members of the study population have a known probability of being included in the sample?
If Yes:
probability a school is selected $= \frac{7}{54} \approx 0.13$
since all students in selected schools are selected, this is also the probability that a student is selected
Sometimes sampling of clusters uses sampling in proportion to size
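
The Mangochi example can be traced in a short Python sketch. The school numbering 1 to 54 and the seed are assumptions for illustration; the figures 3,000, 54, and 7 come from the example. The expected sample size assumes schools of roughly equal size, which the example implies but does not state.

```python
import random

N_STUDENTS = 3000  # approximate population size from the example
N_SCHOOLS = 54
N_SELECTED = 7

rng = random.Random(2024)
# Simple random sample of clusters (schools numbered 1..54, an assumption).
selected_schools = sorted(rng.sample(range(1, N_SCHOOLS + 1), N_SELECTED))

# Every student in a selected school is included, so a student's
# inclusion probability equals the school's selection probability.
inclusion_probability = N_SELECTED / N_SCHOOLS

# Expected sample size, assuming schools are of roughly equal size.
expected_sample_size = N_STUDENTS * inclusion_probability

print(selected_schools)
print(round(inclusion_probability, 2))
print(round(expected_sample_size))
```

The last two printed values match the 0.13 and 389 quoted in the example.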

What are the sampling units?
In cluster sampling the primary sampling units are the clusters.
Individuals that make up the clusters are secondary sampling units.
For the standard 1 students example:
primary sampling units - schools
secondary sampling units - students

Multistage cluster sampling

The End.

81 / 1

