
LECTURE NOTES IN STATISTICS – PART 1 (OVERVIEW)

Statistics Defined

Statistics is the field of study dealing with the collection, presentation (tabulation), analysis, and interpretation of numerical or quantitative data.

The Role of Statistics in Research


 Statistics is used by researchers in many fields to organize, analyze, summarize, interpret, and present data in a meaningful and convenient way.
 It enables the researcher to give exact descriptions of the collected data in his research.
 It enables the researcher to develop accurate and reasonable inferences from the relevant data he collected.
 Knowledge of statistics enables researchers and consumers of research to evaluate the credibility and usefulness of information derived from data, so that they can make appropriate decisions or actions based on the data collected.

Types of Statistics

Descriptive Statistics
Inferential Statistics

Descriptive Statistics

Includes measures that provide a concise and meaningful summary and description of the data in a study. Data are summarized and described in statistical tables and graphs.

 Measures of Central Tendency

Mean – the other term for average. It describes the midpoint of two extremes, obtained by dividing the sum of the numerical quantities by the number of observations. The mean is commonly used in scaling the degree or level of something.
Median – the value separating the higher half from the lower half of a set of data. It is simply the middle score when all the scores are arranged from highest to lowest.
Mode – the most frequently occurring score in a distribution, or the value with the greatest frequency.
 Measures of Dispersion
Range – the distance between the minimum and maximum scores; the difference between the largest and smallest values in the distribution.
Variance – the average of the squared differences from the mean.
Standard Deviation – a commonly used measure of dispersion; simply the square root of the variance. It measures the dispersion of scores around the mean.

 Frequency and Percentage

Frequency and Percent Distribution – an organized picture of data that shows how individual scores are distributed in each category through the number and proportion of occurrences. These are used as measures in quantifying and classifying something based on the selected variables, particularly when establishing the distribution of respondents or subjects of the study based on the selected grouping variables.
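As a rough illustration of the descriptive measures above, here is a minimal Python sketch using only the standard library; the score and response lists are made up for demonstration.

# Minimal sketch of the descriptive measures above; all data are hypothetical.
import statistics
from collections import Counter

scores = [78, 85, 85, 90, 72, 88, 95, 85, 70, 82]

mean = statistics.mean(scores)              # sum of the scores / number of scores
median = statistics.median(scores)          # middle score of the sorted list
mode = statistics.mode(scores)              # most frequently occurring score
score_range = max(scores) - min(scores)     # largest value minus smallest value
variance = statistics.pvariance(scores)     # average squared difference from the mean
std_dev = statistics.pstdev(scores)         # square root of the variance

print(mean, median, mode, score_range, variance, std_dev)

# Frequency and percent distribution of a categorical variable.
responses = ["Male", "Female", "Female", "Male", "Female", "Female", "Male"]
for category, frequency in Counter(responses).items():
    print(category, frequency, round(frequency / len(responses) * 100, 1))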

Inferential Statistics

Tests used to make inferences about a population using data from representative samples drawn from that population. They are used to judge the probability that an observed difference or relationship is significant or not, and to test hypotheses.
Parametric Tests – used for interval and ratio data, data sets with a normal distribution, homogeneous variances, linear relationships, and data that are independent.

Non-Parametric Tests – used for nominal (categorical) and ordinal (ranked) data, unless the ordinal data have an average or higher-order scale (e.g., Likert with at least 5 points), and for data sets that do not meet the criteria for parametric tests.

 Types of Parametric Tests and Uses

Independent Samples (Unpaired) T Test – used to compare the means of two independent groups to observe the statistical significance of the difference. (A short computational sketch of these parametric tests appears after this list.)

Dependent Samples (Paired Samples) T Test – used to compare the means of two related groups to observe the statistical significance of the difference (e.g., pre-test and post-test scores).

Analysis of Variance (ANOVA) – used to compare means to observe the statistical significance of the variance that exists between three or more independent groups.

-One-way ANOVA – also known as single-factor ANOVA, since there is only one independent variable or factor used in the comparison.

-Factorial ANOVA – may be two-way or three-way; two or more independent variables are considered in order to observe their interaction effects on the dependent variable.

-Multivariate ANOVA – a multivariate analysis that allows us to observe the significance of the variances in two or more dependent variables as the effect of one independent variable when they are examined simultaneously.
Least Significant Difference Test and Duncan’s Multiple Range Test (Post Hoc Tests) – multiple comparison tests used to point out where the significant difference lies among the groups or treatments compared.

Pearson Product Moment Correlation Coefficient – a measure of the strength of a linear association between two variables, denoted by r.

Regression Analysis – generates an equation to describe the statistical relationship between the predictor (independent) variable, referred to as X, and the response (dependent) variable, referred to as Y. The purpose of quantifying the relationship is to describe it and to predict the outcome of Y from X.

 Simple Linear Regression – uses one independent variable to explain and/or predict the outcome of the dependent variable.

 Multiple Regression – (multivariate analysis) uses two or more independent variables to explain and/or predict the outcome of the dependent variable.
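As a rough illustration of the parametric tests above, here is a minimal Python sketch, assuming the scipy library is available; all score lists are hypothetical, and a p-value below 0.05 is read as a significant result. Post hoc comparisons such as the LSD and Duncan's tests are not part of this sketch and usually require an additional statistics package.

# Minimal sketch of the parametric tests above; all data values are hypothetical.
from scipy import stats

# Independent samples t-test: two separate groups.
group_a = [78, 85, 90, 72, 88]
group_b = [70, 80, 75, 68, 74]
t_ind, p_ind = stats.ttest_ind(group_a, group_b)

# Paired samples t-test: pre-test and post-test scores of the same group.
pre_test = [60, 65, 58, 70, 62]
post_test = [72, 75, 70, 80, 74]
t_rel, p_rel = stats.ttest_rel(pre_test, post_test)

# One-way ANOVA: three independent groups.
f_stat, p_anova = stats.f_oneway([85, 88, 90, 86], [78, 80, 82, 79], [92, 95, 91, 94])

# Pearson correlation and simple linear regression: X predicts Y.
x = [2, 4, 5, 7, 8, 10]
y = [60, 65, 70, 78, 82, 90]
r, p_r = stats.pearsonr(x, y)
reg = stats.linregress(x, y)          # fits y = intercept + slope * x

print(p_ind, p_rel, p_anova, p_r, reg.slope, reg.intercept)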

 Types of Non-parametric Tests and Uses

Mann-Whitney U (2 Independent Samples) – used to test the difference in ranks of scores of two independent groups. It is often considered the non-parametric alternative to the independent samples t-test. (A short computational sketch of these non-parametric tests appears after this list.)

Kruskal-Wallis H (K Independent Samples) – a rank-based (continuous ordinal data transformed into ranks) non-parametric test used to observe the statistical significance of the difference in ranks of scores of three or more independent groups. It is considered the non-parametric alternative to the one-way ANOVA.

Wilcoxon (2 related samples) – used to test the difference in ranks of scores of two related groups. The Wilcoxon signed-rank test is the non-parametric equivalent of the dependent t-test. It is used to compare two sets of scores that come from the same participants. This can occur when we wish to investigate any change in scores from one time point to another, or when individuals are subjected to more than one condition.

Friedman (K related samples) – used to test the difference in ranks of scores of three or more related groups. The Friedman test is the non-parametric alternative to the one-way ANOVA with repeated measures. It is used to test for differences between groups when the dependent variable being measured is ordinal. It can also be used for continuous data that have violated the assumptions necessary to run the one-way ANOVA with repeated measures (e.g., data with marked deviations from normality).

Chi-Square (χ²) Test of Independence – used to observe the significance of the association between two categorical variables displayed in a contingency table.

Spearman Rank-Difference Correlation Coefficient – the non-parametric version of the Pearson Product Moment Correlation. It is used to observe the strength and direction of the association between two ranked variables.
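A minimal sketch of the non-parametric tests above, again assuming scipy is available and using hypothetical data:

# Minimal sketch of the non-parametric tests above; all data values are hypothetical.
from scipy import stats

# Mann-Whitney U: ranks of two independent groups.
u_stat, p_u = stats.mannwhitneyu([3, 4, 2, 5, 4], [1, 2, 2, 3, 1])

# Kruskal-Wallis H: ranks of three or more independent groups.
h_stat, p_h = stats.kruskal([3, 4, 5, 4], [2, 2, 3, 1], [5, 5, 4, 5])

# Wilcoxon signed-rank: two related sets of scores (same participants).
w_stat, p_w = stats.wilcoxon([10, 12, 9, 14, 11], [13, 15, 12, 16, 14])

# Friedman: three or more related sets of scores (repeated measures).
chi_f, p_f = stats.friedmanchisquare([4, 5, 3, 4], [5, 6, 4, 5], [6, 7, 5, 6])

# Chi-square test of independence: 2 x 2 contingency table of observed counts.
chi2, p_c, dof, expected = stats.chi2_contingency([[20, 30], [25, 25]])

# Spearman rank correlation: strength and direction of association of two ranked variables.
rho, p_s = stats.spearmanr([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])

print(p_u, p_h, p_w, p_f, p_c, p_s)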
Sources of Data

 Primary data – first-hand data collected by the investigator himself.
 Secondary data – data taken from what has been published or compiled by other researchers, organizations, research institutions, and other agencies.

Types of Data

 Qualitative data – a categorical measurement expressed not in terms of numbers, but rather by means of a natural language description. In statistics, it is often used interchangeably with "categorical" data.
 Quantitative data – a numerical measurement acquired through counting or measuring.

Measurement Scales

 Nominal – categorical; numbers are simply used as identifiers or for classification purposes (e.g., 1 – male and 2 – female). It does not have quantitative meaning. When measuring using a nominal scale, one simply names or categorizes responses. Gender, handedness, favorite color, and religion are examples of variables measured on a nominal scale. The essential point about nominal scales is that they do not imply any ordering among the responses. For example, when classifying people according to their favorite color, there is no sense in which green is placed "ahead of" blue.

 Ordinal – ranked data used to classify and order the classes. Examples are values categorically ordered such as: (5) Very Highly Satisfied; (4) Highly Satisfied; (3) Moderately Satisfied; (2) Dissatisfied; and (1) Very Dissatisfied. A researcher wishing to measure consumers' satisfaction might ask them to specify their level of satisfaction ranging from least to most. Unlike nominal scales, ordinal scales allow comparisons of degree; for example, our satisfaction ordering makes it meaningful to assert that one person is more satisfied than another.
 Interval – a measurement where the difference between two values is meaningful. Interval scales are numeric scales in which we know not only the order but also the exact differences between the values. However, they do not have a true zero point.

 Ratio – the highest level of measurement; it has a meaningful zero and therefore provides the absolute magnitude of an attribute. For example, height, weight, length, area, speed, and velocity are measured on a ratio scale.

Methods of Gathering Data

 Interview – a person-to-person exchange of information. It uses an interview schedule or guide questionnaire.
 Questionnaire – a set of prepared questions to be answered by the respondents.
 Experimentation – data are taken from test results.
 Observation – recording the behaviour or conditions of a thing, person, or organism as they naturally occur.

Methods of Presenting Data

 Textual – narrative form
 Tabular – arranging or summarizing data in a statistical table
 Graphical – use of graphs such as pie, bar, line, pictograph, and other graphical illustrations

Sampling

 Sampling – the process of taking a subset of subjects that is representative of the entire population. It is important to consider how the sample is selected to make sure that it is unbiased and representative of the population. The sampling procedure has a significant impact on the quality of the research results/findings. The size of our sample dictates the amount of information we have and therefore, in part, determines the precision or level of confidence in our sample estimates. It affects the credibility of the results or findings of the study.

Sampling Techniques

 Probability – any method that utilizes some form of random selection. Random sampling is the process or procedure of selecting samples without bias or predetermined choice. This assures each unit in the population an equal chance of being selected.
 Non-Probability – samples are selected based on the subjective judgement of the researcher rather than by random selection.

Types of Probability Sampling

 Simple Random – a technique that gives each member of the population an equal probability of being chosen as a sample. One way of obtaining a simple random sample is by lottery.

 Systematic Sampling – a statistical method involving the selection of samples from an ordered sampling frame (population list). In systematic sampling the researcher randomly picks the first item or subject from the population, then selects every nth subject from the list. This requires drawing lots to determine the random start.

 Stratified Sampling – a probability sampling technique wherein the researcher divides the entire population into different subgroups or strata, then randomly selects the final subjects proportionally from the different subgroups.

 Cluster Sampling – a sampling technique used when "natural" but relatively homogeneous groupings are evident in a statistical population. It is often used in marketing research. In this technique the researcher divides the population into separate groups or clusters (street, block, district). Then one or more clusters are selected at random and everyone within the chosen clusters is sampled. The main reasons for cluster sampling are economy and feasibility.

 Multi-stage Sampling – known as 'multistage' because there are multiple stages, or steps, done to obtain the samples. With multi-stage sampling, samples are selected using combinations of different sampling methods. It may also refer to a sampling plan where samples are finally obtained from the smallest sampling units after several random stages are carried out. For example, from a region select provinces (first stage), from the selected provinces select cities/municipalities (second stage), and from the municipalities select barangays where the samples are finally drawn (third stage). The reasons are convenience, economy, and efficiency.
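To make the random-selection mechanics concrete, here is a minimal Python sketch of simple random, systematic, and stratified selection; the sampling frame, sample size, and strata are hypothetical.

# Minimal sketch of probability sampling; the frame, sizes, and strata are hypothetical.
import random

population = list(range(1, 1001))    # an ordered sampling frame of 1,000 units
sample_size = 286

# Simple random sampling: every unit has an equal chance (like a lottery).
simple_random = random.sample(population, sample_size)

# Systematic sampling: draw lots for a random start, then take every nth unit.
interval = len(population) // sample_size       # n is about 3 here
start = random.randrange(interval)              # the random start
systematic = population[start::interval][:sample_size]

# Stratified sampling: random selection within each subgroup (stratum);
# proportional allocation of the stratum sample sizes is shown in a later section.
strata = {"stratum_1": list(range(1, 501)), "stratum_2": list(range(501, 1001))}
stratified = {name: random.sample(units, 20) for name, units in strata.items()}

print(len(simple_random), len(systematic), sum(len(s) for s in stratified.values()))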

Types of Non-probability Sampling Designs

1. Quota Sampling – each investigator is instructed to collect information from a predetermined number of individuals (the quota), but the selection of individuals is left to the investigator's choice.
2. Snowball Sampling - in this type of sampling, the researcher asks the
initial subject to identify another potential subject who also meets the
criteria of the research. It is usually done when there is a very small
population size. The downside of using a snowball sample is that it is
hardly representative of the population.
3. Convenience sampling – the samples are selected because they are accessible to the researcher. Subjects are chosen simply because they are easy to recruit. This technique is considered the easiest, cheapest, and least time-consuming.
4. Consecutive sampling – very similar to convenience sampling except that it seeks to include ALL accessible subjects as part of the sample. This sampling technique can be considered the best of all non-probability/non-random samples because it includes all available subjects, which makes the sample a better representation of the entire population.
5. Judgmental/Purposive sampling – subjects are chosen to be part of the sample with a specific purpose in mind. With judgmental sampling, the researcher believes that some subjects are more fit for the research than other individuals. This is the reason why they are purposively chosen as subjects.

Determining Sample Size

Mathematical Equation

n = N / (1 + Ne²)
Where:

n = sample size

N = population size

e = 0.05 (sampling error)

Therefore, if we have a population size of 1,000, the sample size is 286.

Population Size – 1,000

n = 1,000 / (1 + 1,000(0.05)²) = 1,000 / 3.5 = 285.7 ≈ 286
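A minimal sketch of the same computation in Python (the function name is made up for illustration):

# Minimal sketch of the sample-size formula n = N / (1 + N * e^2).
def sample_size(population_size, margin_of_error=0.05):
    return round(population_size / (1 + population_size * margin_of_error ** 2))

print(sample_size(1000))   # 286, matching the worked example above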
Determining Proportionate Samples

Mathematical Equation

n1 = N1 (n / Nt)

Where:

n1 – sample size of the first group
N1 – population size of the first group
Nt – total population
n – total sample size

Sample Situation

Population size of the first group (N1) - 119
Population size of the second group (N2) - 210
Population size of the third group (N3) - 325
Population size of the fourth group (N4) - 346

Computations

n1 = 119 (286 / 1,000) = 34

n2 = 210 (286 / 1,000) = 60

n3 = 325 (286 / 1,000) = 93

n4 = 346 (286 / 1,000) = 99
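The same proportional allocation, as a minimal Python sketch:

# Minimal sketch: proportional allocation of the total sample (n = 286)
# across the four groups in the worked example above.
group_sizes = {"N1": 119, "N2": 210, "N3": 325, "N4": 346}
total_population = sum(group_sizes.values())    # 1,000
total_sample = 286

allocation = {group: round(size * total_sample / total_population)
              for group, size in group_sizes.items()}
print(allocation)   # {'N1': 34, 'N2': 60, 'N3': 93, 'N4': 99}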
Compute the total and proportionate sample sizes for the following situation, using a .05 margin of error:

First Year – 251
Second Year – 159
Third Year – 405
Fourth Year – 350

Validity of Research Instrument

 Validity – the extent to which a research instrument measures what it intends to measure. Validity in data collection means that your findings truly represent the phenomenon you are claiming to measure. Validity measures whether or not the survey has the right questions in order to answer the research questions it aims to answer. Validity makes sure that the test or questionnaire that is prepared actually covers all aspects of the variable being studied (content validity). It measures the degree to which a test measures what it claims, or purports, to be measuring.

Evaluating the Validity of Research Instrument

 Steps:
1. Construct the questionnaire.
2. Administer the validity test by subjecting the questionnaire to the judgement of four or five experts in the field of study, using the Good and Scates criteria.
3. Compute the mean of the ratings given by the four or five jurors.
4. See the verbal description of the mean score in the rating scale used.
Rating Scale:

4.20-5.00 - Very Good
3.40-4.19 - Good
2.60-3.39 - Average
1.80-2.59 - Poor
1.00-1.79 - Very Poor

 Sample Situation

Mean Scores Obtained from the Four Jurors


Juror 1 – 3.89
Juror 2 – 3.90
Juror 3 - 4.0
Juror 4 – 3.67
Sum of ratings = 15.46
Overall Mean = 15.46 / 4 = 3.865
Verbal Description: The validity index of the questionnaire is verbally described as Good.

Reliability of Research Instrument

 Reliability – the quality of being reliable, dependable, or trustworthy. It is the degree to which an assessment tool produces stable and consistent results. A research instrument is reliable when it yields the same results on repeated trials and when respondents answer the given test items consistently (internal consistency). Reliability tests provide researchers the assurance that their data are not the result of spurious (not valid or well-founded) causes. The degree of reliability indicates whether the questions are written objectively and are easily understood by the respondents. Why is this important to consider? It is important because unreliable data can lead to wrong research results.

 Causes of Unreliability

Measuring instruments may malfunction, be influenced by irrelevant circumstances of their use, or be misread. Content analysts may disagree on the readings of a text. Coding instructions may not be clear. The definitions of categories may be ambiguous (having more than one meaning) or may not seem applicable to what they are supposed to describe. Coders may get tired, become inattentive to important details, or be diversely prejudiced.

 Improving the Reliability of Research Instrument

1. Writing items clearly.
2. Making instructions easy to understand.
3. Adhering to proper test administration.
4. Providing consistent scoring.
 Tests Employed to Examine the Reliability of Research Instrument

1. Test-retest – Test-retest reliability is a measure of reliability obtained by administering the same test twice, over a period of time, to the same group of individuals and checking whether it yields similar results. The data are analysed using an appropriate test of correlation to determine consistency.

2. Cronbach’s Alpha – an estimate of reliability, specifically the internal consistency, of a test scale. It is used to measure how closely related the answers of respondents are across the test items, and thus determines the degree of consistency. When answers are closely related to one another, Cronbach’s alpha will be closer to 1; when they are not closely related, Cronbach’s alpha will be closer to 0.

3. Kuder-Richardson (KR-21) – involves only one test administration; it assesses the consistency of results across items within a test. It is applied to multiple-choice, short-answer, and fill-in-the-blank types of tests.
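A minimal Python sketch of these three reliability estimates; all score values are hypothetical, and the Cronbach's alpha and KR-21 formulas used here are the commonly cited ones, so they should be checked against your own references.

# Minimal sketch of the reliability estimates above; all data are hypothetical.
import statistics
from scipy import stats

# 1. Test-retest: correlate the two administrations of the same test.
first_admin = [15, 12, 18, 10, 16]
second_admin = [14, 13, 17, 11, 15]
r, p = stats.pearsonr(first_admin, second_admin)

# 2. Cronbach's alpha from a respondents-by-items score matrix:
#    alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores)
item_scores = [
    [4, 5, 4, 4],
    [3, 4, 3, 3],
    [5, 5, 4, 5],
    [2, 3, 2, 2],
    [4, 4, 5, 4],
]
k = len(item_scores[0])
item_variances = [statistics.pvariance(col) for col in zip(*item_scores)]
total_variance = statistics.pvariance([sum(row) for row in item_scores])
alpha = (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# 3. KR-21 from total scores alone:
#    KR-21 = (k / (k - 1)) * (1 - M * (k - M) / (k * s^2)),
#    where k = number of items, M = mean total score, s^2 = variance of total scores.
items = 20
totals = [15, 12, 18, 10, 16, 14, 17, 11, 13, 19]
m = statistics.mean(totals)
s2 = statistics.pvariance(totals)
kr21 = (items / (items - 1)) * (1 - m * (items - m) / (items * s2))

print(round(r, 3), round(alpha, 3), round(kr21, 3))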

Stages of Data Collection and Statistical Procedures

 Formulation of the statement of the problem and hypothesis
 Making of the questionnaire
 Validity test administration for questionnaire
 Reliability test administration for questionnaire
 Identifying the target population
 Sampling technique and determination of sample size
 Conduct of data collection
 Data consolidation and statistical analysis
 Presentation of findings

(End of Part 1)
