
Statistics

In a research methods class, two students presented their research problems and outlined their studies. Student A's research examined the effect of organic versus inorganic feeds on fish growth. Student B studied the effect of different soil types on tomato plant height. Both students stated their independent and dependent variables, hypotheses, and identified the appropriate statistical tests to analyze their data - a t-test for Student A and ANOVA for Student B. The document then discussed key concepts in statistics like measures of central tendency, variability, and choosing appropriate statistical tests based on variable types.

In a research class, students are tasked to present their research problems.
To guide the flow of the presentation, each student must first state the
problem, then the variables, the hypothesis, and the most appropriate
statistical tool to be used. Two students volunteer to present. These are
their outputs.
Student A - Research Problem: Is there a significant difference between the
effects of organic and inorganic feeds on the growth of Oreochromis niloticus?

Student B - Research Problem: Is there a significant effect of the type of
soil used (loam, sandy, and clay) on the height of tomato plants?
Student A - Independent variable: type of feed (organic and inorganic)

Student B - Independent variable: type of soil (loam, sandy, and clay)
Student A - Dependent variable: the growth of Oreochromis niloticus

Student B - Dependent variable: the height of tomato plants
Student A - Hypothesis: There is no significant difference between the
effects of organic and inorganic feeds on the growth of Oreochromis niloticus.

Student B - Hypothesis: There is no significant difference in the height of
tomato plants among the types of soil used (loam, sandy, and clay).
Student A - Statistical Tool to use: T-test

Student B - Statistical Tool to use: ANOVA
Statistics plays an increasingly important role in almost every aspect of
human endeavor. Its importance ranges from the simple computation of grades
to gathering information regarding the vaccine for COVID-19. Its influence
has spread to agriculture, medicine, communication, economics, political
science, sociology, education, and numerous other fields.
In quantitative research, a decision must be made: whether to reject or
accept the hypotheses. Before doing so, pertinent information must be
gathered, and a plan should be conceived on how to deal with the information
gathered. To give meaning to this information and interpret it, statistical
methods must be employed.
What is statistics?

Statistics is a collection of methods for planning experiments, obtaining
data, and then analyzing, interpreting, and drawing conclusions based on the
data (Alferes & Duro, 2010). It is divided into two main areas: Descriptive
and Inferential.
Descriptive Statistics summarizes or describes the essential characteristics
of a known set of data. For example, the Department of Health conducts a
tally to determine the number of COVID-19 cases per day in the Philippines.
Inferential Statistics, on the other hand, uses sample data to make
inferences about a population. It consists of generalizing from samples to
populations, performing hypothesis testing, determining relationships among
variables, and making predictions. For example, suppose you want to find out
whether Filipinos are willing to get the COVID-19 vaccine. In such a case, a
smaller sample of the population is surveyed, results are drawn from that
sample, and the analysis is extended to the larger population.
Frequency Distribution
The most convenient way of organizing data is by constructing a frequency
distribution. A frequency distribution is a collection of observations
produced by sorting them into classes and showing their frequency, or number
of occurrences, in each class. For example, twenty-five students were given
a blood test to determine their blood types:

A   B   B   AB  O
O   O   B   AB  B
B   B   O   A   O
A   A   O   O   A
AB  O   O   B   AB
Category   Tally         Frequency
A          IIIII         5
B          IIIII-II      7
AB         IIII          4
O          IIIII-IIII    9
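
The tally above can also be reproduced programmatically. The following is a minimal sketch, assuming the 25 blood types are entered row by row as listed in the example; the variable names are illustrative only.

```python
from collections import Counter

# Blood types of the 25 students, entered row by row from the example.
blood_types = [
    "A", "B", "B", "AB", "O",
    "O", "O", "B", "AB", "B",
    "B", "B", "O", "A", "O",
    "A", "A", "O", "O", "A",
    "AB", "O", "O", "B", "AB",
]

# Counter sorts each observation into its class and counts the occurrences.
frequency = Counter(blood_types)

print(f"{'Category':<10}{'Frequency':>9}")
for category in ("A", "B", "AB", "O"):
    print(f"{category:<10}{frequency[category]:>9}")
print(f"{'Total':<10}{sum(frequency.values()):>9}")
```

The total row is a quick check that no observation was skipped while tallying.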
Measures of Central Tendency or Position or Average

When scores and other measures have been tabulated into a frequency
distribution, the next task is to calculate a measure of central tendency or
central position. This measure of central tendency is synonymous with the
word "average". An average is a typical value that tends to describe the set
of data.
The three main measures of central tendency are the mean, the median, and
the mode.
Mean
The mean, or simply the average, is the most frequently used measure and can
be described as the arithmetic average of all scores or groups of scores in
a distribution. It is computed by adding all the scores or data and then
dividing the sum by the total number of cases.
Median
The median is the middle-most value in a list of items arranged in
increasing or decreasing order. If the number of items is odd, there will be
exactly one item in the middle. If the number of items is even, the median
is the average of the two middle items.
Mode
Finally, the mode is the score or group of scores that occurs most
frequently. Some distributions have no mode at all. Others may have more
than one mode. When a distribution has two modes, it is called bimodal.
Below is an example of how to get the measures of central tendency of a
distribution.
Laboratory tests reveal the incubation period (measured in days) of a virus
among the 30 infected residents of Brgy. Salay:
10 12 10 14 14 13
11 12 14 14 10 12
12 11 13 14 11 12
14 12 12 11 10 10
12 13 12 12 14 14
To work with this data, first arrange it from lowest to highest (or vice
versa):
10 10 10 10 10
11 11 11 11
12 12 12 12 12 12 12 12 12 12
13 13 13
14 14 14 14 14 14 14 14

Mean = Σx / N = 365 / 30 ≈ 12.17

Median = 12. Since there are 30 cases, take the 15th and 16th values in the
sorted list (12 and 12), add them, and divide by 2: (12 + 12) / 2 = 12.

Mode = 12, since this is the most frequent score (it occurs 10 times).
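
Recomputing the three measures from the raw values is a useful check on hand arithmetic. The following is a minimal sketch using Python's standard `statistics` module, assuming the 30 incubation periods are entered exactly as listed in the example; the variable names are illustrative only.

```python
import statistics

# Incubation periods (in days) of the 30 residents, row by row.
incubation_days = [
    10, 12, 10, 14, 14, 13,
    11, 12, 14, 14, 10, 12,
    12, 11, 13, 14, 11, 12,
    14, 12, 12, 11, 10, 10,
    12, 13, 12, 12, 14, 14,
]

n = len(incubation_days)       # total number of cases (N)
total = sum(incubation_days)   # sum of all scores (Σx)
mean = total / n               # arithmetic average

# median() sorts the data and averages the two middle values when N is even.
median = statistics.median(incubation_days)

# multimode() returns every value tied for the highest frequency,
# so a bimodal distribution would yield a list of two values.
modes = statistics.multimode(incubation_days)

print(f"N = {n}, sum = {total}")
print(f"Mean   = {mean:.2f}")
print(f"Median = {median}")
print(f"Mode   = {modes}")
```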


Measures of Variation/Dispersion

The previous section focused on averages, or measures of central tendency.
The averages are supposed to be the central scores of a given set of data.
However, not all features of a given data set may be reflected by the
averages. Suppose two different groups of 5 students are given identical
20-item quizzes in Science. The data below were the results.
Scores of each group
Group 1   Group 2
14        5
13        19
18        18
14        14
11        14

Averages of each group
          Gr. 1   Gr. 2
Mean      14      14
Median    14      14
Mode      14      14
As the averages show, the two groups have exactly the same mean, median, and
mode. Yet the two groups are clearly different: the scores of Group 2 are
more widely scattered than those of Group 1. This characteristic, called
variability or dispersion, is not reflected by averages. The three basic
measures of dispersion are range, variance, and standard deviation.
The Range is the simplest measure of dispersion to calculate. It is the
difference between the highest (largest) value and the lowest (smallest)
value in a given set of data. A larger range suggests greater variation or
dispersion. On the other hand, a smaller range suggests less variation or
dispersion.
Variance measures how far a data set is spread out. It is mathematically
defined as the average of the squared differences from the mean.
Standard Deviation is the most commonly used measure of dispersion. It
indicates how closely the values of the given data set are clustered around
the mean. It is computed by getting the positive square root of the
variance. A lower standard deviation means that the values of the given set
of data are spread over a smaller range around the mean. On the other hand,
a greater standard deviation means that the values are spread over a larger
range around the mean.
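
The three measures can be computed for the two quiz groups to show what the averages hide. The following is a minimal sketch, assuming the population formulas (dividing by N), which match the definition of variance given above; the sample versions, `statistics.variance` and `statistics.stdev`, divide by N - 1 and would give larger values. The helper name `describe_spread` is illustrative only.

```python
import statistics

# Quiz scores of the two groups from the example.
group_1 = [14, 13, 18, 14, 11]
group_2 = [5, 19, 18, 14, 14]


def describe_spread(scores):
    """Return (range, population variance, population standard deviation)."""
    value_range = max(scores) - min(scores)   # highest minus lowest value
    variance = statistics.pvariance(scores)   # mean of squared deviations
    std_dev = statistics.pstdev(scores)       # positive square root of variance
    return value_range, variance, std_dev


for name, scores in (("Group 1", group_1), ("Group 2", group_2)):
    value_range, variance, std_dev = describe_spread(scores)
    print(f"{name}: mean = {statistics.mean(scores)}, "
          f"range = {value_range}, variance = {variance:.2f}, "
          f"standard deviation = {std_dev:.2f}")
```

Both groups share a mean of 14, but every measure of dispersion is larger for Group 2.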
Statistical tests are used in hypothesis testing.
They can be used to: determine whether a
predictor variable has a statistically significant
relationship with an outcome variable and
estimate the difference between two or more
groups.
Before deciding what statistical tool will be used in
one's study, a knowledge of the types of variables is
essential because it will help you determine what type
of statistical tool is appropriate.
Choose the test that fits the types of predictor (independent) variables and
outcome (dependent) variables you have collected. Statistical tests are used
to derive a generalization about the population from the sample. A
statistical test is a formal technique that relies on a probability
distribution to judge how reasonable the hypothesis is. Hypothesis tests of
differences are classified as parametric and non-parametric tests.
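A parametric test assumes that the data come from a population whose
distribution is known and can be described by parameters, for example a
normal distribution with a certain mean and standard deviation. A
non-parametric test, on the other hand, makes no such assumption about the
population parameters.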
Parametric Tests
Parametric tests usually have stricter requirements than non-parametric
tests and can make stronger inferences from the data. They can only be
conducted with data that adheres to the standard assumptions of statistical
tests. The most common types of parametric tests include regression tests,
comparison tests, and correlation tests. Below is a flowchart that will help
us determine the appropriate statistical tool for parametric tests.
Flowchart: choosing a parametric test

Is the predictor variable categorical or quantitative?
  Predictor is categorical. Is the outcome variable categorical or quantitative?
    Outcome is categorical: Chi-square test
    Outcome is quantitative: do a comparison of means test. How many groups are being compared?
      Two: T-test
      More than two: how many outcome variables?
        One: ANOVA
        More than one: MANOVA
  Predictor is quantitative. Is the outcome variable categorical or quantitative?
    Outcome is categorical: Logistic regression
    Outcome is quantitative: how many predictor variables?
      One: Simple regression
      More than one: Multiple regression
Example: The Effect of the Amount of Chlorine on the Color of Algae. First
identify your independent and dependent variables, how many there are, and
their type, whether qualitative (categorical) or quantitative (numeric).
After identifying these, look at the diagram above to find the right
parametric statistical tool. In the given problem, the amount of chlorine is
the independent variable; it is numeric (quantitative), and 2 or more
amounts of chlorine may be used in the experiment. The dependent variable is
the color of the algae; it is categorical, and the color may vary. So,
looking at the above diagram, logistic regression is the appropriate tool.
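
The decision path of the flowchart can be written as a small function. The following is a minimal sketch, assuming the branches of the flowchart above; the function name `choose_parametric_test` and its parameters are hypothetical and exist only for illustration. It returns the name of a test and does not run any analysis.

```python
def choose_parametric_test(predictor, outcome,
                           n_predictors=1, n_groups=2, n_outcomes=1):
    """Return the parametric test suggested by the flowchart.

    predictor, outcome: "categorical" or "quantitative".
    n_predictors: number of predictor variables (quantitative branch).
    n_groups: number of groups compared (categorical predictor).
    n_outcomes: number of outcome variables (more than two groups).
    """
    valid = {"categorical", "quantitative"}
    if predictor not in valid or outcome not in valid:
        raise ValueError("variable types must be 'categorical' or 'quantitative'")

    if predictor == "categorical":
        if outcome == "categorical":
            return "Chi-square test"
        # Quantitative outcome: comparison of means.
        if n_groups < 2:
            raise ValueError("a comparison of means needs at least two groups")
        if n_groups == 2:
            return "T-test"
        return "ANOVA" if n_outcomes == 1 else "MANOVA"

    # Quantitative predictor.
    if outcome == "categorical":
        return "Logistic regression"
    return "Simple regression" if n_predictors == 1 else "Multiple regression"


# Student A: two kinds of feed (categorical), growth (quantitative).
print(choose_parametric_test("categorical", "quantitative", n_groups=2))
# Student B: three soil types (categorical), plant height (quantitative).
print(choose_parametric_test("categorical", "quantitative", n_groups=3))
# Chlorine example: amount (quantitative), algae color (categorical).
print(choose_parametric_test("quantitative", "categorical"))
```

The three calls reproduce the choices made in the text: a t-test for Student A, ANOVA for Student B, and logistic regression for the chlorine example.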
Non-parametric tests don't make as many
assumptions about the data and are useful when one
or more common statistical assumptions are violated.
However, the inferences they make aren't as strong as
with parametric tests. The table below shows how to
determine the appropriate non-parametric tool to be
used.
Statistical tool                   Predictor variable               Outcome variable
Spearman’s r                       Quantitative                     Quantitative
Chi-square test of independence    Categorical                      Categorical
Sign Test                          Categorical                      Quantitative
Kruskal-Wallis H                   Categorical (3 or more groups)   Quantitative
ANOSIM                             Categorical (3 or more groups)   Quantitative (2 or more outcome variables)
Wilcoxon Rank-Sum Test             Categorical (2 groups)           Quantitative (groups from different populations)
Wilcoxon Signed-Rank Test          Categorical (2 groups)           Quantitative (groups from the same population)
Statistical tools can be complex, especially for beginners. However,
according to Grobman (2017), the most commonly used tools in science
investigatory projects are chi-square tests, t-tests, and correlations. To
determine whether there is a statistically significant relationship between
the independent and dependent variables, we apply the standard rule of
thumb.
If the p-value is lower than 0.05, we reject the null hypothesis and accept
the alternative hypothesis. Licensed statisticians play a vital role in
computing and interpreting the results of the data gathered. In any
investigation, it is important to consult them to ensure that your results
are statistically correct. SPSS and Stata are some of the most common
software packages they use.
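
The 0.05 rule of thumb can be expressed as a short function. The following is a minimal sketch; the function name `decide` is hypothetical, and the p-values in the usage lines are made-up illustrations, not results from any study. In practice the p-value comes from the statistical test itself, computed with software such as that mentioned above.

```python
def decide(p_value, alpha=0.05):
    """Apply the rule of thumb: reject the null hypothesis if p < alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


# Hypothetical p-values, for illustration only.
print(decide(0.03))
print(decide(0.20))
```

A p-value exactly equal to 0.05 is not lower than 0.05, so under this rule the null hypothesis is not rejected in that case.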
