Science 3B Anachem Lab


UNIVERSITY OF CALOOCAN CITY

Biglang Awa St., Corner Catleya St., EDSA, Caloocan City


COLLEGE OF EDUCATION

A. Measure of Central Tendency


1. Ungrouped data
a. Mean
Mean pertains to the average of given numbers and is calculated by
dividing the sum of given numbers by the total number of numbers.

Procedure:
1. Add all the given values
2. Divide the sum of the given values by the total number of values
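
The two steps above can be sketched in Python as follows. The function name and the quiz scores are hypothetical and only illustrate the procedure.

```python
def mean(values):
    """Return the sum of the values divided by how many values there are."""
    if not values:
        raise ValueError("mean requires at least one value")
    return sum(values) / len(values)


scores = [85, 90, 78, 92, 80]  # hypothetical quiz scores
print(mean(scores))  # 85.0
```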

Application:
The mean is used for a wide array of purposes, such as
computing grades, budgeting finances, and sports analytics.

b. Median
Median is the middlemost value of the given ungrouped data if the data
is arranged in ascending order.

Procedure:
1. Arrange the given values/data in ascending order.
2. If dealing with an odd number of total values (n), add one (1) to the total
number of values.
3. Divide the answer by 2 to get the position of the middle term, (n + 1) / 2.
4. The value at that position is the median.
5. If dealing with an even number of total values, add the values of the two
middle terms, the (n / 2)th and the (n / 2 + 1)th.
6. Divide the sum by 2.
7. The answer will be the median.
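
A minimal Python sketch of the odd and even cases, assuming the data are plain numbers. The function name and sample values are hypothetical.

```python
def median(values):
    """Return the middle value of the sorted data (average of the two middle values if n is even)."""
    if not values:
        raise ValueError("median requires at least one value")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: 0-based index mid is the ((n + 1) / 2)-th term.
        return ordered[mid]
    # Even n: average the (n / 2)-th and (n / 2 + 1)-th terms.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 3, 9, 5, 1]))  # 5
print(median([7, 3, 9, 5]))     # 6.0
```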

Application:
Median is commonly used in reporting incomes as well as house prices in a
certain place. It indicates the typical value even when a few extreme values
are present, which helps make decisions easier.

c. Mode
Mode is the value that appears most often in the data.

Procedure:
1. Observe which value appears the most in the given data of numbers.
2. The one with the most appearances will be the mode.
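
As a sketch, the counting can be done with Python's `collections.Counter`. The function name is hypothetical, and returning every tied value is an assumption about how ties should be handled.

```python
from collections import Counter


def modes(values):
    """Return every value that shares the highest number of appearances."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([2, 4, 4, 5, 7, 4, 2]))  # [4]
print(modes([1, 1, 2, 2, 3]))        # [1, 2]
```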

Application:

Mode is very useful in business, such as in determining which products sell the
most and what the preferences of the customers are. Products that are purchased
and inquired about more frequently give the producers valuable data that will be
helpful in production.

2. Grouped data
a. Mean
The mean of grouped data is the average value, calculated by summing
up all the values in the data set and dividing by the total number of
observations.

Formula:
Mean = Σfx / N

where,
f = frequency
x = the class mark or the midpoint of the class interval
N = the number of observations/ scores

Procedure:
1. Find the middle value of each group: Average the lower and upper limits
of each class interval.
2. Multiply each middle value by the number of data points in its group.
3. Add up all the values obtained in the previous step.
4. Add up the total number of data points.
5. Divide the total from step 3 by the total from step 4 to get the mean.
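
The five steps can be sketched in Python like this. The frequency table is hypothetical, and each class is assumed to be given as (lower limit, upper limit, frequency).

```python
def grouped_mean(classes):
    """Approximate the mean of grouped data from (lower, upper, frequency) rows."""
    total_fx = 0.0
    total_f = 0
    for lower, upper, freq in classes:
        midpoint = (lower + upper) / 2   # step 1: class mark x
        total_fx += freq * midpoint      # steps 2-3: accumulate f * x
        total_f += freq                  # step 4: accumulate N
    return total_fx / total_f            # step 5


table = [(10, 19, 3), (20, 29, 5), (30, 39, 2)]  # hypothetical class intervals
print(grouped_mean(table))  # 23.5
```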

b. Median
The class interval that contains the (n/2)th score is the median class. It is
also the class where the less than cumulative frequency (<cf) is equal to or
exceeds (n/2) for the first time. This can be located under the <cf column of the
cumulative frequency distribution.

To approximate the median from a frequency distribution, use the formula

Median = lcb_m + [ (n/2 - <cf_(m-1)) / f_m ] × c

where,
median class = the class where the <cf is equal to or exceeds (n/2) for the first time
lcb_m = the lower class boundary of the median class
c = the class size
n = the total number of observations in the distribution
f_m = the frequency of the median class
<cf_(m-1) = the less than cumulative frequency of the class preceding the median
class

Procedure:
1. Add all the frequencies to get n, then divide by 2 to get n/2.
2. Starting from the first interval, add the frequencies one interval at a time
to obtain the less than cumulative frequency (<cf) of each interval.
3. Identify the median class: the first interval whose <cf is equal to or
exceeds n/2.
4. Use the formula for finding the median by substituting the values.
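
A minimal sketch of this procedure in Python, assuming equal class sizes and that the lower class boundaries are already known. The function name and the frequency table are hypothetical.

```python
def grouped_median(lower_boundaries, frequencies, class_size):
    """Approximate the median as lcb + ((n/2 - <cf of preceding class) / f) * c."""
    n = sum(frequencies)
    if n == 0:
        raise ValueError("frequencies must contain at least one observation")
    half = n / 2
    cumulative = 0  # <cf of the classes before the current one
    for lcb, freq in zip(lower_boundaries, frequencies):
        if cumulative + freq >= half:
            # First class whose <cf reaches n/2: this is the median class.
            return lcb + ((half - cumulative) / freq) * class_size
        cumulative += freq


# Hypothetical classes 10-19, 20-29, 30-39 with boundaries 9.5, 19.5, 29.5
print(round(grouped_median([9.5, 19.5, 29.5], [3, 5, 2], 10), 2))  # 23.5
```

Here n/2 = 5, the <cf values are 3, 8, and 10, so the second class is the median class.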

c. Mode

Mode = lcb_mo + [ ∆1 / (∆1 + ∆2) ] × c

where,
𝑙𝑐𝑏𝑚𝑜 = the lower class boundary of the modal class
∆1 = the difference between the frequency of the modal class and the
frequency of the next lower class
∆2 = the difference between the frequency of the modal class and the
frequency of the next upper class
c = the class size

The modal class is the class interval with the highest frequency.

Procedure:
1. Identify the class interval with the highest frequency. (This interval is called
“modal class”)
2. Calculate the Lower Class Boundary of Modal Class (𝑙𝑐𝑏𝑚𝑜). This is the
lower boundary of the modal class interval.
3. Calculate ∆1 and ∆2:
○ ∆1 - frequency of the modal class minus the frequency of the class
preceding it.
○ ∆2 - frequency of the modal class minus the frequency of the class
succeeding it.
4. Find the class size of the modal class interval.
5. Calculate the value using the formula.
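
The procedure can be sketched in Python as below, using ∆1 and ∆2 as defined under the formula (differences between the modal frequency and its neighbours). The function name and data are hypothetical, and treating a missing neighbouring class as frequency 0 is an assumption.

```python
def grouped_mode(lower_boundaries, frequencies, class_size):
    """Approximate the mode as lcb_mo + (d1 / (d1 + d2)) * c."""
    modal = frequencies.index(max(frequencies))  # first class with the highest frequency
    f_modal = frequencies[modal]
    # Assumption: a missing neighbour (first or last class) counts as frequency 0.
    f_before = frequencies[modal - 1] if modal > 0 else 0
    f_after = frequencies[modal + 1] if modal + 1 < len(frequencies) else 0
    delta1 = f_modal - f_before
    delta2 = f_modal - f_after
    return lower_boundaries[modal] + (delta1 / (delta1 + delta2)) * class_size


# Hypothetical classes with lower boundaries 9.5, 19.5, 29.5 and class size 10
print(round(grouped_mode([9.5, 19.5, 29.5], [4, 10, 6], 10), 2))  # 25.5
```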

Applications:
Grouped data makes big datasets more manageable by dividing them
into smaller groups or intervals. This makes it easier to conduct statistical analysis
and make decisions in a variety of fields. It facilitates the discovery of patterns
and trends in the data, assisting in forecasting and guiding strategic choices.
Moreover, summarized group data supports quality control and process
optimization initiatives by monitoring and enhancing processes. Additionally, it
facilitates exploratory data analysis in research and streamlines data
visualization, which facilitates sharing findings with stakeholders.

Offline Tools:
When calculating measures of central tendency for grouped data,
statistical programs such as SPSS and R, Python with the NumPy and Pandas
libraries, and Microsoft Excel are common offline tools. These tools are flexible
choices for data analysis because they provide a variety of statistical analysis
functions.

Online Tools:
The online calculators on GraphPad, Desmos, and Wolfram Alpha can all
be used to determine the central tendency of grouped data. These tools satisfy
different needs and preferences for statistical analysis by offering easy-to-use
solutions for analyzing grouped data.

B. Weighted Mean
● The weighted average is also known as the weighted mean.
● It serves as a concise representation of all the quantities within the dataset.

● It assigns varying levels of significance to the values within a dataset.

Formula:
x̄ = Σ(w · x) / Σw

where,
x̄ = the weighted mean value of the set
of given data
w = corresponding weight for each observation
x = represents each individual value in the dataset

Procedure:
1. Compute the weighted value of each data point by multiplying its value with its
corresponding weight.
2. Total the weighted values by adding up the weighted value of each data point.
3. Divide the total by the sum of the weights assigned to each data point.
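
As a sketch, the three steps translate to Python as follows. The grades and percentage weights are hypothetical.

```python
def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted_total = sum(w * x for x, w in zip(values, weights))  # steps 1-2
    return weighted_total / total_weight                          # step 3


grades = [90, 80, 70]   # hypothetical exam, quiz, and project grades
weights = [50, 30, 20]  # hypothetical weights in percent
print(weighted_mean(grades, weights))  # 83.0
```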

Application:
● Weighted means are commonly applied in everyday situations, like when a
student calculates their course grade by multiplying each assessment's weight by
their corresponding grade.
● Analyzing the average prices of goods bought from various suppliers based on
the quantity purchased provides businessmen with valuable insights into their
expenses.
● A consumer's decision to buy a product relies on various
aspects such as product quality, familiarity, price, and franchise service. By
assigning significance to each aspect and calculating a weighted mean, the
consumer can better decide whether to make the purchase.
● When hiring for a position, the interviewer evaluates qualities like personality,
work capabilities, educational background, and teamwork abilities. Each aspect
is assigned varying levels of importance, and the final decision is made based on
this weighted evaluation of the candidate's profile.

Offline Tools for Weighted Mean:


1. Microsoft Excel: functions like SUMPRODUCT and SUM make weighted mean
calculations effortless once the data are entered into a spreadsheet.
2. Scientific and statistical calculators: these offer basic arithmetic operations and may
include functions for summing products, essential for computing weighted
means.

Online Tools for Weighted Mean:


1. Google Sheets: similar to Excel, input data into a spreadsheet and use built-in
functions for weighted mean computation; it is accessible online and supports
collaborative work.
2. Online weighted mean calculators: input data and weights for
instant calculation of the weighted mean; search online to find a suitable tool.

C. Measures of Variability
1. Range
● Range is the difference between the highest and lowest values in a set of data.
● Range is useful when you need a quick and simple measure to understand
the spread of data.
● It is one of four common measures of variability, which describe how far
apart data points are from each other and from the center of a
distribution.

Formula:
Range = Highest Value - Lowest Value

● For example, if the data set is 12, 14, 17, 19, 19, 22, the highest value is 22
and the lowest value is 12, so the range is 22 - 12 = 10.
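
A short Python sketch of the formula, applied to the data set in the example. The function name is hypothetical.

```python
def value_range(values):
    """Return the highest value minus the lowest value."""
    return max(values) - min(values)


print(value_range([12, 14, 17, 19, 19, 22]))  # 10
```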

Application:
In statistics, range is a commonly used measure of variability that
represents the spread of observations. It can be used to:
● Analyze how the data is spread
● Get an idea about the variability of the data
● Compare small data sets having the same size
● Quality control analysis

Limitation:
● Range can be misleading because it can be significantly affected by
outliers or fluctuations in the data.
● For example, in the data set {8, 11, 5, 9, 7, 6, 3616}, the lowest value is 5,
and the highest is 3616, so the range is 3611 even though every other value
lies between 5 and 11.

Online and Offline Applications:


● Microsoft Excel and other offline document editing applications -
Microsoft Excel is one of the well-known apps for organizing numerical data.
Other applications, such as WPS, are available for Android and other
devices to perform simple statistical calculations and data organization.
● Google Sheets - Google Sheets is a freely available online application that
enables people to create and edit spreadsheets collaboratively without
needing to meet in person.

2. Variance
● A statistical measure that tells us how spread out a set of data is from its
average value.

Formula
σ² = Σ(x - μ)² / N

where:
● Σ = sum
● x = individual value
● μ = mean
● N = total number of values

Example: Suppose we have the data set {3, 5, 8, 1} and we want to find the
population variance. The mean is given as (3 + 5 + 8 + 1) / 4 = 4.25. Then by using
the definition of variance we get [(3 - 4.25)² + (5 - 4.25)² + (8 - 4.25)² + (1 - 4.25)²]
/ 4 = 26.75 / 4 = 6.6875. Thus, variance ≈ 6.69.
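
The same calculation can be checked with a short Python sketch of σ² = Σ(x - μ)² / N. The function name is hypothetical.

```python
def population_variance(values):
    """Return the mean of the squared deviations from the mean."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


print(population_variance([3, 5, 8, 1]))  # 6.6875
```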

Limitations:
● Variance is a useful tool for analyzing the spread of a dataset, but it has
limitations. These limitations include the sensitivity to outliers, the
assumption of normal distribution, and the sensitivity to the units of
measurement. It is essential to be aware of these limitations when using
variance to analyze data.

Applications:
● In finance, it helps measure investment risk.
● In quality control, it helps monitor manufacturing consistency.

Online and Offline applications:


● Microsoft Excel and Google Sheets can be used both online and offline.
These two spreadsheet programs have a built-in function (VAR.P for
population variance) to calculate variance.

3. Standard Deviation:
● Within statistics, standard deviation serves as a quantitative measure of
the dispersion of a data set relative to its central tendency, specifically the
mean.
● A lower standard deviation signifies a higher degree of proximity between
the data points and the mean, while a higher standard deviation
indicates a greater level of dispersion within the data, with values spread
out over a wider range.
● It plays a crucial role in outlier detection. It establishes a threshold for
identifying data points that deviate significantly from the expected range
established by the mean and standard deviation.
● It can be represented through σ (sigma), or simply 's'.
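
A minimal Python sketch, assuming the usual definitions: the population standard deviation σ divides the sum of squared deviations by N, and the sample standard deviation s divides by n - 1. The function names are hypothetical and the data set is reused from the variance example.

```python
import math


def population_sd(values):
    """sigma: square root of the population variance (divide by N)."""
    mu = sum(values) / len(values)
    return math.sqrt(sum((x - mu) ** 2 for x in values) / len(values))


def sample_sd(values):
    """s: square root of the sample variance (divide by n - 1)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (len(values) - 1))


data = [3, 5, 8, 1]
print(round(population_sd(data), 3))  # 2.586
print(round(sample_sd(data), 3))      # 2.986
```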

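Formula:
σ = √[ Σ(x - μ)² / N ]

where σ is the population standard deviation (the square root of the variance),
and Σ, x, μ, and N are as defined in the variance formula.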

Applications:
● Standard deviation and the mean are frequently used in finance and
economics to measure risk. It helps to determine how much a set of data
values varies from the mean value.
● In manufacturing, the standard deviation is used in conjunction with
sample size to monitor and control the quality of products.

● Standard deviation is used in statistical inference to determine the
significance of differences between two groups of data values.
● Standard deviation is used in research studies to describe the variability of
data and the degree of uncertainty in the results.
● In education, the standard deviation is used to evaluate the performance
of students. It helps to determine whether a student's test score is within
the acceptable range of variability or whether it needs improvement.

Offline Tools for Standard Deviation:


● Calculator: Most scientific calculators have a standard deviation function
(often denoted by SD or σ). Simply input your data points and press the
designated button for standard deviation.
● Spreadsheet Software: Spreadsheet programs like Microsoft Excel or
Google Sheets have built-in functions for calculating standard deviation.
Use the STDEV function for a sample or the STDEVP function for the entire
population.

Online Tools for Standard Deviation:


● Standard Deviation Calculators: Numerous websites offer free standard
deviation calculators. These online tools often allow you to input a large
dataset and may provide additional features like variance calculation
and step-by-step solutions.

D. Hypothesis Testing
Hypothesis testing is a systematic procedure for deciding whether the results of a research study
support a particular theory which applies to a population. It uses
sample data to evaluate a hypothesis about a population.

1. Parametric Test
● Parametric tests are statistical methods which depend on the parameters of populations or
probability distributions. Proper interpretation of parametric tests based on the
normal distribution also assumes that the scores being analyzed result from
measurement on at least an interval scale.

a. Independent Sample T-Test


● The independent samples t-test is used to compare two sample means
from unrelated groups. This means that there are different people
providing scores for each group. The purpose of this test is to determine if
the samples are different from each other.
● It compares the mean scores of two different groups of people or
conditions.

Assumptions for Use of Independent-Samples T-Test
1. The sample data have been randomly sampled from a population.
2. There is homogeneity of variance (i.e., the variability of the data in each
group is similar).
3. The distribution is approximately normal.

Formula
t = [ (X̄1 - X̄2) - (μ1 - μ2) ] / √( s1²/n1 + s2²/n2 )

Where;
X̄1- the mean of the first sample.
X̄2- the mean of the second sample.
μ1- the mean of the first population.
μ2- the mean of the second population.
s1- the standard deviation of the first sample.
s2- the standard deviation of the second sample.
n1- the size of the first sample.
n2- the size of the second sample.

Illustrative Example
● We would like to compare the effectiveness of two painkillers, drug A and
drug B. To do this, we randomly divide 60 test subjects into two groups. The
first group receives drug A, the second group receives drug B. With an
independent t-test we can now test whether there is a significant
difference in pain relief between the two drugs.
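
As a rough sketch, the t statistic for two independent samples can be computed with Python's standard library. This version assumes the null hypothesis μ1 = μ2 and uses the standard error √(s1²/n1 + s2²/n2); a pooled-variance version would differ slightly when the group sizes are unequal. The pain-relief scores are hypothetical, and the sketch stops at the statistic (no p-value).

```python
import math
import statistics


def independent_t(sample1, sample2):
    """t = (mean1 - mean2) / sqrt(s1^2/n1 + s2^2/n2), assuming mu1 - mu2 = 0."""
    n1, n2 = len(sample1), len(sample2)
    mean1, mean2 = statistics.mean(sample1), statistics.mean(sample2)
    var1, var2 = statistics.variance(sample1), statistics.variance(sample2)  # sample variances
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    return (mean1 - mean2) / standard_error


drug_a = [4, 5, 6, 5, 5]  # hypothetical pain-relief scores
drug_b = [7, 8, 6, 7, 7]
print(round(independent_t(drug_a, drug_b), 2))  # -4.47
```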

Practical Applications
● We use the t-test for independent samples when we want to compare the
means of two independent groups or samples. We want to know if there is
a significant difference between these means.
● We are comparing scores from two teaching methods. We drew two
random samples of students. Students in one group learned using Method
A while the other group used Method B. These samples contain entirely
separate students.

Tools
Online Tools:
1. Google Sheets : Google Sheets can handle basic t-tests with add-on
functionalities
Offline Tools

1. Microsoft Excel (with Data Analysis ToolPak): Excel's Data Analysis ToolPak
(needs activation) offers t-test tools for two independent samples, and the
built-in T.TEST worksheet function returns the p-value. Both require manual
data input and provide basic output with p-values.
b. Paired T-Test
● A paired t-test (also known as a dependent or correlated t-test) is a
statistical test that compares the averages/means and standard
deviations of two related groups to determine if there is a significant
difference between the two groups.

Assumptions of a Paired T-Test


1. The dependent variable is normally distributed.
2. The observations are sampled independently.
3. The dependent variable is measured on a continuous scale, such as
ratio or interval.
4. The independent variable must consist of two related groups or matched
pairs.

Formula
t = Σd / √[ (n·Σd² - (Σd)²) / (n - 1) ]

Where;
d- is the difference per paired value
n- is the number of pairs

Illustrative Examples
● A fitness instructor wants to see if a new exercise program helps
participants lose weight. They measure the weight of 10 individuals before
and after the program (in weeks). We can perform a paired t-test to
determine if the average weight loss (difference) is statistically significant.
This test assumes the differences between before and after weights are
normally distributed.
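
A minimal sketch of the paired t statistic in Python, written as the mean difference divided by its standard error, t = d̄ / (s_d / √n). The weights below are hypothetical and use 5 participants instead of 10 for brevity.

```python
import math
import statistics


def paired_t(before, after):
    """t = mean(d) / (stdev(d) / sqrt(n)), where d = before - after for each pair."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))


before = [80, 75, 90, 68, 85]  # hypothetical weights before the program
after = [78, 74, 87, 68, 82]   # hypothetical weights after the program
print(round(paired_t(before, after), 3))  # 3.087
```

The resulting t value would then be compared against a t distribution with n - 1 degrees of freedom.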

Practical Application
● In medical research we compare the effectiveness of two different
medications for a particular condition by measuring the response before
and after each medication on the same group of patients.

Tools
Online Tools
1. GraphPad QuickCalcs - T Test Calculator
(https://www.graphpad.com/quickcalcs/ttest2/): This online calculator
from GraphPad provides a straightforward way to perform paired t-tests.
Enter your data in various formats (raw data, means and standard
deviations) and choose the paired t-test option. It offers basic output with
p-values and some additional statistics.
Offline Tools
1. SPSS (Statistical Package for the Social Sciences): This commercial
software offers a dedicated paired samples t-test function.

c. One-way ANOVA
● One-way ANOVA is a statistical method for testing for differences in the means of three or
more groups. It is typically used when you have a single
independent variable, or factor, and your goal is to investigate if
variations, or different levels, of that factor have a measurable effect on a
dependent variable.

Assumption for the use of One-way ANOVA


1. The responses for each factor level have a normal population distribution.
2. These distributions have the same variance.
3. The data are independent.

Formula:

Illustrative Example
● Imagine you work for a company that manufactures an adhesive gel that
is sold in small jars. The viscosity of the gel is important: too thick and it
becomes difficult to apply; too thin and its adhesiveness suffers. You've
received some feedback from a few unhappy customers lately
complaining that the viscosity of your adhesive is not as consistent as it
used to be. You've been asked by your boss to investigate.

Practical Application
● A one-way ANOVA is used to determine how one factor impacts a
response variable. For example, we might want to know if three different
groups of students have different mean GPAs. To see if there is a
statistically significant difference in mean GPAs, we can conduct a
one-way ANOVA.
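
As an illustration of the GPA example, the sketch below computes the one-way ANOVA F statistic as the ratio of the between-group mean square to the within-group mean square. The GPAs and the function name are hypothetical, and the sketch stops at the F value (no p-value).

```python
def one_way_anova_f(groups):
    """F = (SS_between / (k - 1)) / (SS_within / (N - k)) for k groups and N values."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


gpas = [
    [2.0, 2.5, 3.0],  # hypothetical group 1
    [3.0, 3.5, 4.0],  # hypothetical group 2
    [2.5, 3.0, 3.5],  # hypothetical group 3
]
print(one_way_anova_f(gpas))  # 3.0
```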

Tools
Online Tools:
1. One-Way ANOVA Calculator: compares the means of three or more
independent samples (treatments) simultaneously.
Offline Tools
1. One-Way Analysis of Variance with Excel: partitions the
observed variance in a particular variable into components attributable
to different sources of variation.

d. Two-way ANOVA
● Two-way ANOVA is used to estimate how the mean of a quantitative variable changes
according to the levels of two categorical variables. Use a two-way
ANOVA when you want to know how two independent variables, in
combination, affect a dependent variable.

Assumption for the use of Two-way ANOVA


1. Normality – The response variable is approximately normally distributed for
each group.
2. Equal Variances – The variances for each group should be roughly equal.
3. Independence – The observations in each group are independent of
each other and the observations within groups were obtained by a
random sample.

Formula

Illustrative Example
● A botanist wants to know whether or not plant growth is influenced by
sunlight exposure and watering frequency. She plants 40 seeds and lets
them grow for two months under different conditions for sunlight exposure
and watering frequency.

Practical Application
● A two-way ANOVA is used to determine how two factors affect a
response variable and whether or not the two factors interact in their
effect on the response variable. For example, we might want to know how different
types of food and how different levels of metabolism impact average
weight loss.

Tools
Online Tools:
1. Two Way ANOVA Calculator: determines whether the difference
between the averages reflects a real difference between the groups or is
due to the random noise inside each group.
Offline Tools
1. Two Way ANOVA with Google Sheets using the XLMiner add-on: used to determine
whether or not two factors, and their interaction, have a statistically
significant effect on the mean of a response variable.

2. Non-Parametric Tests
● Non-parametric tests are tests that do not require assumptions about the underlying
population distribution. They do not rely on the data belonging to any particular
parametric family of probability distributions.

a. Wilcoxon Signed-Rank Test


● The Wilcoxon test, which can refer to either the rank-sum test or the signed-rank test, is a
nonparametric statistical test. The signed-rank version compares two paired groups: it
calculates the difference within each pair and analyzes
these differences to establish if the two groups are statistically significantly different
from one another.

Assumptions for the use of Wilcoxon Signed-Rank Test


1. The population distribution of the difference scores is symmetric
2. Sample of difference scores is a simple random sample from the
population of difference scores. That is, difference scores are
independent of one another

Formula

Illustrative Example
● For example, you could use a Wilcoxon signed-rank test to understand
whether there was a difference in smokers' daily cigarette consumption
before and after a 6 week hypnotherapy programme (i.e., your
dependent variable would be "daily cigarette consumption", and your
two related groups would be the cigarette consumption values "before"
and "after" the hypnotherapy programme).

Practical Application
● A Wilcoxon test is used when we have two interval or ratio level variables
measured for a set of observations and we want to test if the distribution is
different for the two variables but we are unable to assume normality for
one or both of the variables.
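
A rough Python sketch of the signed-rank calculation, assuming the common convention that zero differences are dropped and tied absolute differences share their average rank. The cigarette counts and function names are hypothetical; the smaller of the two rank sums is usually taken as the test statistic W.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of 1-based positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def signed_rank_sums(before, after):
    """Return (W+, W-): rank sums of the positive and negative differences."""
    diffs = [b - a for b, a in zip(before, after) if b != a]  # drop zero differences
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus


before = [20, 15, 30, 25, 18, 22]  # hypothetical daily cigarettes before
after = [15, 16, 22, 20, 18, 15]   # hypothetical daily cigarettes after
print(signed_rank_sums(before, after))  # (14.0, 1.0)
```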

Tools
Online Tools
1. Wilcoxon Signed-Ranks Table: When the requirements for the t-test for two
paired samples are not satisfied, the Wilcoxon Signed-Rank Test for Paired
Samples, a non-parametric test, can often be used.
Offline Tools
1. Wilcoxon Signed-Rank Test in Microsoft Excel with QI Macros: the test is run from
the QI Macros menu in Excel. You will find the statistical tools
and templates on the far left side of the QI Macros menu.

b. Mann-Whitney U Test
● The Mann-Whitney U test is the non-parametric alternative to the independent samples t-test. It
is used to test whether two independent samples
come from the same population, that is, whether values in one group tend to be
larger than values in the other.

Assumptions of the Mann-Whitney U Test


The Mann-Whitney U test is a non-parametric test, so it does not make any
assumptions related to the distribution of scores. There are, however,
some assumptions that are made:
1. The sample drawn from the population is random.
2. Independence within the samples and mutual independence is assumed.
That means that an observation is in one group or the other (it cannot be
in both).
3. Ordinal measurement scale is assumed.

Formula
U = N1·N2 + N1(N1 + 1)/2 - ΣRi

Where:
U = Mann-Whitney U test statistic
N1 = sample size one
N2 = sample size two
Ri = rank of the i-th observation in sample one, so ΣRi is the sum of the ranks in sample one

Illustrative Example
● Imagine you are a researcher studying the effects of a new teaching
method on student performance. You have two groups: Group A, which
was taught using the traditional teaching method, and Group B, which
was taught using the new method. You want to determine if there is a
significant difference in the test scores between these two groups.
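
As a sketch of this example, the code below ranks all scores together and computes U1 = N1·N2 + N1(N1 + 1)/2 - R1, where R1 is the sum of the ranks in the first group, and U2 = N1·N2 - U1. The test scores are hypothetical, and the simple ranking assumes there are no tied scores.

```python
def mann_whitney_u(sample1, sample2):
    """Return (U1, U2); assumes all values are distinct (no ties)."""
    combined = sorted(sample1 + sample2)
    rank_of = {value: position for position, value in enumerate(combined, start=1)}
    r1 = sum(rank_of[v] for v in sample1)  # sum of ranks in sample one
    n1, n2 = len(sample1), len(sample2)
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 - u1
    return u1, u2


group_a = [70, 75, 80, 65]  # hypothetical scores, traditional method
group_b = [85, 90, 78, 95]  # hypothetical scores, new method
print(mann_whitney_u(group_a, group_b))  # (15.0, 1.0)
```

The smaller of the two values (here 1.0) is the one usually compared against a critical value table.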

Practical Application
● The Mann-Whitney U test is used in many fields, and is frequently used in
psychology, healthcare, nursing, business, and many other disciplines. For
example, in psychology, it is used to compare attitude or behavior, etc. In
medicine, it is used to know the effect of two medicines and whether they
are equal or not. It is also used to know whether or not a particular
medicine cures the ailment or not. In business, it can be used to know the
preferences of different people and it can be used to see if that changes
depending on location.

Tools
Online Tools
1. Online Statistics Calculator
(https://datatab.net/statistics-calculator/hypothesis-test/mann-whitney-u-test):
you can easily calculate the Mann-Whitney U test online. There is
nothing to install or download, you can start right from this page!
Offline Tool
1. Excel using XLSTAT: Once XLSTAT-Pro is activated, select the XLSTAT /
Nonparametric tests / Comparison of two samples (Wilcoxon,
Mann-Whitney, ...) command.

c. Spearman Correlation
● The Spearman correlation test examines whether two variables are
correlated with one another or not. The Spearman’s test can be used to
analyze ordinal level, as well as continuous level data, because it uses
ranks instead of assumptions of normality.

Formula
ρ = 1 - (6 Σd²) / (n(n² - 1))

Where:
ρ = (rho) the Spearman correlation coefficient.
d = the difference between the ranks of corresponding
variables.
n = the number of observations.

Illustrative Example
● Imagine you're looking at the study habits and test scores of 5 students. A
perfect positive correlation would be if the student who studied the most
(highest number of hours) also got the highest score, and the student who
studied the least got the lowest score.
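
A minimal sketch of the five-student example in Python, using the common rank-difference formula ρ = 1 - 6Σd² / (n(n² - 1)), which assumes there are no tied values. The study hours and scores are hypothetical.

```python
def spearman_rho(x, y):
    """Spearman's rho from rank differences; assumes no ties within x or within y."""
    def ranks(values):
        position = {v: i for i, v in enumerate(sorted(values), start=1)}
        return [position[v] for v in values]

    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    n = len(x)
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


hours = [2, 4, 6, 8, 10]        # hypothetical study hours
scores = [60, 70, 65, 85, 90]   # hypothetical test scores
print(round(spearman_rho(hours, scores), 2))  # 0.9
```

A value of 1 would mean the ranking by hours matches the ranking by score exactly.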

Practical Application
● The Spearman's test can be used to analyse ordinal level, as well as
continuous level data, because it uses ranks instead of assumptions of
normality. This makes the Spearman correlation great for 3, 5, and 7-point
Likert scale questions or ordinal survey questions.

Tools
Online Tools
1. Spearman's Rho Calculator
(https://www.socscistatistics.com/tests/spearman/default2.aspx): online
Spearman's Rho (correlation coefficient) calculator (offers scatter
diagram, full details of the calculations performed, etc).
Offline Tools
1. Microsoft Excel or Google Sheets: built-in functions can be combined to calculate
Spearman's rank correlation coefficient. In Excel, you can use the CORREL
function along with the RANK function to obtain the Spearman
correlation.

d. Kruskal-Wallis H Test


● The Kruskal-Wallis H test (sometimes also called the "one-way ANOVA on
ranks") is a rank-based nonparametric test that can be used to determine
if there are statistically significant differences between two or more groups
of an independent variable on a continuous or ordinal dependent
variable. It is considered the nonparametric alternative to the one-way
ANOVA, and an extension of the Mann-Whitney U test to allow the
comparison of more than two independent groups.

Formula
H = [ 12 / (N(N + 1)) ] Σ(Ri² / Ni) - 3(N + 1)

where:
H = the Kruskal-Wallis H test statistic.
N = the total number of observations across all groups.
Ri = the sum of the ranks for group i
Ni = the number of observations in group i

Illustrative Example
● You could use a Kruskal-Wallis H test to understand whether exam
performance, measured on a continuous scale from 0-100, differed based
on test anxiety levels (i.e., your dependent variable would be "exam
performance" and your independent variable would be "test anxiety
level", which has three independent groups: students with "low", "medium"
and "high" test anxiety levels). Alternately, you could use the Kruskal-Wallis
H test to understand whether attitudes towards pay discrimination, where
attitudes are measured on an ordinal scale, differed based on job position
(i.e., your dependent variable would be "attitudes towards pay
discrimination", measured on a 5-point scale from "strongly agree" to
"strongly disagree", and your independent variable would be "job
description", which has three independent groups: "shop floor", "middle
management" and "boardroom").

Practical Application
● Medical Research: In clinical trials or medical studies, researchers may use
the Kruskal-Wallis test to compare the effectiveness of different treatments
or interventions among multiple groups of patients, especially when the
outcome variable is ordinal or not normally distributed.
● Social Sciences: Social scientists often use the Kruskal-Wallis test to analyze
survey data or experimental results involving Likert scale responses or other
ordinal variables across different demographic groups or experimental
conditions.
● Environmental Studies: Environmental scientists might use the Kruskal-Wallis
test to compare pollutant levels or ecological measurements across
different sites or regions.
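
A rough Python sketch of the H statistic, using H = [12 / (N(N + 1))] Σ(Ri² / Ni) - 3(N + 1) with the symbols defined above. The exam scores are hypothetical, and the simple ranking assumes no tied values (ties would need average ranks and a correction factor).

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H from rank sums; assumes all values are distinct."""
    combined = sorted(v for g in groups for v in g)
    rank_of = {value: position for position, value in enumerate(combined, start=1)}
    n_total = len(combined)
    # Sum over groups of (rank sum squared) / (group size)
    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    return 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)


exam_scores = [
    [80, 85, 90],  # hypothetical "low" test anxiety group
    [70, 75, 78],  # hypothetical "medium" test anxiety group
    [55, 60, 65],  # hypothetical "high" test anxiety group
]
print(round(kruskal_wallis_h(exam_scores), 2))  # 7.2
```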

Tools
Online Tool
1. Kruskal-Wallis Test Calculator
(https://www.socscistatistics.com/tests/kruskal/default.aspx): this
calculator allows you to analyze the differences between three or
more samples (treatments).
Offline Tool
1. Microsoft Excel or Google sheet: These programs have built-in statistical
functions that can help you calculate the H statistic and p-value.

Additional Questions:
1. What is a hypothesis?
● A hypothesis is an idea or explanation that is being proposed prior to the study or
experiment that will be conducted. It is conceptualized based on the limited
evidence and will serve as the starting point for further research and study. For
example, “the clouds are dark, for sure it will rain” is a hypothesis because it
suggests that the appearance of clouds has an effect on the occurrence of
rainfall.

2. What is the difference between a null hypothesis and an alternative hypothesis?


● In hypothesis testing, two contrasting statements are used: the alternative
hypothesis and the null hypothesis. The null hypothesis, which stands for the status
quo or default assumption, indicates that there may not be any meaningful
relationship, effect, or difference in the population parameters under
investigation. It frequently asserts that sampling variability or random chance are
to blame for any relationship or difference in the sample data that is noticed. On
the other hand, the alternative hypothesis suggests a particular claim or
assertion, implying that the population parameters differ, have a significant
effect, or have a relationship. It acts as a counterargument to the null hypothesis
and outlines the researcher's desired outcomes for the hypothesis test. Essentially,
the alternative hypothesis expresses the researcher's hypothesis or research
question, while the null hypothesis stands for skepticism. The research question
and the intended result of the hypothesis test determine which of the null and
alternative hypotheses to use.

3. How to formulate a null hypothesis?


● To formulate a null hypothesis, start by asking a question. Then rephrase this
question under the assumption that no relationship exists between the variables
being studied. In essence, assume that any treatment being investigated has no
effect. Lastly, write your hypothesis to reflect this assumption.

4. What statistical tools are useful to determine if the given data have significant
differences?
● A paired t-test is one of the ideal tests for determining whether two related sets of
data differ significantly. It uses the mean and standard deviation of the
differences between the two sets to decide whether the difference is
statistically significant. This helps the researcher judge whether an observed
difference is meaningful, and it can be used to compare not only present
data but also the results of past and future research.

5. What statistical tools are useful to determine if the given data have significant
relationships?
● Many statistical tools can be used to determine the significance of two or more
sets of data. Researchers can use the independent-samples t-test, paired t-test,
Wilcoxon signed-rank test, Mann-Whitney U test, Spearman correlation, and
Kruskal-Wallis H test. These tests serve different purposes: some are parametric
tests and the others are nonparametric tests. Of the tests listed, the Spearman
correlation measures the relationship between two variables directly, while the
others compare groups. Determining whether two or more sets of data are
significantly related is important in research because it helps researchers gauge
the progress of their research and identify significant differences among results.
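
As an illustration of a relationship measure, the Spearman rank correlation can be sketched as follows. This is a simplified version that assumes no tied values, and the study-hours and score data are hypothetical, not part of this report.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward. This sketch does not handle ties."""
    ordered = sorted(values)
    if len(set(ordered)) != len(ordered):
        raise ValueError("tied values are not supported in this sketch")
    return [ordered.index(v) + 1 for v in values]

def spearman_rho(x, y):
    """Spearman rank correlation using rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    if len(x) != len(y):
        raise ValueError("both variables need the same number of observations")
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical data: hours of study and quiz scores of five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 58, 71, 80]

rho = spearman_rho(hours, scores)
print(f"rho = {rho:.2f}")
```

A value of rho close to +1 or -1 points to a strong monotonic relationship, while a value close to 0 points to a weak one. Whether the relationship is significant still has to be checked against a critical value or p-value.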

6. What is the difference between parametric tests and nonparametric tests? When to
use them?
● Parametric and nonparametric tests are the two different ways that hypotheses
can be tested in statistics. Most of the differences can be explained by the
assumptions made about the underlying population distribution and the types of
data on which the tests are conducted. Parametric tests make assumptions about
the population distribution and typically presume that the data have a normal
distribution. These tests yield conclusions about population parameters,
especially the mean. Typical parametric tests include t-tests, ANOVA, and linear
regression. They are based on assumptions such as normality of the population
distribution and homogeneity of variance, so they are usually used when the data
satisfy certain requirements, such as being continuous and normally distributed.
Parameters such as the population mean and variance are involved. Nonparametric
tests, on the other hand, are hypothesis tests that do not depend on particular
assumptions about the population distribution. They are frequently used when the
assumptions of parametric tests are not met, for example when working with
ordinal or non-normally distributed data. Examples include the Wilcoxon
signed-rank test, the Mann-Whitney U test, and the Kruskal-Wallis H test. In
summary, parametric tests make certain assumptions about the population
distribution and are suitable for continuous data with specific properties, while
nonparametric tests can be used for a wider range of data types and circumstances
because they are more flexible and do not depend on a particular distribution.
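
The choice between the two families can be summarized as a simple lookup. The sketch below pairs each parametric test named above with its usual nonparametric counterpart. The function name `choose_test` and the flag `assumptions_met` are illustrative assumptions. In practice, the researcher judges whether the assumptions hold by checking normality and homogeneity of variance.

```python
# Usual nonparametric counterparts of the parametric tests named in the text.
NONPARAMETRIC_COUNTERPART = {
    "independent-samples t-test": "Mann-Whitney U test",
    "paired t-test": "Wilcoxon signed-rank test",
    "one-way ANOVA": "Kruskal-Wallis H test",
}

def choose_test(parametric_test, assumptions_met):
    """Return the parametric test if its assumptions hold, else its counterpart."""
    if assumptions_met:
        return parametric_test
    return NONPARAMETRIC_COUNTERPART[parametric_test]

# Example: ordinal or clearly non-normal data from two related samples.
print(choose_test("paired t-test", assumptions_met=False))
```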

7. When should a researcher decide to reject the null hypothesis? Specify.
● Researchers frequently use the p-value to decide whether to reject the null
hypothesis. The p-value is the probability of observing data at least as extreme
as the sample data, assuming that the null hypothesis is true. A p-value less
than the selected significance level, which is usually set at 0.05, indicates
strong evidence against the null hypothesis. The first step in using a p-value is
to state the null and alternative hypotheses, which are mutually exclusive
statements that guide the hypothesis test. Next, set the significance level (α),
which establishes the threshold for rejecting the null hypothesis. Then compute
the test statistic using the formula for the statistical test being run, and
calculate the corresponding p-value. Lastly, compare the computed p-value to α.
If it is less than or equal to α, the null hypothesis should be rejected. If not,
it should not be rejected. Researchers may also use the test statistic itself,
comparing it with the critical value for the selected significance level. The
first step in this approach is again to state the null and alternative
hypotheses. The second is to compute the test statistic from the sample data,
which measures the evidence against the null hypothesis. The third is to
determine the critical value, which marks the point beyond which the null
hypothesis is rejected. Finally, compare the test statistic to the critical value
while taking the type of hypothesis test (upper tail, lower tail, or two-tailed)
into account. In an upper-tail test, for example, the null hypothesis is rejected
if the test statistic is greater than the critical value. Both techniques help
researchers evaluate the strength of the evidence for the alternative hypothesis,
such as a significant relationship between study variables.
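
Both decision rules can be written as short functions. The sketch below is illustrative only: the function names are hypothetical, and the critical-value rule shown covers the upper-tail case. The example values are the P-value, F, and F crit reported in the ANOVA table of Task A in this report.

```python
def decide_by_p_value(p_value, alpha=0.05):
    """Reject H0 when the p-value is less than or equal to the significance level."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

def decide_by_critical_value(test_statistic, critical_value):
    """Upper-tail rule: reject H0 when the test statistic exceeds the critical value."""
    return "reject H0" if test_statistic > critical_value else "fail to reject H0"

# Values reported in the ANOVA table of Task A.
p_value = 0.875964
f_statistic = 0.301126
f_critical = 2.539689

print(decide_by_p_value(p_value))                         # fail to reject H0
print(decide_by_critical_value(f_statistic, f_critical))  # fail to reject H0
```

The two rules always lead to the same decision when the critical value and the p-value are based on the same significance level.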

8. When should a researcher decide that he fails to reject the null hypothesis?
● A researcher fails to reject the null hypothesis when the data do not provide
strong enough evidence to support the alternative hypothesis. This is typically
determined by comparing the p-value (the probability of observing data at least
as extreme as the sample if the null hypothesis is true) to a predetermined
significance level (commonly set at 0.05). If the p-value is greater than the
significance level, the data do not suggest a statistically significant effect,
and the null hypothesis cannot be rejected. This means the researcher concludes
there is insufficient evidence to support the alternative hypothesis.

9. Should a researcher accept the null hypothesis? Why or why not?


● No, a researcher should not accept the null hypothesis because failing to reject
it does not mean it is true. The result might be due to insufficient evidence or a
lack of sensitivity in the test, so it doesn't confirm the null hypothesis. Instead, the
researcher should say there is not enough evidence to reject it and consider the
possibility of type II error (failing to detect a true effect).

Additional Tasks:
Task A
Table 1: Temperatures recorded at weather stations in the three major island groups of the Philippines

Weather Station                             Temperature Data (°C)
                                            April 27   April 28   April 29   April 30   May 1

Science Garden, Quezon City, Metro Manila   44         44         42         42         42
Baguio City, Benguet                        28         28         26         28         28
Iba, Zambales                               44         53         44         43         43
Puerto Princesa City, Palawan               43         44         43         46         45
Mambusao, Capiz                             42         44         42         42         42
Panglao International Airport, Bohol        41         41         39         39         41
Catarman, Northern Samar                    44         43         43         44         43
VSU-Baybay, Leyte                           40         38         37         40         40
Dipolog, Zamboanga Del Norte                44         42         41         42         41
Malaybalay, Bukidnon                        37         38         34         35         35
Davao City, Davao Del Sur                   43         42         39         36         38
Surigao City, Surigao Del Norte             41         30         37         38         36

1. Hypotheses
● Null Hypothesis: There is no significant difference in the average
temperatures among the weather stations across the Philippine islands
over the past 5 days.
● Alternative Hypothesis: There is a significant difference in the average
temperatures among the weather stations across the Philippine islands
over the past 5 days.

2. Statistical Tool
● One-way ANOVA (Analysis of Variance) is a suitable statistical tool for
comparing the means of three or more groups, such as the temperature
data from different weather stations. It assesses whether there are any
significant differences between the means of these groups. In this case,
the data set contains temperature readings from 12 weather stations across
the main Philippine island groups, making ANOVA appropriate for analyzing
the variation in temperatures among these groups. Additionally, ANOVA
provides information on the overall variability, which makes it useful
for determining whether there is a significant difference in temperature
data among the weather stations.

Table. Summary

Date        Count   Sum   Average    Variance

April 27    12      491   40.91667   20.99242
April 28    12      487   40.58333   43.90152
April 29    12      467   38.91667   25.53788
April 30    12      475   39.58333   23.71970
May 1       12      474   39.50000   21.72727

Table. ANOVA

Source of Variance   SS         df   MS         F          P-value    F crit

Between Groups       32.73333   4    8.183333   0.301126   0.875964   2.539689
Within Groups        1494.667   55   27.17576
Total                1527.4     59
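
The ANOVA table can be checked with a short Python sketch that recomputes the sums of squares and the F statistic from the readings in Table 1. As in the Summary table, each date is treated as one group of 12 station readings. The function name `one_way_anova` is illustrative, and the critical value is taken from the ANOVA table rather than computed here.

```python
# Temperatures (°C) from Table 1: one list per date, stations in table order.
temps_by_date = {
    "April 27": [44, 28, 44, 43, 42, 41, 44, 40, 44, 37, 43, 41],
    "April 28": [44, 28, 53, 44, 44, 41, 43, 38, 42, 38, 42, 30],
    "April 29": [42, 26, 44, 43, 42, 39, 43, 37, 41, 34, 39, 37],
    "April 30": [42, 28, 43, 46, 42, 39, 44, 40, 42, 35, 36, 38],
    "May 1":    [42, 28, 43, 45, 42, 41, 43, 40, 41, 35, 38, 36],
}

def one_way_anova(groups):
    """Return the sums of squares, degrees of freedom, and F for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]
    # Between groups: spread of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within groups: spread of each reading around its own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, group_means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return {"ss_between": ss_between, "ss_within": ss_within,
            "df_between": df_between, "df_within": df_within, "F": f_stat}

result = one_way_anova(list(temps_by_date.values()))
F_CRIT = 2.539689  # critical value reported in the ANOVA table
decision = "reject H0" if result["F"] > F_CRIT else "fail to reject H0"
print(result, decision)
```

Grouping the same readings by station instead (12 groups of 5 readings each) would test a different question, namely whether the stations differ from one another.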

3. Interpretation
● Based on the results of the one-way ANOVA, we fail to reject the null
hypothesis (H0). The computed F value (0.301126) is less than the critical
F value (2.539689), and the p-value (0.875964) is greater than the 0.05
significance level. There is therefore insufficient evidence to conclude
that the average temperatures differ significantly. This does not mean
that the null hypothesis is accepted as true. The ANOVA table uses the
five dates as its groups (12 station readings per date), so the result
applies to the average temperature across the weather stations from one
day to another. Comparing the stations with one another would require
treating each station as a group.

PREPARED BY:
BSE SCIENCE - 3B
