Module 2 Statistical Analysis For Booklet

Uploaded by

Lance Cake

RESEARCH PRACTICUM

Quarter 3 – Module 2
STATISTICAL ANALYSIS
Statistics is basically a science that involves data collection, data interpretation and finally, data
validation. Statistical data analysis is a procedure of performing various statistical operations. It is a kind of
quantitative research, which seeks to quantify the data, and typically, applies some form of statistical
analysis. Quantitative data basically involves descriptive data, such as survey data and observational data.

In this module, you will be guided on how to analyze and interpret the data presented.

Most Essential Learning Competency (MELC)

Apply the guidelines on how to analyze and interpret data in an investigatory project.

After going through this module, you are expected to attain the following objectives:

➢ Define statistical data analysis

➢ Differentiate parametric and non-parametric tests in hypothesis testing
➢ Discuss the guidelines on how to analyze data and interpret the results using statistical
treatments.
➢ Apply data analysis and interpretation in your investigatory project.

Are you now ready to learn about statistical data analysis and interpretation? Good! Have fun learning!

What I Know

Multiple Choice. Choose the letter of the best answer. Write the chosen letter on a separate sheet of paper.
1. What is the purpose of inferential statistics?
A. It provides a measure of central tendency and a measure of variability for the data.
B. It determines whether a correlation coefficient is positive, negative, or zero.
C. It determines whether a statistical result is significant.
D. It determines whether to accept or reject the null hypothesis.
2. A clinical trial was designed to investigate the effectiveness of a new drug to reduce symptoms of asthma
in children. A total of n=10 participants are randomized to receive either the new drug or a placebo.
Participants are asked to record the number of episodes of shortness of breath over a 1 week period
following receipt of the assigned treatment. The data are shown below.
             Day 1   Day 2   Day 3   Day 4   Day 5
Placebo        7       5       6       4      12
New Drug       3       6       4       2       1

Is there a difference in the number of episodes of shortness of breath over a 1 week period in participants
receiving the new drug as compared to those receiving the placebo? How would you classify the data used
in this test?
A. ordinal data B. ratio data C. nominal data D. interval data
3. When should parametric or nonparametric tests be used?
A. Parametric tests are used in a normal distribution while nonparametric tests are used
in a non-normal distribution.
B. Parametric tests are used with nominal data while nonparametric tests are used with
interval data.
C. Parametric tests are used in a non-normal distribution while nonparametric tests are
used in a normal distribution.
D. Parametric tests are used when the data is ordinal while nonparametric tests are used
when the data is ratio.
4. In hypothesis testing, how do you compare the two types of errors?
A. Type I error occurs when one rejects a null hypothesis when it is true while Type II error occurs
when one fails to reject a null hypothesis that is false.
B. Type I error occurs when one fails to reject a null hypothesis when it is true while Type II error
occurs when one rejects a null hypothesis that is false.
C. The probability of committing a Type I error is called Beta while the probability of
committing a Type II error is called Alpha.
D. The probability of committing a Type I error is called the power of the test while the
probability of committing a Type II error is called the significance level.
5. Which of the following is NOT TRUE about the following parametric tests?
A. T-test is a one sample mean test with an unknown population standard deviation.
B. Z-test is a one sample mean test with a known population standard deviation.
C. T-test is used for comparing the means of two populations with large samples.
D. Z-test is used for testing the mean of a population against a standard.
6. A study was set up to look at whether there was a difference in the mean arterial blood pressure between
two groups of volunteers, after 6 weeks of following one of two treatment programs. One group of
volunteers was given an exercise regimen to follow for the 6 weeks and the other group were given the
same exercise regimen with the addition of an experimental tablet. Which type of t-test should be used in
this situation?
A. One sample t-test C. Paired samples t-test
B. Independent samples t-test D. None of the t-tests would be suitable
7. The ANOVA procedure is used to compare the means of the comparison groups. What does a researcher
want to know when using ANOVA?
A. If there is a significant difference between and among columns and rows.
B. If there is no significant difference between and among columns and rows.
C. If the columns and rows are equal.
D. When columns are greater than rows.
8. Which of the following do you think is the most useful attribute of ANOVA tests?
A. It effectively compares two or more variances.
B. It effectively compares three or more means.
C. It effectively compares two or more means.
D. It effectively compares three or more variances.
9. What will be the decision if the computed F-value is greater than the critical value?
A. Accept Ho C. significant
B. Reject Ho D. not significant
10. What will you propose to get the right decision in one-way ANOVA F-test?
A. When we accept the null hypothesis, we should know which treatments can be said to
be significantly different from the others.
B. When we reject the null hypothesis, we should know which treatments can be said to
be significantly different from the others.
C. When we accept the null hypothesis, we should know which treatments can be said to
equal the others.
D. When we accept the alternative hypothesis, we should know which treatments can be
said to equal the others.
11. Which of the following statements about the Chi-Square test is INCORRECT?
A. It is widely used as a non-parametric test for nominally scaled data
B. It is used only when the population is viewed as only two classes, e.g. Male - female.
C. It tests for significant differences between the observed and expected distributions
D. Can also be used with k independent samples.
12. When do we say that two variables are correlated?
A. If a change in one variable changes the other variable in the same direction.
B. If change in one variable changes the other variable in the reverse direction.
C. If change in one variable changes the other variable, either in the same or reverse
direction.
D. If change in one variable does not change the other variable.

13. Which of the following shows a strong direct relationship between variables x and y?
A. Correlation value greater than 0 C. Correlation value is equal to zero
B. Correlation value is closer to +1 D. Correlation value is closer to -1

14. When presenting the results of a statistical analysis of the data gathered, what should
NOT be done?
A. Data are put in a table showing the sources of variance, sum of squares, mean squares, F ratios and
probability values
B. Indicate which differences in treatment means were statistically significant but not
describe what the means actually were.
C. State in sentence form what comparison is being evaluated, the value obtained for the
statistic, and the level of significance achieved.
D. Place all the statistical computations made to come up with the results.
15. Why is the discussion section important in a research paper?
A. It justifies the purpose of the study
B. It explains why the results of the study don't really mean anything.
C. It lists all the things that went wrong as one attempted to carry out the study.
D. It interprets the results and indicates how they fit with previous research and theory.

Statistical Analysis

Data need to be analyzed and discussed to gain significance. It involves explanation of observed
trend or patterns and other forms of interrelationship between variables.

What’s In

In Module 1, you learned to present the data from the findings of your experiments using tables and
graphs. Guidelines should be followed in order to have a correct data presentation.
Before we proceed to data analysis using statistical treatments, let’s have a review first by
answering the activity below.
Activity 1. Know About Data Presentation
Direction: Write FACT if the statement is true and BLUFF if the statement is false.
_____1. A table is a way of presenting data in a report.
_____2. The table number is placed above the table, in bold text and flush with the left
margin
_____3. Histograms are used to represent percentages and proportions.
_____4. Any images used within text are called figures
_____5. The legend explains the symbols, abbreviations, and terminology used in the figure.
_____6. Specific notes explain units of measurement, symbols, and abbreviations, or provide
citation information
_____7. Tables have vertical lines at their lateral borders
_____8. All tables are numbered sequentially as you refer to them in the text
_____9. The body of the table includes all the reported information organized in cells
_____10. Graphs are good at quickly conveying relationships like comparison and
distribution

What’s New

After experimentation was done, the data are recorded and tabulated in tables for further analysis
using statistics.
Let’s familiarize ourselves with the different descriptive and inferential statistics by doing the following
activity.

Activity 2. Treat Them Statistically

Objectives: At the end of the activity, you should be able to analyze and interpret results in a given table
using statistical treatments.
Direction: Read and analyze the two tables with tabulated data of statistical treatments. Answer the questions
below.

Table 1 shows the differences among the three noise conditions (low, moderate and loud) on the ability of
subjects to concentrate.

Table 1.
ANOVA Results of Three Noise Conditions

Sources of        Degrees of   Sum of     MS       Computed   Tabular
variation         freedom      squares             Value      Value
Between groups    2            25.9       12.95    6.29       3.59
Within groups     17           34.95      2.06
Total             19           60.85

1. State the null and alternative hypothesis.

_______________________________________________________________________________________

2. Give your interpretation.

_______________________________________________________________________________________

Table 2 shows the difference in the accuracy and effectiveness between the YSI Dissolved Oxygen meter and
the ReTiMot using the T-Test for Independent Samples.
Table 2
Significant Difference in the Accuracy and Effectiveness Between YSI DO meter and
ReTiMot in terms of pH value
Device          Sample Size   Mean    Standard Deviation
YSI DO meter    6             8.05    0.03
ReTiMot         6             8.05    0.03

1. State the null and alternative hypothesis.

2. Solve for the value of t using the formula
$$t = \frac{\bar{X} - \bar{Y}}{\sqrt{\dfrac{(sd_x)^2}{n-1} + \dfrac{(sd_y)^2}{n-1}}}$$
3. Interpret the result. Use the t critical = 2.571 at df = 5 and α = 0.05

What is It

Statistical data analysis generally involves some form of statistical tools, which a layman cannot
use without some statistical knowledge. There are various software packages to perform statistical
data analysis. These include Statistical Analysis System (SAS), Statistical Package for the Social
Sciences (SPSS), StatSoft, etc.

Data in statistical data analysis consists of variable(s). Sometimes the data is univariate or
multivariate. Depending upon the number of variables, the researcher performs different statistical
techniques.

If the data involve several variables, then multivariate techniques can be
performed. These include factor analysis, discriminant analysis, etc. Similarly, if
the data involve a single variable, then univariate statistical data analysis is performed. This includes the t-test
for significance, z-test, F-test, one-way ANOVA, etc.
The data in statistical data analysis is basically of 2 types, namely, continuous data and discrete data.
Continuous data is data that is measured rather than counted. For example, the intensity of a light can be
measured but cannot be counted. Discrete data is data that can be counted. For example, the number of bulbs
can be counted.
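These distinctions in statistical data analysis help us to understand which data fall under which
distribution. Counted (discrete) data, such as the number of defective bulbs in a batch, are often modeled
with a discrete distribution such as the Poisson distribution, while measured (continuous) data, such as the
intensity of a bulb, follow a continuous distribution.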
A major task in statistical data analysis is statistical inference. Statistical inference is mainly
comprised of two parts: estimation and tests of hypothesis.
Estimation uses sample data to estimate the parameters of a population. Tests of hypothesis, on the
other hand, use sample data to decide whether a claim about the population is supported; they may be
parametric or non-parametric.

STATISTICAL ANALYSIS
1. Statistical Mean
• is the central point of the data that’s being processed.
• The result is referred to as the mean of the data provided. In real life, people typically use the mean
in research, academics, and sports.
• To find the mean of your data, you would first add the numbers together, and then divide the sum
by how many numbers are within the dataset or list.
As an example, to find the mean of 6, 18, and 24, you would first add them together.
6 + 18 + 24 = 48
Then, divide by how many numbers in the list (3). 48 / 3 = 16. The mean is 16.
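The two steps above can be sketched in a few lines of Python. This is only an illustration: the module itself does not prescribe a programming language, and the function name `mean` is chosen here for convenience.

```python
def mean(values):
    """Arithmetic mean: add the numbers, then divide by how many there are."""
    return sum(values) / len(values)


data = [6, 18, 24]   # the three numbers used in the example above
print(mean(data))    # 16.0
```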

2. Standard Deviation
• is a method of statistical analysis that measures the spread of data around the mean.
• When you’re dealing with a high standard deviation, this points to data that’s spread widely from
the mean. Similarly, a low deviation shows that most data is close to the mean, which can also be
called the expected value of a set.
• Standard deviation is mainly used when you need to determine the dispersion of data points
(whether or not they’re clustered).

How to find the standard deviation

The formula to calculate the standard deviation is: $\sigma = \sqrt{\sigma^2}$, where $\sigma^2 = \dfrac{\sum (x - \mu)^2}{n}$
In this formula:
• The symbol for standard deviation is σ
• Σ stands for the sum over the data
• x stands for each value in the dataset
• μ stands for the mean of the data
• σ² stands for the variance
• n stands for the number of data points in the population
To find the standard deviation:
• Find the mean of the numbers within the data set
• For each number within the data set, subtract the mean and square the result (which is this part of
the formula: (x − μ)²).
• Find the mean of those squared differences
• Take the square root of the final answer
If you used the same three numbers in our mean example, 6, 18, and 24, the standard deviation, or σ,
would be 7.4833147735479.
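The four steps can be checked with a short Python sketch. It assumes the population form of the formula (dividing by n, as in the text, rather than by n − 1); the function name `population_std` is illustrative.

```python
import math


def population_std(values):
    """Population standard deviation, following the four steps in the text."""
    n = len(values)
    mu = sum(values) / n                              # step 1: mean
    squared_diffs = [(x - mu) ** 2 for x in values]   # step 2: (x - mu)^2
    variance = sum(squared_diffs) / n                 # step 3: mean of the squares
    return math.sqrt(variance)                        # step 4: square root


sigma = population_std([6, 18, 24])
print(round(sigma, 4))  # 7.4833
```

Dividing by n − 1 instead of n would give the sample standard deviation, which is larger for the same three numbers.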

3. Regression
When it comes to statistics, regression is the relationship between a dependent variable (the data
you’re looking to measure) and an independent variable (the data used to predict the dependent variable).
• It can also be explained by how one variable affects another, or changes in a variable that trigger
changes in another, essentially cause and effect. It implies that the outcome is dependent on one or
more variables.
• The line used in regression analysis graphs and charts signifies whether the relationships between the
variables are strong or weak, in addition to showing trends over a specific amount of time.

Regression formula
The regression formula that’s used to see how data could look in the future is:
Y = a + b(X)
In this formula:
• a refers to the y-intercept, the value of Y when X = 0
• X is the independent variable
• Y is the dependent variable
• b refers to the slope, or rise over run
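The module does not say how a and b are obtained from data. A common choice, assumed here, is ordinary least squares. The sketch below fits the line to a small made-up data set (the x and y values are hypothetical) and then uses it to predict a new value.

```python
def fit_line(xs, ys):
    """Least-squares estimates of intercept a and slope b in Y = a + b(X)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation of x.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b = s_xy / s_xx
    a = mean_y - b * mean_x   # the fitted line passes through (mean_x, mean_y)
    return a, b


# Hypothetical data lying exactly on the line Y = 1 + 2X.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
a, b = fit_line(xs, ys)
print(a, b)        # 1.0 2.0
print(a + b * 6)   # predicted Y when X = 6 -> 13.0
```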

HYPOTHESIS TESTING
Statistical hypothesis testing
- is used to determine whether an experiment conducted provides enough evidence to reject a
proposition.
- It is also used to rule out chance as the explanation for an experimental result and to establish its
validity and relationship with the event under consideration.

Statistical Hypothesis
They are hypotheses stated in such a way that they may be evaluated by appropriate statistical
techniques.
There are two types of statistical hypotheses:
1. Null hypothesis (H0) – It is the hypothesis which assumes that there is no difference between two values.
H0 : μ1 = μ2.
2. Alternative hypothesis (HA) – It is the hypothesis that differs from the null hypothesis.
HA : μ1 ≠ μ2 or μ1 > μ2 or μ1 < μ2.

Hypothesis Errors
1. Type I Error
It is the error of finding a difference when no such difference actually exists (rejecting a true null
hypothesis), for example, the acceptance of an inactive compound. It is also known as alpha (α) error / false positive.

2. Type II Error

It is the error of failing to detect a difference when such a difference actually exists, thus
resulting in the rejection of an active compound as inactive. It is called beta (β) error / false negative.

Level of Significance
• The probability of committing a Type I error.
• Denoted by α.
• A level of significance of 0.05 (5%) means the risk of making a wrong decision is only 5 out of 100 cases,
i.e. 95% confidence.

Power of the test
• β is the probability of committing a Type II error.
• The power of the test is denoted by 1 − β.
• Power of the test is the probability of rejecting H0 when H0 is false, i.e. a correct decision.

p-Value
• It is defined as the smallest value of α for which the null hypothesis can be rejected.
• If the p-value is less than α, we reject the null hypothesis (p < α)
• If the p-value is greater than or equal to α, we do not reject the null hypothesis (p ≥ α)
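As an illustration of the decision rule, the sketch below computes a two-tailed p-value from a z statistic using Python's standard library and then applies the rule. The z value of 2.5 is hypothetical, and the function names are chosen here for convenience.

```python
from statistics import NormalDist


def two_tailed_p_value(z):
    """Probability of a result at least as extreme as z, in either tail."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def decide(p_value, alpha=0.05):
    """Apply the rule: reject H0 only when p < alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"


p = two_tailed_p_value(2.5)   # hypothetical computed z statistic
print(round(p, 4), decide(p))
```

A p-value exactly equal to α falls under "do not reject H0", matching the p ≥ α case above.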

Critical Region
1. One-Tailed Test
• The rejection region is in one tail of the distribution.
• The difference could be there in only one direction/possibility.
Example: English men are taller than Indian men.

2. Two-Tailed Test

• The rejection region is split between the two sides or tails of the
distribution.
• The difference could be in both directions/possibilities.
Example: Comparative study of drug ‘X’ with ethanol for antihypertensive property

Sample Size:
• Large sample – sample size is more than 30.
• Small sample – sample size is less than or equal to 30.
• Many statistical tests are based upon the assumption that the data are sampled from a Gaussian
distribution.
• Procedures for testing hypotheses about parameters in a population described by a specified
distribution form (normal distribution) are called PARAMETRIC TESTS.

TYPES OF INFERENTIAL STATISTICS

1. PARAMETRIC STATISTICS estimate the value of population parameters from the characteristics of a
sample.
• Assumes the values in a sample are normally distributed.
• Interval/ratio level data are required.
• Comparing the means of two or more groups relative to the variance within the groups
2. NON-PARAMETRIC STATISTICS
• no assumptions about the underlying distribution of the sample
• ordinal/nominal data
• comparing the medians or ranks of two or more groups

TYPES OF PARAMETRIC TESTS

A. Large sample tests
• Z-test
B. Small sample tests
• t-test
   - Independent / unpaired t-test
   - Paired t-test
• ANOVA (Analysis of Variance)
   - One way ANOVA
   - Two way ANOVA

1. Z-Test
• A z-test is used for testing the mean of a population versus a standard or comparing the means of
two populations, with large (n ≥ 30) samples, whether or not you know the population standard
deviation.
• It is also used for testing the proportion of some characteristic versus a standard proportion, or
comparing the proportions of two populations.
Examples: Comparing the average engineering salaries of men versus women.
Comparing the fraction of defectives from two production lines.
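The second example, comparing the fraction of defectives from two production lines, can be sketched as a two-proportion z-test. The module does not give this formula; the pooled form used below is the usual textbook one and is an assumption here, as are the defect counts.

```python
import math
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: the two population proportions are equal."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)   # common proportion under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Hypothetical counts: 30 defectives in 200 items from line 1,
# 18 defectives in 200 items from line 2.
z = two_proportion_z(30, 200, 18, 200)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-tailed
print(round(z, 3), round(p_value, 3))
print("reject H0" if p_value < 0.05 else "do not reject H0")
```

With these made-up counts the p-value is above 0.05, so the difference between the two lines would not be declared significant at the 5% level.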

2. T-test
- It was derived by W.S. Gosset in 1908.
Properties of t distribution:
• It has mean 0
• It has variance greater than one
• It is a bell-shaped distribution, symmetrical about the mean
Assumptions for t-test
• Sample must be random, observations independent
• Standard deviation is not known
• Normal distribution of population
Uses of t-test
• The mean of the sample
• The difference between means or to compare two samples
• Correlation coefficient
Types of T-test
• Paired t-test
• Unpaired t-test
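A paired t-test can be sketched as follows. The formula used, t = mean of the differences divided by (standard deviation of the differences / √n) with n − 1 degrees of freedom, is the standard one and is not spelled out in the module; the before/after readings are hypothetical.

```python
import math
from statistics import mean, stdev


def paired_t(before, after):
    """Paired t statistic and degrees of freedom for matched observations."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    # stdev() is the sample standard deviation (divides by n - 1).
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical readings on the same five subjects.
before = [10, 12, 9, 11, 13]
after = [12, 14, 10, 13, 14]
t, df = paired_t(before, after)
print(round(t, 3), df)

# The critical value is read from a t table; 2.776 is the two-tailed value
# for df = 4 at alpha = 0.05.
t_critical = 2.776
print("reject H0" if abs(t) > t_critical else "do not reject H0")
```

An unpaired (independent) t-test uses the two groups' means and standard deviations separately instead of the per-subject differences.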

3. ANALYSIS OF VARIANCE (ANOVA)

• is a collection of statistical models, and their associated procedures, used to analyze the differences
between group means (such as the “variation” among and between groups).
• Compares multiple groups at one time.
• Developed by R.A. Fisher (1935). He clearly explained the relationship among the mean, the
variance, and the normal distribution: "The normal distribution has only two characteristics, its
mean and its variance. The mean determines the bias of our estimate, and the variance determines
its precision." It is generally known that the estimation is more precise as the variance becomes
smaller and smaller.
• Two Types: i. One way ANOVA
ii. Two way ANOVA

There are three assumptions in Analysis of Variance (ANOVA):

• Observations are independent.
• The sample data have a normal distribution.
• Scores in different groups have homogeneous variances.

One Way ANOVA

• It compares three or more unmatched groups when data are categorized in one way.
Examples: 1. Compare control group with three different doses of aspirin in rats.
2. Effect of supplement of Vit C in each subject before, during and after the
treatment.
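To show where the entries of a one-way ANOVA table come from, here is a minimal Python sketch. The three groups of measurements are hypothetical, and the dictionary keys are illustrative names for the usual table columns (sum of squares, degrees of freedom, mean square, computed F).

```python
def one_way_anova(groups):
    """Return the main entries of a one-way ANOVA table."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Between groups: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within groups: how far each value is from its own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {"ss_between": ss_between, "ss_within": ss_within,
            "df_between": df_between, "df_within": df_within,
            "ms_between": ms_between, "ms_within": ms_within,
            "F": ms_between / ms_within}


# Hypothetical measurements for three unmatched groups.
table = one_way_anova([[4, 5, 6], [6, 7, 8], [9, 10, 11]])
print(table["df_between"], table["df_within"], round(table["F"], 2))
```

The computed F is then compared with the tabular F for (df between, df within) at the chosen α. For 2 and 6 degrees of freedom at α = 0.05 a standard F table gives 5.14, so these made-up groups would differ significantly.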

Two Way ANOVA

• Used to determine the effect of two nominal predictor variables on a continuous outcome variable.
• A two way ANOVA test analyzes the effect of the independent variables on the expected outcome
along with their relationship to the outcome itself.

Difference Between One Way and Two Way ANOVA

• An example of a one-way ANOVA
To determine if there is a difference in the mean height of stalks of three different types of
seed.
Since there are more than two means, we can use a one-way ANOVA because there is only one factor
that could be making the heights different.
• An example of a two-way ANOVA
To determine the mean height of the stalks using the three different types of seeds and
three different types of fertilizers.
The types of seed could cause the change, the types of fertilizers could cause the change, and/or
there is an interaction between the type of seed and the type of fertilizer.
There are two factors here (type of seed and type of fertilizer), so if the assumptions hold, then we can use
a two-way ANOVA.

SCHEFFÉ TEST
• Is used to find out which pairs of means are significantly different, after performing the ANOVA and
getting a significant F-statistic (i.e. rejecting the null hypothesis that the means are the same).
• The critical value for the Scheffé test is the degrees of freedom for the between variance times the
critical value for the one-way ANOVA.
• This simplifies to: $F' = (k-1)\,F_{(k-1,\;N-k,\;\alpha)}$

How to conduct the Scheffé test?
• In order to conduct the Scheffé test, one must compare the means two at a time, using all
possible combinations of means.
• For example, if there are three means, the following comparisons must be done:
X1 versus X2
X1 versus X3
X2 versus X3
• To find the critical value F’ of the Scheffé test: F’ = (k − 1)(CV), where CV is the critical value
of the one-way ANOVA.
Calculating the Scheffé Test
• The F’ test is run if the null hypothesis is rejected in an ANOVA test. Otherwise, the means are taken
to be equal and so there is no point in running this test.
• The null hypothesis for the test is that the two means being compared are the same: H0: μi = μj.
• The alternate hypothesis is that the means are not the same: HA: μi ≠ μj.
• As in most statistical tests, find a critical value and then compare it with a test statistic.
• Reject the null hypothesis if the Scheffé test statistic is greater than the critical value.
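The pairwise comparisons can be sketched in Python. The module does not give the test statistic itself; the form assumed here is the one commonly used in textbooks, F_s = (mean_i − mean_j)² / [MS_within × (1/n_i + 1/n_j)]. The group means, sizes, MS within and the ANOVA critical value (5.14, the tabular F for 2 and 6 degrees of freedom at α = 0.05) are hypothetical inputs.

```python
from itertools import combinations


def scheffe(means, sizes, ms_within, f_crit):
    """Compare every pair of group means against the Scheffe critical value."""
    k = len(means)
    critical = (k - 1) * f_crit   # F' = (k - 1) x ANOVA critical value
    results = []
    for i, j in combinations(range(k), 2):
        f_s = (means[i] - means[j]) ** 2 / (
            ms_within * (1 / sizes[i] + 1 / sizes[j]))
        results.append((i + 1, j + 1, f_s, f_s > critical))
    return critical, results


# Hypothetical summary values from a one-way ANOVA with three groups.
critical, results = scheffe(means=[5, 7, 10], sizes=[3, 3, 3],
                            ms_within=1.0, f_crit=5.14)
print("critical value:", round(critical, 2))
for i, j, f_s, significant in results:
    print(f"X{i} versus X{j}: F_s = {f_s:.2f}, significant = {significant}")
```

With these made-up numbers, only the comparisons involving the third group exceed the critical value.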

Summary of Parametric Tests Applied for Different Types of Data

Sl. no.   Type of Group                                   Parametric test
1         Comparison of two paired groups                 Paired t-test
2         Comparison of two unpaired groups               Unpaired t-test
3         Comparison of three or more unmatched groups    One way ANOVA
4         Comparison of three or more matched groups      Two way ANOVA
5         Comparison of two variables                     Pearson Correlation

STATISTICAL CORRELATION

Statistical correlation
• is a statistical technique which tells us if two variables are related.
For example, consider the variables family income and family expenditure.
It is well known that income and expenditure increase or decrease together. Thus they are related in
the sense that change in any one variable is accompanied by change in the other variable.
Again, price and demand of a commodity are related variables; when price increases, demand will tend
to decrease, and vice versa.
If the change in one variable is accompanied by a change in the other, then the variables are said to
be correlated. We can therefore say that family income and family expenditure, price and demand are
correlated.

Relationship Between Variables

Correlation can tell you something about the relationship between variables. It is used to
understand:
• whether the relationship is positive or negative
• the strength of the relationship.
Correlation is a powerful tool that provides these vital pieces of information.

• In the case of family income and family expenditure, it is easy to see that they both rise or fall
together in the same direction. This is called positive correlation.
• In case of price and demand, change occurs in the opposite direction so that increase in one is
accompanied by decrease in the other. This is called negative correlation.

Coefficient of Correlation
Statistical correlation is measured by what is called coefficient of correlation (r). Its numerical
value ranges from +1.0 to -1.0. It gives us an indication of the strength of
relationship.
• In general, r > 0 indicates a positive relationship, r < 0 indicates a negative relationship, while r = 0
indicates no linear relationship between the variables.
• Here r = +1.0 describes a perfect positive correlation and r = -1.0 describes a perfect negative
correlation.
• The closer the coefficient is to +1.0 or -1.0, the greater the strength of the relationship between the
variables.
• As a rule of thumb, the following guidelines on strength of relationship are often useful (though
many experts would somewhat disagree on the choice of boundaries).

Value of r                      Strength of relationship
-1.0 to -0.5 or 0.5 to 1.0      Strong
-0.5 to -0.3 or 0.3 to 0.5      Moderate
-0.3 to -0.1 or 0.1 to 0.3      Weak
-0.1 to 0.1                     None or very weak
Correlation is only appropriate for examining the relationship between meaningful quantifiable data
(e.g. air pressure, temperature) rather than categorical data such as gender, favorite color etc.
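The rule-of-thumb ranges above can be written as a small lookup function. The ranges in the table share their end points, so how a boundary value such as exactly 0.5 is classified is an assumption made here (it is placed in the stronger category).

```python
def strength(r):
    """Classify a correlation coefficient using the rule-of-thumb ranges."""
    size = abs(r)   # the sign gives direction, the size gives strength
    if size >= 0.5:
        return "Strong"
    if size >= 0.3:
        return "Moderate"
    if size >= 0.1:
        return "Weak"
    return "None or very weak"


for value in (-0.74, 0.35, 0.2, 0.05):
    print(value, strength(value))
```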

Disadvantages
While 'r' (correlation coefficient) is a powerful tool, it has to be handled with care.
• The most used correlation coefficients only measure linear relationship. It is therefore perfectly
possible that while there is strong non linear relationship between the variables, r is close to 0 or
even 0. In such a case, a scatter diagram can roughly indicate the existence or otherwise of a non
linear relationship.
• One has to be careful in interpreting the value of 'r'. For example, one could compute 'r' between
shoe size and intelligence of individuals, or between heights and income. Irrespective of the value of 'r',
such a correlation makes no sense and is hence termed chance or nonsense correlation.
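The first caution can be seen with a tiny made-up data set in which y depends completely on x (y = x²), yet r comes out as 0 because the relationship is not linear. The helper `pearson_r` below uses a common computational form of the coefficient and is written out only to keep the sketch self-contained.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient from raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) *
                            (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]   # a perfect, but non-linear, relationship
r = pearson_r(xs, ys)
print(r)   # 0.0
```

A scatter diagram of these points would show the U-shaped pattern that r alone misses.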

Pearson Product-Moment Correlation

- is one of the measures of correlation which quantifies the strength as well as direction of such
relationship. It is usually denoted by Greek letter ρ.
In the study of relationships, two variables are said to be correlated if change in one variable is
accompanied by change in the other - either in the same or reverse direction.

Conditions
This coefficient is used if two conditions are satisfied:
• the variables are in the interval or ratio scale of measurement
• a linear relationship between them is suspected

Positive and Negative Correlation

The coefficient (ρ) is computed as the ratio of the covariance between the variables to the product of
their standard deviations. This formulation is advantageous.
• First, it tells us the direction of the relationship. Once the coefficient is computed, ρ > 0 will indicate
a positive relationship, ρ < 0 will indicate a negative relationship, while ρ = 0 indicates the non-existence
of any linear relationship.
• Second, it ensures (mathematically) that the numerical value of ρ ranges from -1.0 to +1.0. This
enables us to get an idea of the strength of the relationship, or rather the strength of the linear
relationship between the variables. The closer the coefficient is to +1.0 or -1.0, the greater the strength
of the linear relationship.
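Written as a formula, the verbal definition above (covariance divided by the product of the standard deviations) reads as follows. The symbols cov(X, Y), σ_X and σ_Y are notation introduced here for illustration.

```latex
\rho_{X,Y} = \frac{\operatorname{cov}(X, Y)}{\sigma_X \, \sigma_Y},
\qquad -1 \le \rho_{X,Y} \le +1
```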

As a rule of thumb, the following guidelines are often useful (though many experts could somewhat
disagree on the choice of boundaries).

Range of ρ
Value of ρ                      Strength of relationship
-1.0 to -0.5 or 0.5 to 1.0      Strong
-0.5 to -0.3 or 0.3 to 0.5      Moderate
-0.3 to -0.1 or 0.1 to 0.3      Weak
-0.1 to 0.1                     None or very weak
Properties of ρ
This measure of correlation has interesting properties, some of which are enunciated below:
• It is independent of the units of measurement.
• It is in fact unit free. For example, ρ between highest day temperature (in Centigrade) and rainfall
per day (in mm) is not expressed either in terms of centigrade or mm.
• It is symmetric. This means that ρ between X and Y is exactly the same as ρ between Y and X.
• Pearson's correlation coefficient is independent of change in origin and scale. Thus ρ between
temperature (in Centigrade) and rainfall (in mm) would numerically be equal to ρ between
temperature (in Fahrenheit) and rainfall (in cm).

Procedure:
1: Make a chart with your data for two variables, labeling the variables (x) and (y), and add three more
columns labeled (xy), (x²), and (y²). A simple data chart might look like this:
Person   Age (x)   Score (y)   (xy)    (x²)    (y²)
1
2
3
More data would be needed, but only three samples are shown for purposes of example.
2: Complete the chart using basic multiplication of the variable values.
Person   Age (x)   Score (y)   (xy)    (x²)    (y²)
1        20        30          600     400     900
2        24        20          480     576     400
3        17        27          459     289     729
3: After you have multiplied all the values to complete the chart, add up each of the columns from top to
bottom.
Person   Age (x)   Score (y)   (xy)    (x²)    (y²)
1        20        30          600     400     900
2        24        20          480     576     400
3        17        27          459     289     729
Total    61        77          1539    1265    2029

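4: Use this formula to find the Pearson correlation coefficient value.

Pearson Correlation Coefficient Formula

$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$

where n is the number of (x, y) pairs.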

5: Once you complete the formula above by plugging in all the correct values, the result is your coefficient
value!
If the value is a negative number, then there is a negative correlation, and if the value is a positive
number, then there is a positive correlation. The size of the value indicates the strength of the relationship.

Note: The above examples only show data for three people, but the ideal sample size to calculate a
Pearson correlation coefficient should be more than ten people.
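The procedure can be checked with a short Python sketch that builds the same column totals as the chart (Σx, Σy, Σxy, Σx², Σy²) for the three people shown. The computational form of the coefficient used in the code is the common raw-sums version and is an assumption here; the function name `pearson_r` is illustrative.

```python
import math

ages = [20, 24, 17]     # x values from the chart
scores = [30, 20, 27]   # y values from the chart


def pearson_r(xs, ys):
    """r = (n*Sxy - Sx*Sy) / sqrt((n*Sx2 - Sx^2) * (n*Sy2 - Sy^2))."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)                 # 61 and 77
    sum_xy = sum(x * y for x, y in zip(xs, ys))     # 1539
    sum_x2 = sum(x * x for x in xs)                 # 1265
    sum_y2 = sum(y * y for y in ys)                 # 2029
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) *
                            (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r = pearson_r(ages, scores)
print(round(r, 2))   # -0.74
```

The negative sign indicates a negative correlation for these three data points. As the note above says, far more than three people would be needed before drawing any conclusion.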

STEPS IN THE TEST OF SIGNIFICANCE

1. State the null hypothesis
2. State the alternative hypothesis
3. Set the level of significance associated with the null hypothesis (Type I Error)
4. Select the appropriate test statistic.
5. Compute the test statistic value.
6. Determine the critical value needed for rejection of the null hypothesis for the particular statistic.
7. Compare the obtained value to the critical value.
8. Make a decision: accept or reject the null hypothesis.

The type of statistical test you use depends on several factors:

• Number of independent variables
• Number of levels of the independent variables
• Number of dependent variables
• Independent vs. dependent samples (between vs. within group design)
• Scale of measurement of the dependent variables.

TITLE OF IP: Comparative Study on the pH and the Amount of Acetic Acid
in Different Fruit Vinegar
Table 3
Test of Significance on pH and the Amount of Acetic Acid in Different Fruit Vinegar
Using the ANOVA

Source of Variation   df   Sum of Squares   Mean Square   F-value (computed)   F-value (tabular)
Between groups        2    8510.88          3655.44       525.34               5.14
Within groups         6    41.01            8.002
TOTAL                 8    7562.89

Table 3 shows that the computed F-value of 525.34 is greater than the tabular value of 5.14. Therefore,
the null hypothesis is rejected. This means that there is a significant difference in the amount of NaOH solution
used to titrate 10 mL of each fruit vinegar. This further means that the amounts of NaOH solution used differ
significantly from one fruit vinegar to another.

GUIDELINES IN THE DISCUSSION OF RESULTS
• Process the data accurately.
• Consider the data that are not consistent with the hypothesis.
• Formulate explanations for apparent exceptions, inconsistencies and discrepancies.
• Look more closely at ‘deviant’ results in the light of the trends and overall effects.
• Propose explanations for trends and patterns in experimental observations and relate them to previous work
on the subject.
• State the implications of your findings for practical applications in industry, agriculture, medicine,
nutrition, or engineering.
• Stimulate the reader to think further about your study and to pursue research related to your
work.
• Use the past tense to state experimental observations and the present tense in making remarks
about data presentation. The discussion of results therefore uses both the present and the past tense.

Now that you know the guidelines for presenting and discussing data, you are ready to perform activities
to further understand the lesson.

What’s More
Activity 3. Let’s Calculate and Interpret!

1. t-Test for Two Independent Sample Means

The Alaminos City Water District made a study of a river community with regard to the amount of
semi-treated sewage released into the river. Five randomly selected specimens of river water were drawn at
a location above the town and another five below. The dissolved oxygen (DO) readings in parts per million
(ppm) are as follows:

Water Samples        Selected Rivers
                Above Town   Below Town
1               4.7          5.1
2               5.4          5.3
3               5.0          4.7
4               4.9          5.2
5               5.8          5.6

Test whether the mean DO content below the town is less than the mean DO content above the town.
Using a 0.05 level of significance, test the null hypothesis.
1. H0:                              HA:
2. Level of significance _________ N = ____________
3. Test statistic to be used:
4. t-critical value _________
5. Decision:
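
After working the activity by hand, you may check your computation with a short Python sketch. It assumes the pooled-variance (equal-variance) form of the t-test for two independent samples, which the activity does not specify. The function name pooled_t is hypothetical, and the critical value 1.860 (one-tailed, 0.05 level, 8 degrees of freedom) is taken from a standard t-table.

```python
import math
from statistics import mean, variance

def pooled_t(sample1, sample2):
    """t statistic and degrees of freedom for two independent samples,
    assuming equal population variances (pooled variance)."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Weighted average of the two sample variances
    pooled_var = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(sample1) - mean(sample2)) / std_error, df

above_town = [4.7, 5.4, 5.0, 4.9, 5.8]
below_town = [5.1, 5.3, 4.7, 5.2, 5.6]

# HA: mean below < mean above, so the test is left-tailed on (below - above)
t_value, df = pooled_t(below_town, above_town)
T_CRITICAL = 1.860  # assumed from a standard t-table: alpha = 0.05, one-tailed, df = 8

print(f"t = {t_value:.3f}, df = {df}")
print("Reject H0" if t_value < -T_CRITICAL else "Fail to reject H0")
```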
2. t-Test for Dependent Samples/Matched Pairs
Foton Tire Company tested the wearing qualities of two types of tires (A and B) it manufactures. For
comparison, seven samples of Tire A and seven samples of Tire B were randomly assigned to and mounted on
the rear wheels of seven automobiles, one tire of each type per automobile. The automobiles were allowed to
travel for a specific number of miles. The amount of wear of each tire was recorded. The following data were
obtained:

Automobile   1     2      3      4      5     6      7
Tire A       8.9   12.2   9.6    11.8   9.8   10.5   8.8
Tire B       9.2   10.5   11.2   10.5   9.7   11.7   8.5

Use a 0.05 level of significance to test the significance of the difference in the average wear of the two
tires.

1. H0:                              HA:
2. Level of significance _________ N = ____________
3. Test statistic to be used:
4. t-critical value _________
5. Decision:
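
To check a hand computation for the matched-pairs activity, a minimal Python sketch is shown here. It computes the t statistic from the per-automobile differences (Tire A minus Tire B). The function name paired_t is hypothetical, and the critical value 2.447 (two-tailed, 0.05 level, 6 degrees of freedom) is taken from a standard t-table.

```python
import math
from statistics import mean, stdev

def paired_t(first, second):
    """t statistic and degrees of freedom for dependent (matched-pair) samples."""
    if len(first) != len(second):
        raise ValueError("matched samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]  # one difference per pair
    n = len(diffs)
    std_error = stdev(diffs) / math.sqrt(n)         # standard error of the mean difference
    return mean(diffs) / std_error, n - 1

tire_a = [8.9, 12.2, 9.6, 11.8, 9.8, 10.5, 8.8]
tire_b = [9.2, 10.5, 11.2, 10.5, 9.7, 11.7, 8.5]

t_paired, df_paired = paired_t(tire_a, tire_b)
T_CRITICAL_TWO_TAILED = 2.447  # assumed from a standard t-table: alpha = 0.05, df = 6

print(f"t = {t_paired:.3f}, df = {df_paired}")
print("Reject H0" if abs(t_paired) > T_CRITICAL_TWO_TAILED else "Fail to reject H0")
```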
3. One-Way ANOVA

To determine if there is a significant difference in the average rice yield using four types of
fertilizers, an experiment was conducted and the results are shown below. Using a 0.05 level of
significance, test the hypothesis.

Plot   Fertilizer
       A    B    C    D
1      22   15   20   18
2      25   18   24   21
3      28   16   23   16
4      32   15   19   22
5      26   19   17   18

1. H0:
2. HA:
3. Level of significance:
4. Test statistic:
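
The F-value for this activity can be checked with a minimal Python sketch of the one-way ANOVA computation: sums of squares between and within groups, their mean squares, and the ratio. The function name one_way_anova is hypothetical, and the critical value 3.24 (0.05 level, 3 and 16 degrees of freedom) is taken from a standard F-table.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of treatment groups."""
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

fertilizer_a = [22, 25, 28, 32, 26]
fertilizer_b = [15, 18, 16, 15, 19]
fertilizer_c = [20, 24, 23, 19, 17]
fertilizer_d = [18, 21, 16, 22, 18]

f_value, df_b, df_w = one_way_anova([fertilizer_a, fertilizer_b, fertilizer_c, fertilizer_d])
F_CRITICAL = 3.24  # assumed from a standard F-table: alpha = 0.05, df = (3, 16)

print(f"F = {f_value:.2f}, df = ({df_b}, {df_w})")
print("Reject H0" if f_value > F_CRITICAL else "Fail to reject H0")
```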

4. Pearson Product-Moment Correlation

Table 1 shows the growth of a cell culture, measured as optical density, at different pH levels.
Table 1
pH and Optical Density of Cell

pH     Optical Density
3      0.1
4      0.2
4.5    0.25
5      0.32
5.5    0.33
6      0.35
6.5    0.47
7      0.49
7.5    0.53

a. Compute the correlation coefficient, r.

b. State the relationship between the growth of the cell culture and the pH level.

What I Can Do

PERFORMANCE TASK
Direction: Present the tables, the statistical analysis with the interpretation of your results, and the
discussion part of your investigatory project.

Your output will be evaluated using the following criteria:

Completeness of the methodology ……………….. 40%
Technical Writing Skills ……………….. 40%
Scientific Thought ……………….. 20%

Assessment

Multiple Choice. Choose the letter of the best answer. Write the chosen letter on a separate sheet of paper.
1. Which of the following statements about the Chi-Square test is INCORRECT?
A. It is widely used as a non-parametric test for nominally scaled data
B. It is used only when the population is viewed as only two classes, e.g. Male - female.
C. It tests for significant differences between the observed and expected distributions
D. It can also be used with k independent samples.
2. When do we say that two variables are correlated?
A. If a change in one variable changes the other variable in the same direction.
B. If a change in one variable changes the other variable in the reverse direction.
C. If a change in one variable changes the other variable, either in the same or in the reverse
direction.
D. If a change in one variable does not change the other variable.

3. Which of the following shows a strong direct relationship between variables x and y?
A. Correlation value is greater than 0
B. Correlation value is closer to +1
C. Correlation value is equal to zero
D. Correlation value is closer to -1

4. In hypothesis testing, how do you compare the two types of errors?

A. Type I error occurs when one rejects a null hypothesis that is true, while Type II error occurs when one
fails to reject a null hypothesis that is false.
B. Type I error occurs when one fails to reject a null hypothesis that is true, while Type II error
occurs when one rejects a null hypothesis that is false.
C. The probability of committing a Type I error is called beta, while the probability of
committing a Type II error is called alpha.
D. The probability of committing a Type I error is called the power of the test, while the
probability of committing a Type II error is called the significance level.
5. Which of the following is NOT TRUE about parametric tests?
A. The t-test is a one-sample mean test with an unknown population standard deviation.
B. The z-test is a one-sample mean test with a known population standard deviation.
C. The t-test is used for comparing the means of two populations with large samples.
D. The z-test is used for testing the mean of a population against a standard.
6. When presenting the results of a statistical analysis of the data gathered, what should
NOT be done?
A. Put the data in a table showing the sources of variance, sums of squares, mean squares, F ratios, and
probability values.
B. Indicate which differences in treatment means were statistically significant without
describing what the means actually were.
C. State in sentence form what comparison is being evaluated, the value obtained for the
statistic, and the level of significance achieved.
D. Place all the statistical computations made to come up with the results.
7. Why is the discussion section important in a research paper?
A. It justifies the purpose of the study.
B. It explains why the results of the study don't really mean anything.
C. It lists all the things that went wrong as one attempted to carry out the study.
D. It interprets the results and indicates how they fit with previous research and theory.
8. What is the purpose of inferential statistics?
A. It provides a measure of central tendency and a measure of variability for the data.
B. It determines whether a correlation coefficient is positive, negative, or zero.
C. It determines whether a statistical result is significant.
D. It determines whether to accept or reject the null hypothesis.
9. A clinical trial was designed to investigate the effectiveness of a new drug to reduce symptoms of asthma
in children. A total of n=10 participants are randomized to receive either the new drug or a placebo.
Participants are asked to record the number of episodes of shortness of breath over a 1 week period following
receipt of the assigned treatment. The data are shown below.
            Day 1   Day 2   Day 3   Day 4   Day 5
Placebo     7       5       6       4       12
New Drug    3       6       4       2       1
