ResearchIIQ3Mod3Wk5 6withanswerkey
RESEARCH II
QUARTER 3 – MODULE 3
Week 5 and 6
We hope that through this material, you will experience meaningful learning and
gain deep understanding of the relevant competencies. You can do it!
This module was designed and written with you in mind. This module will help you
understand the different data collection methods and tools in your research work.
The scope of this module permits it to be used in many different learning situations.
The language used recognizes the diverse vocabulary level of students. The lessons
are arranged to follow the standard sequence of the course.
Multiple Choice. Before you study the lessons of this module, answer the following
to test your existing knowledge. Write the capital letter of the correct answer on
your answer sheet.
___1. To create a frequency table, which of the following should be clicked
sequentially in the Data View of SPSS?
A. Analyze > Descriptive Statistics > Frequencies
B. Analyze > Descriptive Statistics > Descriptives
C. Analyze > Descriptive Statistics > Crosstabs
D. Analyze > Compare Means > Means
___2. What statistical measure is being answered by the question “What is the
"shape" of the distribution?”?
A. Mean C. Skewness
B. Variance D. Standard deviation
___4. In SPSS, the Frequencies procedure can compute the following statistics for one
or more continuous variables except __________.
A. N valid response C. Variance
B. Differences D. Range
___5. In the SPSS Frequencies window, what does the “Statistics” button contain or
display when clicked?
A. frequency tables will be printed
B. contains various graphical options
C. contains various descriptive statistics
D. contains options for how to sort and organize the table output
___6.
The figure above displays what type of window in frequencies procedure?
A. Charts window C. Statistics window
B. Format window D. Style window
___7. Which of the following is NOT displayed in the format window of frequencies
procedure?
A. Multiple Variable options C. Order options
B. Organize output options D. Charts
___8. In making charts, which of the following is the appropriate option for
continuous variables?
A. Bar chart C. Histograms
B. Pie chart D. none
___10. Which of the following is the correct entry in the syntax window for
running descriptive statistics for the mean, standard deviation, minimum, and
maximum of the variables?
A. DESCRIPTIVES VARIABLES=/STATISTICS=MEAN STDDEV MIN MAX.
B. DESCRIPTIVES VARIABLES=/FREQ=MEAN STDDEV MIN MAX.
C. FREQUENCY VARIABLES=/STATISTICS=MEAN STDDEV MIN MAX.
D. FREQUENCY VARIABLES=/DESC=MEAN STDDEV MIN MAX.
___11.
Based on the table above, which of the following statements is correct?
A. Writing has the highest average
B. Math has the highest standard deviation of scores
C. Reading has the lowest standard deviation of scores
D. The averages of English and Math scores were very close
___12. The descriptive procedure in SPSS can compute __________ as new variables
in the dataset.
A. differences C. standard deviations
B. correlations D. z scores
___13. In the descriptive window, you can print the variables in the same order
that they are specified when you choose __________.
A. Variable list
B. Alphabetically
C. Ascending Means
D. Descending means
___14. Which SPSS procedure is useful when you want to summarize and
compare differences in descriptive statistics across one or more factors, or
categorical variables?
A. Frequency procedure C. Compare Means procedure
B. Descriptive procedure D. Correlation procedure
___15. In frequency procedure format window, which of the following order options
should you click to arrange the rows of the frequency table in decreasing order
with respect to the category values?
A. ascending values C. ascending counts
B. descending values D. descending counts
___16. In the output view of the frequency table, which of the following column
displays the percentage of observations in the category out of the total number
of nonmissing responses?
A. percent C. valid percent
B. frequency D. cumulative percent
___17.
In the frequency table above the value 59.9 in the last column indicates ______.
A. cumulative percentage of freshman and sophomore
B. cumulative percentage of all grade level
C. frequency of freshman and sophomore
D. percentage of junior and senior
___18. In comparing means, these are the categorical variable(s) that will be used
to subset the dependent variables.
A. Dependent list C. Variable list
B. Independent list D. Data list
___19. Which of the following charts displays the categories on the graph's x-axis,
and either the frequencies or the percentages on the y-axis?
A. Pie chart
B. Bar chart
C. Histogram
D. Scatter plot
___20. In frequencies procedure format window, the frequency tables for all of the
variables will appear first, and all of the graphs for the variables will appear
after when __________ is selected.
A. Compare variables
B. Organize output by variables
C. Suppress tables
D. Chart type
Lesson 1 Descriptive Stats
What’s In
You have learned from the previous modules how to use statistical software,
especially SPSS, for your research work, particularly to analyze your data.
You have also learned how to compute and interpret statistical analyses using the
t test, analysis of variance, and correlations. In this module, you will study how to
run descriptive statistics and compare means.
What’s New
Before doing any kind of statistical testing or model building, you should
always examine your data using summary statistics and graphs. This process is
called exploratory data analysis, and it's a crucial part of every research project.
Exploratory data analysis is about "getting to know" your data: which values are
typical, which values are unusual; where is it centered, how spread out is it; what
are its extremes. More importantly, it's an opportunity to identify and correct any
problems in your data that would affect the conclusions you draw from your
analysis.
What is It
Most of the statistics in the Central Tendency, Dispersion,
and Distribution groups are valid for continuous variables; the only exception is
the Mode, which very rarely has a useful interpretation for situations involving
continuous variables. Most of these statistics are identical to the ones that can be
obtained with Descriptives, Compare Means, or Explore. One noticeable exception
to this is the Percentile Values group, which is unique to the Frequencies
procedure:
The Quartiles option produces the first, second, and third quartiles (i.e., the
25th, 50th, and 75th percentiles, respectively).
The Cut points for n equal groups option will divide the dataset
into n equally sized groups and report the percentiles. For example, if the user
specifies n=5, then the output will report the 20th, 40th, 60th, and 80th percentiles.
Or, if the user specifies n=10, then the output will report the 10th, 20th, 30th, ...,
90th percentiles.
The Percentiles option allows the user to specify the exact percentiles to
report. The percentiles should be entered as whole numbers.
You can select more than one option in the Percentile Values group. If your
selections request overlapping information, that information will not be printed
twice.
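The Percentile Values options can also be illustrated outside SPSS. The following is a minimal Python sketch, not SPSS output. The list of scores is hypothetical, and Python's statistics.quantiles function (with its default "exclusive" method) is assumed here to behave similarly to the SPSS default percentile method; small differences are possible.

```python
import statistics

# Hypothetical test scores (not the sample dataset): the integers 1 to 99.
scores = list(range(1, 100))

# Quartiles: cut points for 4 equal groups (25th, 50th, 75th percentiles).
quartiles = statistics.quantiles(scores, n=4)

# Cut points for 5 equal groups (20th, 40th, 60th, 80th percentiles).
quintiles = statistics.quantiles(scores, n=5)

# Specific percentiles: overlapping requests are reported only once.
requested = sorted(set([25, 50, 75] + [50, 90]))
all_percentiles = statistics.quantiles(scores, n=100)  # 1st to 99th
custom = {p: all_percentiles[p - 1] for p in requested}

print(quartiles)
print(quintiles)
print(custom)
```

With n=5 the sketch returns four cut points, which matches the description above: dividing data into n groups always needs n - 1 percentiles.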
Note: The Values are group midpoints check box should only be selected
when your data values represent the midpoint of a range. For example, this would
be the case if you had coded anyone between the ages of 30 and 39 as 35
(source: IBM SPSS Statistics Information Center). This situation is more often
associated with ordinal categorical variables.
Note that the options in the Chart Values area apply only to bar charts. These
buttons will be greyed out if the radio button for Histograms is selected.
D Format: Opens the Frequencies: Format window, which contains options for how
to sort and organize the table output.
The Order by options are not relevant to continuous variables, but the Multiple
Variables options allow for customization of output when two or more continuous
variables are specified.
Compare variables places the descriptive statistics for the numeric variables
side-by-side.
Organize output by variables creates separate summary tables for each numeric
variable.
E Display frequency tables: When checked, frequency tables will be printed. (This
box is checked by default.) If this check box is not checked, no frequency tables will
be produced, and the only output will come from supplementary options
from Statistics or Charts. You will want to uncheck this box if using the
Frequencies procedure on a continuous numeric variable. (If this box is left
checked, a frequency table will be produced where each unique number is treated
as its own category. This could lead to a table with 100+ categories, depending on
the number of observations in your dataset.)
4. Click Format. In the Multiple Variables area, make sure that Compare
variables is selected. Then click Continue.
5. Uncheck the box for Display frequency tables. When finished, click OK.
Using Syntax
FREQUENCIES VARIABLES=English Reading Math Writing
/FORMAT=NOTABLE
/NTILES=5
/ORDER=ANALYSIS.
OUTPUT
There is only one box, Statistics, that will print to the Output window. This box will
contain the number of valid and missing values for each variable, as well as any
additional statistics we requested (in this case, the quintiles).
Note that, by default, SPSS will determine how many decimal places to use
for the percentiles based on the variable's number of decimal places. For this
screenshot, we have shortened the output to one decimal place for readability.
The Compare variables option we selected told SPSS to put the results for all
four variables in a single table, side-by-side. From this, we can quickly make several
observations about the data:
Writing had the most missing test scores (31), while Reading had the fewest
(10). The Math test had the lowest scores in general: the bottom 20% of students
scored below 59.3, and the top 20% scored above 72.1. Contrast this with the English
test scores, where the bottom 20% scored below 77.1, and the top 20% scored above
88.2.
Additionally, the Descriptives procedure can optionally compute Z scores as new
variables in your dataset.
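To make the idea of Z scores concrete, here is a minimal Python sketch of the calculation. It is an illustration only: the scores are hypothetical, the name z_math is made up, and the use of the sample standard deviation (dividing by n - 1) is an assumption about how the standardized values are computed.

```python
import statistics

# Hypothetical Math scores; None marks a missing value.
math_scores = [60.0, 70.0, 80.0, None, 90.0]

# Use only the valid (nonmissing) cases for the mean and standard deviation.
valid = [x for x in math_scores if x is not None]
mean = statistics.mean(valid)
sd = statistics.stdev(valid)  # sample standard deviation (n - 1), an assumption

# New "variable": one z score per case; a missing score stays missing.
z_math = [None if x is None else (x - mean) / sd for x in math_scores]

print(z_math)
```

A z score tells how many standard deviations a score lies above (positive) or below (negative) the mean, so the valid z scores always average to zero.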
Descriptives
To run the Descriptives procedure, select Analyze > Descriptive Statistics >
Descriptives.
The Descriptives window lists all of the variables in your dataset in the left
column. To select variables for analysis, click on the variable name to highlight it,
then click on the arrow button to move the variable to the column on the right.
Alternatively, you can double-click on the name of a variable to move it to the
column on the right.
PROBLEM STATEMENT
The sample dataset has test scores (out of 100) on four placement tests: English,
Reading, Math, and Writing. We want to compare the summary statistics of these
four tests so we can determine which tests the students tended to do the best and
the worst on.
OUTPUT
Here we see a side-by-side comparison of the descriptive statistics for the four
numeric variables. This allows us to quickly make the following observations about
the data:
• Some students were missing scores for the English test.
• The maximum scores observed on the English and the Reading tests exceed
100 points, which was supposed to be the maximum possible score. This
could indicate a problem with data entry, or could indicate an issue with the
scoring method. Before proceeding with any other data analysis, we would
need to resolve the issues with these measurements.
• The minimum Math score was far lower than the minimum scores for the
other sections of the test.
• The averages of the English and Reading scores were very close.
• Math had the lowest average score of the four sections, but the highest
standard deviation in scores.
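The kind of check described in the observations above can be sketched in Python. This is only an illustration under stated assumptions: the scores below are hypothetical (not the sample dataset), and the helper name describe is made up for this example.

```python
import statistics

MAX_POSSIBLE = 100  # the tests are scored out of 100

# Hypothetical scores; None marks a missing score.
scores = {
    "English": [82.0, 91.5, None, 101.2, 77.0],
    "Math": [35.0, 66.0, 72.5, 58.0, 90.0],
}

def describe(values):
    """Return N, missing count, mean, SD, minimum, and maximum."""
    valid = [v for v in values if v is not None]
    return {
        "n": len(valid),
        "missing": len(values) - len(valid),
        "mean": statistics.mean(valid),
        "sd": statistics.stdev(valid),
        "min": min(valid),
        "max": max(valid),
    }

summary = {name: describe(vals) for name, vals in scores.items()}

# Flag any variable whose maximum exceeds the highest possible score.
out_of_range = [name for name, s in summary.items() if s["max"] > MAX_POSSIBLE]

print(summary)
print(out_of_range)
```

A variable that appears in out_of_range should be checked for data entry or scoring problems before any further analysis.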
The Statistics column on the left shows what statistics are available.
Summary statistics available include: mean, number of cases, standard deviation,
median, grouped median, standard error of mean, sum, minimum, maximum,
range, first, last, variance, kurtosis, standard error of kurtosis, skewness, standard
error of skewness, harmonic mean, geometric mean, percent of total sum, and
percent of total N. The Cell Statistics column on the right lists the statistics that
will be produced in the output. By default, the mean, number of cases, and standard
deviation will be computed. You can add additional statistics by clicking and
dragging them from the Statistics column to the Cell Statistics column. You can
also click and drag items in the Cell Statistics column to change the order in which
they appear in the output.
The Statistics for First Layer area includes options that will perform one-way
ANOVA and compute linear fit statistics (R, R², Eta, and Eta Squared),
respectively.
MEANS TABLES=MileMinDur
/CELLS=MEAN COUNT STDDEV MIN MAX.
Output
The Compare Means procedure will report two tables: the Case Processing
Summary, which contains information about the number of valid cases that the
statistics are based on, and the Report table, which contains the descriptive
statistics themselves.
The average mile time overall was 8 minutes, 9 seconds, with a standard deviation
of about 2 minutes. The fastest mile time was about 5 minutes; the slowest was
about 14 minutes.
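When a categorical factor is added, the same summary is produced separately for each group. The following Python sketch illustrates that idea only. The records, the group labels, and the report layout are all hypothetical and do not come from the sample dataset.

```python
import statistics
from collections import defaultdict

# Hypothetical records: (group label, mile time in minutes).
records = [
    ("Athlete", 6.5),
    ("Athlete", 7.0),
    ("Non-athlete", 9.0),
    ("Non-athlete", 10.0),
    ("Non-athlete", 8.0),
]

# Split the dependent variable (mile time) by the factor (group).
groups = defaultdict(list)
for group, minutes in records:
    groups[group].append(minutes)

def cell_stats(values):
    """Default cell statistics: mean, number of cases, standard deviation."""
    return {
        "mean": statistics.mean(values),
        "n": len(values),
        "sd": statistics.stdev(values),
    }

report = {group: cell_stats(values) for group, values in groups.items()}
report["Total"] = cell_stats([minutes for _, minutes in records])

for row, stats in report.items():
    print(row, stats)
```

Each row of the report corresponds to one category of the factor, and the Total row repeats the statistics for all cases combined.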
What’s More
Activity 1
Directions. Given the data below (from a study on the core-subject academic performance
of students in different curricular programs of Tayug NHS), run descriptive
statistics using COMPARE MEANS in SPSS on these variables: gender,
program, and grades. Submit your Data Editor and Output View files with your
interpretation to your research teacher.
(The last five columns are the final grades in each subject.)
Student ID No. Gender Age Program Filipino English Math Science A.P.
36312 M 15 SPS 96 91 88 93 94
36311 F 16 SSC 91 92 91 96 91
36315 F 16 SPA 94 91 86 94 92
36378 F 15 SPM 97 93 91 89 93
36324 F 14 SPM 92 92 94 88 91
36359 F 15 SSC 93 95 90 98 95
36323 M 15 SSC 94 95 93 94 96
36345 M 16 SPS 92 93 91 90 91
36302 M 15 SPS 90 94 94 86 89
36308 F 16 SPS 91 92 91 92 97
36329 F 17 SPS 93 97 93 91 92
36307 M 16 SSC 96 95 92 93 91
36399 F 16 SSC 94 96 95 91 91
36381 M 16 SPM 93 95 92 89 90
36322 F 15 SPM 94 92 98 90 92
36368 M 15 SPM 97 94 97 91 92
36378 M 17 SPA 95 95 93 94 93
36309 F 14 SPA 95 92 91 93 94
36337 M 16 SPA 95 93 89 97 91
36392 M 15 SPA 96 91 89 92 88
Criteria for Scoring
Accuracy of Execution of the descriptive stat - 15 points
Clear and Concise output view of the stat- 10 points
Interpretation of results- 5 points
Total 30 points
Lesson 2 Frequency Tables
What is It
FREQUENCY TABLES
In SPSS, the Frequencies procedure is primarily used to create frequency tables,
bar charts, and pie charts for categorical variables.
Create a Frequency Table in SPSS
In SPSS, the Frequencies procedure can produce summary measures for categorical
variables in the form of frequency tables, bar charts, or pie charts.
To run the Frequencies procedure, click Analyze > Descriptive Statistics >
Frequencies.
The vast majority of the descriptive statistics available in the Frequencies:
Statistics window are never appropriate for nominal variables, and are rarely
appropriate for ordinal variables in most situations. There are two exceptions to
this:
• The Mode (which is the most frequent response) has a clear interpretation
when applied to most nominal and ordinal categorical variables.
• The Values are group midpoints option can be applied to certain ordinal
variables that have been coded in such a way that their value takes on the
midpoint of a range. For example, this would be the case if you had measured
subjects' ages and had coded anyone between the ages of 20 and 29 as 25,
or between 30 and 39 as 35 (source: IBM SPSS Statistics Information
Center).
If your categorical variables are coded numerically, it is very easy to misuse
measures like the mean and standard deviation. SPSS will compute those statistics
if they are requested, regardless of whether or not they are meaningful. It is up to
the researcher to determine if these measures are appropriate for their data. In
general, you should never use any of these statistics for dichotomous variables or
nominal variables, and should only use these statistics with caution for ordinal
variables.
• Bar chart displays the categories on the graph's x-axis, and either the
frequencies or the percentages on the y-axis
• Pie chart depicts the categories of a variable as "slices" of a circular "pie".
Note that the options in the Chart Values area apply only to bar charts and pie
charts. In particular, these options affect whether the labeling for the pie slices or
the y-axis of the bar chart uses counts or percentages. This setting will be greyed out
if Histograms is selected.
D Format: Opens the Frequencies: Format window, which contains options for
how to sort and organize the table output.
The Order by options affect only categorical variables:
• Ascending values arranges the rows of the frequency table in increasing
order with respect to the category values: (alphabetically if string, or by
numeric code if numeric)
• Descending values arranges the rows of the frequency table in decreasing
order with respect to the category values.
• Note: If your categorical variable is coded numerically as 0, 1, 2, ...,
sorting by ascending or descending value will arrange the rows with
respect to the numeric code, not with respect to any assigned labels.
• Ascending counts orders the rows of the frequency table from least frequent
(lowest count) to most frequent (highest count).
• Descending counts orders the rows of the frequency table from most
frequent (highest count) to least frequent (lowest count).
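The four Order by options can be sketched in Python as four ways of sorting the same frequency table. This is an illustration only: the coded responses are hypothetical, and the codes 1 to 4 are assumed to stand for Freshman to Senior.

```python
from collections import Counter

# Hypothetical numeric codes for class rank (1 = Freshman ... 4 = Senior).
ranks = [1, 1, 1, 2, 3, 3, 4, 2, 1, 3]

# Frequency table as (category value, count) pairs.
counts = Counter(ranks)

# Ascending / descending values: sort rows by the category code.
ascending_values = sorted(counts.items())
descending_values = sorted(counts.items(), reverse=True)

# Ascending / descending counts: sort rows by how often each code occurs.
ascending_counts = sorted(counts.items(), key=lambda row: row[1])
descending_counts = sorted(counts.items(), key=lambda row: row[1], reverse=True)

print(ascending_values)
print(descending_values)
print(ascending_counts)
print(descending_counts)
```

Note that the value-based orders sort on the numeric code, not on the label attached to it, which is the same behavior described in the note above.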
When working with two or more categorical variables, the Multiple
Variables options only affect the order of the output. If Compare variables is
selected, then the frequency tables for all of the variables will appear first, and all
of the graphs for the variables will appear after. If Organize output by variables is
selected, then the frequency table and graph for the first variable will appear
together; then the frequency table and graph for the second variable will appear
together; etc.
E Display frequency tables: When checked, frequency tables will be printed. (This
box is checked by default.) If this check box is not checked, no frequency tables will
be produced, and the only output will come from supplementary options
from Statistics or Charts. For categorical variables, you will usually want to
leave this box checked.
What should I do if I create a frequency table in SPSS and one of the rows is
blank?
You may create a frequency table using a string variable and notice that
the first row has a blank category label, similar to this example:
This particular issue is specific to frequency tables created from string
variables. The blank row represents observations with missing values. SPSS
does not automatically recognize blank (i.e., empty) strings as missing values, so
the blank values appear as one of the "Valid" (i.e., non-missing) categories.
This issue should not be ignored! When missing values are treated as valid
values, it causes the "Valid Percent" columns to be calculated incorrectly. If the
blank values were correctly treated as missing values, the valid, non-missing
sample size for this table would be 314 + 94 = 408 -- not 435! -- and the valid
percent values would change to 314/408 = 77.0% and 94/408 = 23.0%. Depending
on the number of missing values in your sample, the differences could be even more
dramatic.
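The effect of blank strings on the Valid Percent column can be checked with a short Python sketch. The counts 314 and 94 come from the example above, and 27 blanks is derived as 435 - 408. The labels "Category A" and "Category B" and the function name valid_percent are placeholders, since the example does not name the categories.

```python
# Counts from the example: blank strings plus two real categories.
# "Category A" and "Category B" are placeholder labels.
counts = {"": 27, "Category A": 314, "Category B": 94}

def valid_percent(counts, missing_values=()):
    """Percent of each category out of the nonmissing total, to 1 decimal."""
    valid = {k: v for k, v in counts.items() if k not in missing_values}
    total = sum(valid.values())
    return {k: round(100 * v / total, 1) for k, v in valid.items()}

# Blank strings wrongly counted as a valid category (denominator 435).
blank_as_valid = valid_percent(counts)

# Blank strings correctly declared as missing (denominator 408).
blank_as_missing = valid_percent(counts, missing_values=("",))

print(blank_as_valid)
print(blank_as_missing)
```

Comparing the two results shows how every valid percent shifts once the blanks are removed from the denominator.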
Using the sample dataset, let's create a frequency table and a corresponding bar
chart for the class rank (variable Rank), and let's also request the Mode statistic for
this variable.
RUNNING THE PROCEDURE
Using the Frequencies Dialog Window
1. Open the Frequencies window (Analyze > Descriptive Statistics >
Frequencies) and double-click on variable Rank.
2. To request the mode statistic, click Statistics. Check the box next
to Mode, then click Continue.
3. To turn on the bar chart option, click Charts. Select the radio button
for Bar Charts. Then click Continue.
4. When finished, click OK.
Using Syntax
FREQUENCIES VARIABLES=Rank
/STATISTICS=MODE
/BARCHART FREQ
/ORDER=ANALYSIS.
OUTPUT
Two tables appear in the output: Statistics, which reports the number of missing
and nonmissing observations in the dataset, plus any requested statistics; and
the frequency table for variable Rank. The table title for the frequency table is
determined by the variable's label (or the variable name, if a label is not assigned).
Here, the Statistics table shows that there are 406 valid and 29 missing values. It
also shows the Mode statistic: here, the mode value is "1", which is the numeric
code for the category Freshman. Notice that the Mode statistic isn't displaying the
value labels, even though they have been assigned. (For this reason, we
recommend not requesting the mode statistic; instead, determine the mode from
the frequency table.)
Notice how the rows are grouped into "Valid" and "Missing" sections. This
grouping allows for easy comparison of missing versus nonmissing observations.
Note that "System" missing responses are observations that use SPSS's default
symbol -- a period (.) -- for indicating missing values. If a user has assigned special
codes for missing values in the Variable View window, those codes would appear
here.
The frequency table contains four columns of summary measures:
• The Frequency column indicates how many observations fell into the given
category.
• The sample contained a total of 435 students. Of those students, 29
did not specify their class rank.
• The Percent column indicates the percentage of observations in that
category out of all observations (both missing and nonmissing). You can
verify the proportions for each group by dividing its count in the "frequency"
column by the value in the last row of the table (435):
• Freshman: 147/435 = 33.8%
• Sophomore: 96/435 = 22.1%
• Junior: 98/435 = 22.5%
• Senior: 65/435 = 14.9%
• Valid Total: 406/435 = 93.3%
• Missing: 29/435 = 6.7%
• The Valid Percent column displays the percentage of observations in that
category out of the total number of nonmissing responses. You can verify the
proportions for each group by dividing its count in the "frequency" column
by the value of "Total" that appears after the last valid category (406):
• Freshman: 147/406 = 36.2%
• Sophomore: 96/406 = 23.6%
• Junior: 98/406 = 24.1%
• Senior: 65/406 = 16.0%
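• The Cumulative Percent column is the total percentage of the sample that
has been accounted for up to that row; it can be computed by adding the
numbers in the Valid Percent column up to and including the current row:
• Freshman: 36.2% (there are no rows before this one, so the first
cumulative percent is identical to the first valid percent)
• Sophomore: 36.2 + 23.6 = 59.8%
• Junior: 36.2 + 23.6 + 24.1 = 83.9%
• Senior: 36.2 + 23.6 + 24.1 + 16.0 = 99.9%, which is 100% apart from rounding
• Note: Adding rounded values gives slightly different results from the
output table, which is based on the unrounded counts: 243/406 = 59.9%
for Sophomore and 341/406 = 84.0% for Junior.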
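The four columns of the frequency table can be reproduced from the counts with a short Python sketch. The counts are the ones given above for variable Rank; the dictionary keys and the table layout are illustrative choices, not SPSS output.

```python
# Counts for variable Rank from the example, plus the missing responses.
counts = {"Freshman": 147, "Sophomore": 96, "Junior": 98, "Senior": 65}
missing = 29

valid_total = sum(counts.values())   # nonmissing responses
grand_total = valid_total + missing  # all observations

running = 0  # running count used for the cumulative percent
table = []
for label, n in counts.items():
    running += n
    table.append({
        "category": label,
        "frequency": n,
        "percent": round(100 * n / grand_total, 1),
        "valid_percent": round(100 * n / valid_total, 1),
        "cumulative_percent": round(100 * running / valid_total, 1),
    })

for row in table:
    print(row)
```

Because the cumulative percent here is computed from the running count rather than from rounded valid percents, it gives 59.9 for Sophomore and 84.0 for Junior, which can differ by 0.1 from a sum of rounded values.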
The bar chart appears after the tables.
Here, we can see that:
• Freshmen comprised the largest group
• There were approximately equal numbers of sophomores and juniors
• Seniors were the smallest group
What’s More
Activity 2
Directions. Given the same data from Activity 1 (from a study on the core-subject
academic performance of students in different curricular programs of Tayug NHS),
run frequency tables in SPSS on these variables: program and final grades.
Submit your Data Editor and Output View files with your chosen
charts (bar, pie, or histogram) and your interpretation to your research
teacher.
Based on what you have learned, what is the importance of descriptive
statistics and frequency tables in your research study? Write your main learning
insights in your notebook.
__________________________________________________________________________
________________________________________________________________________________
***For additional learning, you can watch the SPSS video "Descriptive Statistics
and Frequencies in SPSS – SPSS for Beginners" at this link:
https://www.youtube.com/watch?v=bapuGcjwiLQ
Summative Assessment
Written Work (50%) I. Multiple Choice. Directions. Write the capital letter of the
correct answer on your answer sheet.
___1. What statistical measure is being answered by the question “What is the
"center" of the data?”?
A. Mean C. Skewness
B. Variance D. Standard deviation
___2. In SPSS, frequencies procedure can compute the following statistics for one
or more continuous variables except __________.
A. Standard deviation
B. Correlation
C. Variance
D. Range
___3. In reference to the SPSS window above, what does the “Statistics” button
contain or display when clicked?
A. frequency tables will be printed
B. contains various graphical options
C. contains various descriptive statistics
D. contains options for how to sort and organize the table output
___4. Which of the following is NOT displayed in the format window of frequencies
procedure?
A. Multiple Variable options C. Order options
B. Organize output options D. Charts
___5.
The figure above displays what type of window in the Frequencies procedure?
A. Charts window C. Statistics window
B. Format window D. Style window
___6. In making charts, which of the following is the appropriate option for
continuous variables?
A. Bar chart C. Histograms
B. Pie chart D. none
___8. The descriptive procedure in SPSS can compute __________ as new variables
in the dataset.
A. differences C. standard deviations
B. correlations D. z scores
___9. In the descriptive window, you can print the variables in the same order that
they are specified when you choose __________.
A. Variable list C. Ascending Means
B. Alphabetically D. Descending means
___11. Which of the following is the correct entry in the syntax window for
running descriptive statistics for the mean, standard deviation, minimum, and
maximum of the variables?
A. DESCRIPTIVES VARIABLES=/STATISTICS=MEAN STDDEV MIN MAX.
B. DESCRIPTIVES VARIABLES=/FREQ=MEAN STDDEV MIN MAX.
C. FREQUENCY VARIABLES=/STATISTICS=MEAN STDDEV MIN MAX.
D. FREQUENCY VARIABLES=/DESC=MEAN STDDEV MIN MAX.
___12.
Based on the table above, which of the following statements is correct?
A. The averages of English and Math scores were very close
B. Writing has the lowest standard deviation of scores
C. Math has the lowest standard deviation of scores
D. Writing has the highest average
___13. Which SPSS procedure is useful when you want to summarize and
compare differences in descriptive statistics across one or more factors, or
categorical variables?
A. Frequency procedure C. Compare Means procedure
B. Descriptive procedure D. Correlation procedure
___14. In comparing means, these are the categorical variable(s) that will be used
to subset the dependent variables.
A. Dependent list C. Variable list
B. Independent list D. Data list
___15. Which of the following charts displays the categories on the graph's x-axis,
and either the frequencies or the percentages on the y-axis?
A. Pie chart C. Histogram
B. Bar chart D. Scatter plot
___16. In frequency procedure format window, which of the following order options
should you click to arrange the rows of the frequency table in decreasing order
with respect to the category values?
A. ascending values C. ascending counts
B. descending values D. descending counts
___17. In frequencies procedure format window, the frequency tables for all of the
variables will appear first, and all of the graphs for the variables will appear
after when __________ is selected.
A. Compare variables C. Suppress tables
B. Organize output by variables D. Chart type
___18. In the output view of the frequency table, which of the following column
displays the percentage of observations in the category out of the total number
of nonmissing responses?
A. percent C. valid percent
B. frequency D. cumulative percent
___19.
In the frequency table above the value 84.0 in the last column indicates ______.
A. cumulative percentage of freshman, sophomore and junior
B. cumulative percentage of all grade level
C. frequency of freshman and sophomore
D. percentage of junior and senior
___20. To create a frequency table, which of the following should be clicked
sequentially in the Data View of SPSS?
A. Analyze > Descriptive Statistics > Frequencies
B. Analyze > Descriptive Statistics > Descriptives
C. Analyze > Descriptive Statistics > Crosstabs
D. Analyze > Compare Means > Means
Congratulations on finishing Module 3 for Quarter 3! Make sure that you have
accomplished all the activities. You can always ask your teacher if there are
parts that you find hard to understand.
Answer Key
What I Know
1. A 2. C 3. D 4. B 5. C
6. C 7. D 8. C 9. B 10. A
11. B 12. D 13. A 14. C 15. B
16. C 17. A 18. B 19. B 20. A
Activity 1
Answers may vary. Check depending on the given criteria.
Activity 2
Answers may vary. Check depending on the given criteria.
Summative Test (Written Task)
1. A 2. B 3. C 4. D 5. C
6. C 7. A 8. D 9. A 10. B
11. A 12. B 13. C 14. B 15. B
16. B 17. A 18. C 19. A 20. A