MMW - Midterm - Modules - DATA MANAGEMENT
DATA
MANAGEMENT
LEARNING OBJECTIVES
LESSON 01
The Elements, Components and Role of Statistics
LET’S PROCESS
Statistics is defined as a mathematical body of science that pertains to the collection,
organization and analysis, interpretation or explanation, and presentation of data, in such a
way that meaningful conclusions can be drawn from them. It is a crucial process behind how
we make discoveries in science, make decisions based on data, and make predictions. Put more
simply, statistics are numbers with a context or underlying meaning.
We will refer to information collected from people or objects as data. Strictly speaking, data is
the plural form and a single piece of information is called a datum, although data is now
commonly used to represent one or more pieces of information. Data are defined as a collection
of facts, such as numbers, words, measurements, observations or just descriptions of things.
Data can be either qualitative or quantitative.
Moreover, data are individual pieces of factual information recorded and used for the
purpose of analysis. It is the raw information from which statistics are created. Statistics are
the results of data analysis - its interpretation and presentation. These types of statistics are
referred to as 'statistical data'.
Why do we care about data? We collect data to help us learn about our world. We start
off with some questions about something we wish to learn about, and we find appropriate data
to help us answer these questions.
• Discrete data can only take certain values (like whole numbers)
• Continuous data can take any value (within a range)
Source: www.mathsisfun.com
Qualitative:
Quantitative:
• Discrete:
• He has 4 legs
• He has 2 brothers
Continuous:
• He weighs 25.5 kg
• He is 565 mm tall
1. FORMULATING QUESTIONS: First, state some questions or problems that you would
like to address by collecting relevant data.
2. COLLECTING DATA: Second, specify effective ways of collecting data that are useful in
answering the questions of interest.
3. ORGANIZING & SUMMARIZING: Next, organize and summarize the collected data to
learn about its general features.
4. MAKING CONCLUSIONS: Last, use the data to make conclusions. (It turns out that
probability or chance plays an important role in decision-making.)
Any statistical study reported in the media will have these four components. At the
beginning, there will be some questions that motivated the researcher to study a problem. If
there were no questions, then there would be no reason to proceed further into a statistical
study. Second, the researcher will collect data that he or she believes will be useful in
answering the question. You will see that data can be collected or found from many
sources. Next, the researcher organizes the data in some useful way and makes graphs and/or
calculations that are helpful in answering the main questions. Finally, the researcher has to
use the graphs and calculations to address the questions of interest. It is possible that the
data are insufficient or inconclusive for answering the questions, in which case a new
statistical study may be undertaken.
For example, you want to work on the completion rate of QSU among its different
programs for the past five years. You can work on it observing the above-mentioned
components.
Formulating Questions
You can get started by looking at some of the completion rate data. Most likely, those
who enrolled in college had their own ambitions and dreams and are determined to finish
college. Even so, some fail to finish or pursue their careers, and some end up dropping out of
or discontinuing their course or program. Unfortunately, the number of those who graduate
does not match the number who enrolled in the very first semester of the program.
Basically, you can ask,
1. How many from those who enrolled in each program were able to graduate?
2. How many of these students finish the program or their course within the prescribed
time?
3. Who are those people who didn’t make it till graduation?
4. What might have contributed to their non-completion of the program?
5. Which program has the highest or the lowest completion rate?
Collecting Data
In order to find answers to your queries, you need to identify where to find data.
Possibilities include but are not limited to:
The possible sources of data listed above apply only to the given topic. There are, however,
other sources of data suitable for whatever topic you intend to study. Other ways to collect
data include direct observation, interviews and document analysis. Take note that when you
are collecting data you are simply doing a survey.
Organizing & Summarizing
Once you have all the data, you can organize them by extracting information following
each of the questions you have formulated. For example:
Drawing Conclusions
With the available data you organized and summarized, you can now work on your
conclusions by answering each of the questions you formulated.
An analysis of the results of your study can help you trace the roots or causes of certain
completion rate issues and concerns which can be used as a basis in reviewing and enhancing
each program of QSU.
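The organizing and summarizing step for the completion-rate example can be sketched in a few lines of Python. This is a minimal illustration only: the enrollment and graduation counts are hypothetical, and the program names other than BSCS are assumed for the sake of the example.

```python
# Hypothetical first-semester enrollment and graduation counts per program.
# These figures are illustrative only; they are not actual QSU data.
enrolled = {"BSCS": 120, "BSIT": 150, "BSEd": 100}
graduated = {"BSCS": 78, "BSIT": 105, "BSEd": 82}


def completion_rate(graduates: int, enrollees: int) -> float:
    """Return the completion rate as a percentage of enrollees."""
    return 100.0 * graduates / enrollees


# Question 1: how many of those who enrolled were able to graduate?
rates = {
    program: completion_rate(graduated[program], enrolled[program])
    for program in enrolled
}

# Question 5: which program has the highest or the lowest completion rate?
highest = max(rates, key=rates.get)
lowest = min(rates, key=rates.get)

for program, rate in sorted(rates.items()):
    print(f"{program}: {rate:.1f}%")
print("Highest:", highest, "| Lowest:", lowest)
```

Each computed value answers one of the formulated questions, which is the link between the "organizing and summarizing" and "making conclusions" components.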
In the example you worked on, you have collected different types of information from
each graduate. The object that you are collecting information from is called the observational
unit. In this case, the observational unit is the graduate. The different types of information
you collect for each graduate are called variables. Here, some variables are the names of the
programs, the number of graduates for each program, the percentage of completion rate within
the prescribed time and the percentage of completion rate per program. There are two distinct
types of variables depending on how the variable is recorded. The name of the program is an
example of a categorical variable: a variable whose values can be grouped into different
categories. The percentage of completion rate from each program is a quantitative variable:
a variable whose values are numerical and refer to the quantity or size of something.
For our second example, suppose we record the current status of learning readiness
under the new normal setting of the Bachelor of Science in Computer Science students.
Here the student would be the observational unit. Learning readiness and the addresses of
students would be categorical variables, and the number of those having gadgets and
connectivity and those who have none would be quantitative variables.
Sometimes it can be difficult to tell if a collected “number” is a categorical or a
quantitative variable. Nevertheless, when we collect data, it is important to recognize if a given
data value represents a categorical or quantitative variable. Our exploration of data will depend
on its type. The way we explore categorical data will be fundamentally different from our
treatment of quantitative data.
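The difference in treatment can be illustrated with a short Python sketch. The records, the field names (`program`, `section_code`, `units_enrolled`) and all values are hypothetical; the point is only that categories are counted while quantities can be averaged, even when a category happens to be stored as a number.

```python
from collections import Counter
from statistics import mean

# Hypothetical records: one dictionary per observational unit (a student).
students = [
    {"program": "BSCS", "section_code": 1, "units_enrolled": 21},
    {"program": "BSCS", "section_code": 2, "units_enrolled": 18},
    {"program": "BSIT", "section_code": 1, "units_enrolled": 24},
    {"program": "BSIT", "section_code": 2, "units_enrolled": 21},
]

# Categorical variable: summarize by counting how many fall in each category.
program_counts = Counter(s["program"] for s in students)

# Quantitative variable: arithmetic such as averaging is meaningful.
average_units = mean(s["units_enrolled"] for s in students)

# A "number" that is really a label: section codes are categorical,
# so they are counted, not averaged.
section_counts = Counter(s["section_code"] for s in students)

print(program_counts, average_units, section_counts)
```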
Statistics represent a common method of presenting information helping us to
understand what the data are telling us. Statistical knowledge helps you use the proper
methods to collect the data, employ the correct analyses, and effectively present the results.
Basically, it helps us understand the world a little bit better through numbers and other
quantitative information. Below are but some examples of the importance and uses of statistics
in our daily lives and the events happening around us.
1. Statistics teaches people to use a limited sample to make intelligent and accurate
conclusions about a greater population. The goal of virtually all quantitative research
studies is to identify and describe relationships among constructs. The use of tables,
graphs, and charts plays a vital role in presenting the data being used to draw these
conclusions.
2. Statistics is one of the most important disciplines to provide tools and methods to find
structure in and to give deeper insight into data, and the most important discipline to
analyze and quantify uncertainty. Data are collected in a very systematic manner and
conclusions are drawn based on the data.
SELF–CHECK!
Now that you have been introduced to the elements, components and importance of
statistics, let us try to gauge how far you have gone in understanding the
discussions. Please write your answers in your big notebook specific for the course.
Complete the following statements below by filling the blanks with the appropriate
word/s.
As you walk, or in the car or at home, look around and ask yourself questions about the
world around you. Then write down 5 of those questions then identify the qualitative and
quantitative aspects of each.
Examples:
1. How many trees are there?
2. How many houses are there?
3. How many are busy working? Bystanders?
4. Which stores/outlets are selling the most?
Use the matrix below in doing your activity. Referring to questions above you
should be able to extract information from the given observable unit.
1. Statistics
2. collection,
3. organization
4. explanation
5. data
6. Data
7. Numbers
8. observations
9. qualitative
10. quantitative
11. descriptive
12. numerical
13. Organizing and summarizing
14. observational unit
LESSON 02
The Research Process
LET’S PROCESS
As cited by Calderon and Gonzales (1993), research in general is a systematic,
refined, careful, critical and disciplined inquiry varying in method directed to the
clarification and/or resolution of a problem.
RESEARCH…
• starts with a problem,
• collects data or facts,
• analyzes and interprets these critically and
• reaches a decision based on actual evidence.
• A research method is a process through which you move the reader from
questions, to data, to findings, and to conclusions. Findings become unconvincing if
the process is poor, and research cannot be replicated if the process is unclear.
The typical quantitative study involves a series of steps, one of which is statistical
analysis.
SELF–CHECK!
Can you recall some fundamental concepts of research? Consider doing the
following activity.
I. Supply the missing correct word/s in the paragraph below about research.
1. Ten female respondents answered to the survey with a YES or NO while 15 male
respondents answered either YES or NO.
2. Competencies of graduates do not match the industry's preferences.
3. What is the level of awareness of students with the QSU VMGO?
4. Descriptive statistics will be used in analyzing the data.
5. The respondents of the study will be the BSCS students.
iCONNECT
Go over scholarly articles on Google Scholar or www.tandfonline.com for any
flexible learning or IT-related topics. This will help you become more familiar with the
processes of research.
DEBUG YOUR SKILLS
Using the knowledge you gained about the research process, provide the details of
the following topics, following the steps of the research process.
LESSON 03
Statistical Terminology
LET’S PROCESS
We could spend the whole term or semester defining the different
terminologies you will come across as you study statistics. Instead, this lesson will
focus on the basic terms used in statistics. These are population, sample, parameter and
variables.
Population
➢ It is the entire set of individuals or objects that you are interested in studying.
➢ The group that you want to generalize your results to.
➢ Populations vary in size, but they are usually quite large, which is why it is usually
not feasible to collect data from the entire population.
Sample
➢ It is a subset of individuals selected from the population.
➢ In the best case, the sample will be representative of the population.
➢ The characteristics of the individuals in the sample will mirror those in the population.
Parameter
➢ It is a quantitative characteristic of the population that you’re interested in estimating
or testing (such as a population mean or proportion).
➢ These are generally unknown, and must be estimated from a sample.
➢ The sample estimate is called a statistic
Examples: retention rate, average level of self-efficacy, body image, preference and perception
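The relationship between population, sample, parameter and statistic can be sketched as follows. This is a hedged illustration: the "population" of self-efficacy scores is generated at random purely for demonstration, and the sizes (1,000 and 50) are arbitrary assumptions.

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the sketch gives the same result on every run

# Hypothetical population: self-efficacy scores for 1,000 students.
population = [random.randint(10, 50) for _ in range(1000)]

# Parameter: a characteristic of the whole population (usually unknown).
parameter = mean(population)

# Sample: a subset of individuals selected from the population.
sample = random.sample(population, 50)

# Statistic: the sample estimate of the parameter.
statistic = mean(sample)

print(f"Population mean (parameter): {parameter:.2f}")
print(f"Sample mean (statistic):     {statistic:.2f}")
```

In a real study only the sample is observed, so the statistic is all that is available to estimate the parameter.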
Variables
➢ A characteristic that takes on different values for different individuals in a sample.
Example:
▪ Retention (yes/no)
▪ Inquiry about QSU programs (yes/no)
▪ Self-efficacy (score on self-efficacy questionnaire)
▪ Body image (score on body image questionnaire)
▪ Flexible learning (percentage of preferred mode of teaching delivery in a survey)
▪ School Opening (number of favorable for school opening in a survey)
➢ Any characteristics, number, or quantity that can be measured or counted. It is also
called a data item.
Examples: age, sex, business income and expenses, country of birth, capital
expenditure, class grades, eye color and vehicles.
➢ Its value may vary between data units in a population, and may change in value over
time.
3. CONFOUNDING VARIABLES
➢ represent unwanted sources of influence on the DV
➢ sometimes referred to as “nuisance” variables.
Example: Does a new curriculum improve body image?
Such things as heredity, family background, previous counseling experiences,
etc. can also impact the DV.
iCONNECT
To further your understanding of dependent and independent variables, you
may go over the article by Market Research Guy included here as Appendix A or, if you
are online, visit this site:
https://www.mymarketresearchmethods.com/dependent-independent-variables-whats-difference/.
SELF–CHECK!
Now that you’re through with the lesson, let us try checking your knowledge on
the following:
1. The four (4) basic terms used in statistics
2. It is the entire set of individuals or objects that you are interested in
studying.
3. It is a quantitative characteristic of the population that you’re interested in
estimating or testing (such as a population mean or proportion).
4. It is a subset of individuals selected from the population.
5. Any characteristics, number, or quantity that can be measured or counted. It
is also called a data item.
6. common variable Types
LET’S TRY THIS!
Activity 1 (Identifying IVs and DVs)
LESSON 04
Scales of Measurement
LET’S PROCESS
For any given variable that you are interested in, there may be a variety of
measurement scales that can be used. Variable measurement is the second factor
that influences the choice of statistical procedure. For example, annual income can be asked
as an open question or with fixed categories:
What is your annual income? _________
What is your annual income?
a. 10,000-20,000
b. 20,000-30,000
c. 30,000-40,000
d. 40,000-50,000
e. 50,000 or above
Scales of measurement can be nominal, ordinal, interval or ratio. In nominal scale,
observations fall into different categories or groups and differences among categories are
qualitative, not quantitative. Examples are gender, ethnicity, counseling method (cognitive vs.
humanistic), retention (retained vs. not retained).
On the other hand, class standing, letter grades (A,B,C,D,F) and Likert-scale survey
responses (SD, D, N, A, SA) are examples of ordinal scale. In this scale, categories can be rank
ordered in terms of amount or magnitude. Also, categories possess an inherent order, but the
amount of difference between categories is unknown.
In an interval scale, categories are ordered, but now the intervals for each category are
exactly the same size. That is, the distance between measurement points represents equal
magnitudes (e.g., the distance between point A and B is the same as the distance between B
and C). Examples of this scale include the Fahrenheit scale of measuring temperature,
the chronological scale of dates (1997 A.D.) and standard scores (z-scores).
Moreover, the ratio scale has the same properties as the interval scale, but with an additional
feature. The ratio scale has an absolute 0 point which permits the use of ratios (e.g., A is “twice
as large” as B). Examples of this scale are number of children, weight, height, annual income,
etc.
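The difference between interval and ratio scales can be checked with a small calculation. The sketch below uses illustrative temperatures and weights (not from the module) and the standard Fahrenheit-to-Kelvin conversion; Kelvin is used because it has an absolute zero.

```python
def fahrenheit_to_kelvin(f: float) -> float:
    """Convert degrees Fahrenheit to kelvins (a scale with an absolute zero)."""
    return (f - 32.0) * 5.0 / 9.0 + 273.15


# Interval scale: differences are meaningful.
# 40 to 60 degrees F is the same step as 60 to 80 degrees F.
equal_steps = (60.0 - 40.0) == (80.0 - 60.0)

# ...but ratios are not: 80 F is not "twice as hot" as 40 F.
naive_ratio = 80.0 / 40.0
kelvin_ratio = fahrenheit_to_kelvin(80.0) / fahrenheit_to_kelvin(40.0)

# Ratio scale: weight has an absolute zero, so 50 kg really is twice 25 kg.
weight_ratio = 50.0 / 25.0

print(equal_steps, naive_ratio, round(kelvin_ratio, 3), weight_ratio)
```

The naive Fahrenheit ratio is 2, yet on a scale with a true zero the same two temperatures differ by only about 8 percent, which is why ratios are reserved for ratio-scale variables.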
There are different ways variables can be described according to the ways they can be
studied, measured and presented. In practice, it is not usually necessary to make such fine
distinctions between measurement scales; two distinctions, categorical and continuous, are
usually sufficient.
Level of Measurement
Source: abs.gov.au/websitedbs
Numeric variables have values that describe a measurable quantity as a number, like
'how many' or 'how much'. Therefore, numeric variables are quantitative variables. Numeric
variables may be further described as either continuous or discrete:
➢ A continuous variable is a numeric variable. Observations can take any value between a
certain set of real numbers. The value given to an observation for a continuous variable
can include values as small as the instrument of measurement allows. Continuous
variables are generally preferable because a wider range of statistical procedures can be
applied. Continuous variables yield values that fall on a numeric continuum, and can
(theoretically) take on an infinite number of values. Examples of continuous variables
include height, time, age, and temperature. Further examples include:
What is the level of measurement of:
▪ Temperature (°C)?
▪ Color?
▪ Income of professional basketball players?
▪ Degree of importance (1 = not important, 5 = very important)
Note: The data collected for a numeric variable are quantitative data.
Categorical variables have values that describe a 'quality' or 'characteristic' of a data
unit, like 'what type' or 'which category'. Categorical variables fall into mutually exclusive
(in one category or in another) and exhaustive (include all possible options)
categories. Therefore, categorical variables are qualitative variables and tend to be
represented by a non-numeric value. Categorical variables consist of separate, indivisible
categories
➢ An ordinal variable is a categorical variable. Observations can take a value that can be
logically ordered or ranked. The categories associated with ordinal variables can be
ranked higher or lower than another, but do not necessarily establish a numeric
difference between each category. Examples of ordinal categorical variables include
academic grades (e.g., A, B, C), clothing size (e.g., small, medium, large, extra large) and
attitudes (e.g., strongly agree, agree, disagree, strongly disagree).
➢ A nominal variable is a categorical variable. Observations can take a value that is not
able to be organized in a logical sequence. Examples of nominal categorical variables
include sex, business type, eye color, religion and brand.
Note: The data collected for a categorical variable are qualitative data.
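The practical difference between ordinal and nominal variables can be sketched in Python. The observations below are hypothetical; the clothing-size order comes from the example in the text.

```python
from collections import Counter

# Ordinal variable: categories have a logical order, so they can be ranked.
SIZE_ORDER = ["small", "medium", "large", "extra large"]
rank = {label: position for position, label in enumerate(SIZE_ORDER)}

sizes = ["large", "small", "extra large", "medium", "small"]  # hypothetical data
sorted_sizes = sorted(sizes, key=rank.__getitem__)

# The ranks give an order only; they do not say how much bigger
# "large" is than "medium".

# Nominal variable: no logical order exists, so counting is the natural summary.
eye_colors = ["brown", "blue", "brown", "green"]  # hypothetical data
color_counts = Counter(eye_colors)

print(sorted_sizes)
print(color_counts.most_common())
```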
SELF–CHECK!
Let’s review some basic aspects of the lesson.
1. List the four (4) scales of measurement.
2. What are the ways in which variables can be presented?
1. As part of the requirements for admission to the university, students need to
take the English Proficiency Test (EPT), whose scores can range from 75 to 150 with
a population mean of 500 and a standard deviation of 100.
2. Children in an elementary school were evaluated for their reading readiness
through the PHIL-IRI.
3. During an interview, the survivors of an earthquake are asked to state “yes” or
“no” as to whether they experienced Post-Traumatic Stress Disorder (PTSD). The
number “0” is assigned to “no” and the number “1” is assigned to “yes”.
4. A certain university wants to know the dormitory preference of the students. The
administrators assigned a rank to each dorm based on applications received.
5. A researcher wants to use the temperatures of customers recorded in the logbook
of a supermart to compare the temperatures of older customers and younger
customers.
MODULE 06
WORKING WITH
DATA ON SPSS
LEARNING OBJECTIVES
We suggest that you first obtain the SPSS license code before you begin downloading
SPSS. Obtain the SPSS license code from this link:
http://ezp.waldenulibrary.org/limited/spsslicense.html
You will need to enter your Walden user name and password if you are not already logged
into the Library or Blackboard. Simply copy and paste the code into a Word document so that
you have it available when prompted to enter it at the end of the installation sequence. You
can always enter the code later; however, having it on hand to enter during the installation
is much easier. The SPSS Statistics installation links for Windows are given below. Access
the appropriate installation link depending on your operating system:
SPSS v23 Windows 32-bit install:
http://mym.cdn.laureatemedia.com/2dett4d/software/IBM/SPSS/v23/Windows/32-bit/SPSSSC_32-BIT_23.0_MW_ML.zip
SPSS v23 Windows 64-bit install:
http://mym.cdn.laureatemedia.com/2dett4d/software/IBM/SPSS/v23/Windows/64-bit/SPSSSC_64-BIT_23.0_MW_ML.zip
Not sure whether you have a 32-bit or 64-bit Windows operating system? Access
Microsoft’s page for further clarification. This installation requires at least 1GB of free space
on your computer. Because of the large size of the installation file, it is recommended that you
use a DSL or better internet connection. Even with a strong internet connection, the
installation may still take 30 minutes or longer. While the tool is installing, you may
continue to work within other applications on your computer.
Does your course require SPSS AMOS? If yes, access the install link and license code.
If you are using a Mac operating system, you may follow the instructions which can be found
on page 2 of Appendix B.
LET’S PROCESS
What is SPSS?
Statistical Package for the Social Sciences (SPSS) is used by various kinds of
researchers for complex statistical data analysis. The software package was created for
the management and statistical analysis of social science data. It was originally
launched in 1968 by SPSS Inc., and was later acquired by IBM in 2009. Most users
refer to it as SPSS though it is officially named IBM SPSS Statistics. As the world
standard for social science data analysis, SPSS is widely coveted due to its straightforward,
English-like command language and impressively thorough user manual.
Basically, SPSS first stores and organizes the provided data, then it compiles the
data set to produce suitable output. SPSS is designed in such a way that it can handle
a large set of variable data formats.
SPSS is used by market researchers, health researchers, survey companies,
government entities, education researchers, marketing organizations, data miners, and
many more for the processing and analyzing of survey data. Most top research agencies
use SPSS to analyze survey data and mine text data so that they can get the most out
of their research projects.
There are a handful of statistical methods that SPSS can provide help with including:
➢ Descriptive statistics, including methodologies such as frequencies, cross tabulation,
and descriptive ratio statistics.
➢ Bivariate statistics, including methodologies such as analysis of variance (ANOVA),
means, correlation, and nonparametric tests.
➢ Numerical outcome prediction such as linear regression.
➢ Prediction for identifying groups, including methodologies such as cluster
analysis and factor analysis.
Why use SPSS?
SPSS is an extremely powerful tool for manipulating and deciphering survey
data. Exporting survey data to SPSS’s proprietary .SAV format makes the process of pulling,
manipulating, and analyzing data clean and easy. By so doing, SPSS will automatically set up
and import designated variable names, variable types, titles, and value labels, meaning that
minimal legwork is required from researchers. Once survey data is exported to SPSS, the
opportunities for statistical analysis are practically endless. In short, when you need a flexible,
customizable way to get super granular on even the most complex data sets, use SPSS. This
gives you more time to do what you do best and identify trends, develop predictive models, and
draw informed conclusions.
SELF–CHECK!
1. SPSS stands for ________________ ____________________ _______________
__________.
2. It was originally launched in __________ and acquired by _________ in 2009.
3. SPSS is used by (give at least three users).
4. SPSS is a powerful tool for _________________ and ______________ of survey data.
5. The core functions of SPSS include:
MCGREGOR         Male     33   58000   Non-smoker   106.12    78.25
KUMAR            Male     38   47000   Smoker        88.11   102.45
ALLINSON-HENRY   Female   51   55000   Non-smoker    83.62    63.82
OLDER            Male     44   28000   Non-smoker    72.31    77.50
Save the data as pilotgroup.sav.
5. Now select the patients whose hormone levels were greater after the treatment than
they were before.
We need to recode some of the data in medicaltrialX.sav (or fixed.sav). The information in the
smoker column is coded as text (Y and N). It would be better to code it as numeric data (e.g. 1
and 0), and to use labels to indicate the meaning of these numbers.
1. Use Recode to convert the smoker information into a new variable called smoker1, so
that 'Y' becomes 1 and 'N' becomes 0. (Define the label for this new variable as
"Smoker or non-smoker")
2. Create value labels for the smoker1 variable so that 1 is displayed as 'Smoker', and 0
is displayed as 'Non-smoker'.
3. Check that the recode has worked properly. If it has, then delete the old variable
smoker from your data sheet.
4. Using Recode, create a new variable incband (with label "Income band") to categorise
the household income: up to $25,000 as band 1, between $25,000 and $40,000 as
band 2, and more than $40,000 as band 3.
5. Save the data file if you are happy with the results of this exercise.
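The logic of the recode steps above can be sketched in Python to make the rules explicit. This is not SPSS syntax; the function and dictionary names are hypothetical, and the handling of incomes of exactly $25,000 and $40,000 is an assumption read from the wording "up to" and "more than".

```python
def recode_smoker(code: str) -> int:
    """Recode the text values 'Y' and 'N' into the numbers 1 and 0."""
    mapping = {"Y": 1, "N": 0}
    return mapping[code]


# Value labels: how the numeric codes should be displayed.
SMOKER_LABELS = {1: "Smoker", 0: "Non-smoker"}


def income_band(income: float) -> int:
    """Assign an income band: 1 up to 25,000; 2 up to 40,000; 3 above that."""
    if income <= 25000:
        return 1
    if income <= 40000:
        return 2
    return 3


# Checking that the recode worked before discarding the old variable.
for code in ("Y", "N"):
    print(code, "->", recode_smoker(code), "->", SMOKER_LABELS[recode_smoker(code)])

for income in (25000, 28000, 40000, 58000):
    print(income, "-> band", income_band(income))
```

Checking a few boundary values, as in the final loop, mirrors step 3 of the exercise: verify the recode before deleting the original variable.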
Objects from the SPSS Viewer can be copied and pasted into other applications (e.g.
Microsoft Word or Excel). If you have time, open a new document in Microsoft Word. Copy
and paste one of the histograms which you have just produced into the Word document.
Save it with the title histogram.doc. Do not save the SPSS Viewer output.
Exercise 13: T Tests
Use the medicaltrialX data you have been working on - or you may wish to load the data file
medicaltrialX-part2.sav.
1. Perform a T Test to show whether there is a significant difference in the initial hormone
levels (hbefore) between men and women. Is there?
2. Perform a T Test across all the cases, to decide whether there is a significant difference
in the mean hormone concentrations before and after the treatment.
3. Now use the Select cases function to select only the women in the study, and repeat
step (2). What do you find?
If you have time, repeat the test, selecting men instead of women.
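For the before-and-after comparison in step (2), the quantity behind a paired-samples T Test can be computed by hand. The sketch below uses hypothetical hormone values, not the medicaltrialX data, and computes only the t statistic and degrees of freedom; the significance value that SPSS reports additionally requires the t distribution.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical hormone concentrations for six patients (illustrative only).
before = [106.1, 88.1, 83.6, 72.3, 95.0, 79.4]
after = [78.3, 102.5, 63.8, 77.5, 80.2, 70.1]

# A paired test works on the difference within each patient.
differences = [a - b for a, b in zip(after, before)]
n = len(differences)

# t = mean difference divided by its standard error.
standard_error = stdev(differences) / sqrt(n)
t_statistic = mean(differences) / standard_error
degrees_of_freedom = n - 1

print(f"mean difference = {mean(differences):.3f}")
print(f"t = {t_statistic:.3f} with {degrees_of_freedom} degrees of freedom")
```

The comparison between men and women in step (1) is a different design (independent samples) and uses a different standard error, so this sketch does not apply to it.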
MODULE 07
DESCRIPTIVE
STATISTICS
LEARNING OBJECTIVES
LESSON 01
Measures of Central Tendency
LET’S PROCESS
The study of statistics can be categorized into two main branches. These
branches are descriptive statistics and inferential statistics.
To collect data for any statistical study, a population must first be defined.
'Population' indicates the group that has been designated as the source of the data.
The data is information collected from the population. A population does not necessarily
refer to people. A population could be a group of people, measurements of rainfall
in a particular area or a batch of batteries.
Descriptive statistics give information that describes the data in some manner.
For example, suppose a pet shop sells cats, dogs, birds and fish. If 100 pets are sold,
and 40 out of the 100 were dogs, then one description of the data on the pets sold would
be that 40% were dogs.
This same pet shop may conduct a study on the number of fish sold each day for
one month and determine that an average of 10 fish were sold each day. The average is
an example of descriptive statistics.
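The two pet shop descriptions can be reproduced with a few lines of Python. Only the figure of 40 dogs out of 100 pets comes from the text; the other counts and the daily fish sales (shown for one week rather than a full month) are hypothetical values chosen for illustration.

```python
from statistics import mean

# 40 dogs out of 100 pets is from the text; the other counts are hypothetical.
pets_sold = {"dogs": 40, "cats": 30, "birds": 10, "fish": 20}
total_sold = sum(pets_sold.values())

# Describe the data as percentages of all pets sold.
percent_sold = {pet: 100.0 * count / total_sold for pet, count in pets_sold.items()}

# Hypothetical number of fish sold on each of seven days.
daily_fish_sales = [8, 12, 10, 9, 11, 10, 10]
average_fish_per_day = mean(daily_fish_sales)

print(percent_sold)
print("Average fish sold per day:", average_fish_per_day)
```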
Some other measurements in descriptive statistics answer questions such as
'How widely dispersed is this data?', 'Are there a lot of different values?' or 'Are many of
the values the same?', 'What value is in the middle of this data?', 'Where does a
particular data value stand with respect to the other values in the data set?'
A graphical representation of data is another method of descriptive statistics.
Examples of this visual representation are histograms, bar graphs and pie graphs, to
name a few. Using these methods, the data is described by compiling it into a graph,
table or other visual representation.
This provides a quick method to make comparisons between different data sets
and to spot the smallest and largest values and trends or changes over a period of time.
If the pet shop owner wanted to know what type of pet was purchased most in the
summer, a graph might be a good medium to compare the number of each type of pet
sold and the months of the year.
Three measures of central tendency
i. Mean
➢ The mean is simply the arithmetic average.
➢ The mean would be the amount that each individual would get if
we took the total and divided it up equally among everyone in the
sample
➢ Alternatively, the mean can be viewed as the balancing point in
the distribution of scores (i.e., the distances for the scores above
and below the mean cancel out)
ii. Median
➢ The median is the score that splits the distribution exactly in half
➢ 50% of the scores fall above the median and 50% fall below
➢ The median is also known as the 50th percentile, because it is the
score at which 50% of the people fall below
A desirable characteristic of the median is that it is not affected by extreme
scores. Thus, the median is not distorted by skewed distributions.
Example: Sample 1: 18, 19, 20, 22, 24
Sample 2: 18, 19, 20, 22, 47
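The effect of the extreme score in Sample 2 can be verified directly. The sketch below uses Python's standard `statistics` module on the two samples given above.

```python
from statistics import mean, median

sample_1 = [18, 19, 20, 22, 24]
sample_2 = [18, 19, 20, 22, 47]  # same as Sample 1 except for one extreme score

mean_1, median_1 = mean(sample_1), median(sample_1)
mean_2, median_2 = mean(sample_2), median(sample_2)

print(f"Sample 1: mean = {mean_1}, median = {median_1}")
print(f"Sample 2: mean = {mean_2}, median = {median_2}")
```

The mean rises from 20.6 to 25.2 when 24 is replaced by 47, while the median stays at 20 in both samples.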
iii. Mode
➢ The mode is simply the most common score.
➢ There is no formula for the mode
➢ When using a frequency distribution, the mode is simply the score
(or interval) that has the highest frequency value
➢ When using a histogram, the mode is the score (or interval) that
corresponds to the tallest bar
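Finding the mode from a frequency distribution can be sketched as follows. The scores are hypothetical; `multimode` from the standard `statistics` module is used because a distribution may have more than one most common score.

```python
from collections import Counter
from statistics import multimode

scores = [3, 5, 5, 6, 7, 7, 7, 8, 9]  # hypothetical scores

# Frequency distribution: how many times each score occurs.
frequency = Counter(scores)

# The mode is the score with the highest frequency.
modes = multimode(scores)

# A distribution can have two modes (bimodal).
two_peaks = multimode([1, 1, 2, 3, 3])

print(dict(frequency))
print("Mode(s):", modes)
print("Bimodal example:", two_peaks)
```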
Unfortunately, no single measure of central tendency works best in all circumstances.
Nor will they necessarily give you the same answer.
LESSON 02
Measures of Variability
LET’S PROCESS
The fluctuation of scores about a central tendency is called “variability.”
We can use measures of variability to compare two sets of scores.
Although the means may be the same, the distribution may be different.
Measure of Variability
1. Range
➢ Range is the distance between two extreme scores.
➢ It informs us about the dispersion of our distribution.
➢ The larger the range the larger the dispersion from the mean value.
➢ Although the mean of the scores of two distributions can be identical their
ranges may be different.
Drawbacks to the Range
Good preliminary measure, but one single extreme value can influence the range
significantly. The calculation of the range is derived from the highest and lowest values and
doesnʼt tell us anything about the variability of the different values.
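Both points about the range, that equal means can hide different ranges and that one extreme value can dominate it, can be seen in a short sketch. All scores below are hypothetical.

```python
from statistics import mean


def value_range(scores: list) -> float:
    """Return the distance between the highest and lowest scores."""
    return max(scores) - min(scores)


set_a = [48, 49, 50, 51, 52]  # hypothetical scores, tightly clustered
set_b = [10, 30, 50, 70, 90]  # hypothetical scores, widely spread

same_mean = mean(set_a) == mean(set_b)
range_a = value_range(set_a)
range_b = value_range(set_b)

# Drawback: a single extreme value changes the range dramatically.
range_a_with_outlier = value_range(set_a + [100])

print(same_mean, range_a, range_b, range_a_with_outlier)
```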
2. Standard Deviation
➢ Defined as the variability of the scores around the mean
➢ Each score in a distribution varies from the mean by a greater or lesser
amount, except when the score is the same as the mean.
➢ Deviations from the mean can be noted as either positive or negative deviations
from the mean.
➢ The average of these deviations would equal zero.
3. Variance
➢ The variance and the closely-related standard deviation are measures of how
spread out a distribution is.
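The statements about deviations, variance and standard deviation can be checked numerically. The sketch below reuses Sample 1 from the previous lesson; it shows both the population form (dividing by n) and the sample form (dividing by n - 1), since the text does not say which one is intended.

```python
from statistics import mean, pstdev, pvariance, stdev, variance

scores = [18, 19, 20, 22, 24]  # Sample 1 from the lesson on central tendency
center = mean(scores)

# Positive and negative deviations from the mean cancel out.
deviations = [score - center for score in scores]
sum_of_deviations = sum(deviations)

# Variance: the average squared deviation. Standard deviation: its square root.
population_variance = pvariance(scores)  # divides by n
sample_variance = variance(scores)       # divides by n - 1
population_sd = pstdev(scores)
sample_sd = stdev(scores)

print(f"sum of deviations = {sum_of_deviations:.10f}")
print(f"variance: population = {population_variance:.2f}, sample = {sample_variance:.2f}")
print(f"standard deviation: population = {population_sd:.3f}, sample = {sample_sd:.3f}")
```

Because the raw deviations always sum to zero, they are squared before averaging; taking the square root afterward returns the measure to the original units of the scores.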
LESSON 03
Frequency Distribution
LET’S PROCESS
After collecting data, researchers are faced with pages of unorganized numbers,
stacks of survey responses, etc. The goal of descriptive statistics is to aggregate the
individual scores (data) in a way that can be readily summarized. A frequency
distribution table can be used to get a “picture” of how scores were distributed.
Frequency distribution showing the ages of students who took the online course.
Frequency table of Student responses when asked whether or not they would
recommend the online course to others.
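A frequency distribution table of the kind described above can be built with a few lines of Python. The ages below are hypothetical and stand in for the students who took the online course.

```python
from collections import Counter

ages = [18, 19, 19, 20, 20, 20, 21, 21, 22, 19]  # hypothetical student ages

frequency = Counter(ages)
total = len(ages)

# One row per distinct age: (age, frequency, percentage of all students).
table = [
    (age, count, 100.0 * count / total)
    for age, count in sorted(frequency.items())
]

print(f"{'Age':>5} {'f':>5} {'%':>7}")
for age, count, percent in table:
    print(f"{age:>5} {count:>5} {percent:>7.1f}")
print(f"{'Total':>5} {total:>5} {100.0:>7.1f}")
```

The same approach works for a categorical variable such as yes/no responses, with the response categories taking the place of the ages.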