Experimental Lesson3 Statistics


Experimental psychology
Kenneth Charles d. Bermejo, rpm, lpt
Week three
Lesson 3
What are statistics?
Statistics is the science of conducting studies to
collect, organize, summarize, analyze, and draw
conclusions from data.
Why do we study statistics?
Without statistics, we would be unable to interpret the massive amounts of information
contained in data. Even small datasets contain hundreds, if not thousands, of
numbers, each representing a specific observation we made. Without a way to organize
these numbers into a more interpretable form, we would be lost, having wasted the time
and money of our participants, ourselves, and the communities we serve.
Suppose that an insurance company studies its records over the past
several years and determines that, on average, 3 out of every 100
automobiles the company insured were involved in accidents during a
1-year period. Although there is no way to predict the specific
automobiles that will be involved in an accident (random occurrence),
the company can adjust its rates accordingly, since the company
knows the general pattern over the long run. (That is, on average, 3%
of the insured automobiles will be involved in an accident each year.)
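The long-run reasoning in the insurance example can be sketched as simple arithmetic. The 3-in-100 rate comes from the example; the portfolio size of 20,000 cars is a hypothetical figure chosen only for illustration.

```python
from fractions import Fraction

# From the example: 3 accidents per 100 insured automobiles per year.
ACCIDENT_RATE = Fraction(3, 100)

def expected_accidents(insured_cars: int) -> Fraction:
    """Long-run average number of accidents per year for a given number of cars."""
    return ACCIDENT_RATE * insured_cars

# Hypothetical portfolio size (an assumption, not from the lesson).
print(expected_accidents(20_000))  # 600
```

The company cannot say which 600 cars will be involved, only that the yearly count will average out near this figure.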
What Is Descriptive Statistics?
• Descriptive statistics describes the characteristics of a data set. It is a
simple technique to describe, show, and summarize data in a
meaningful way. You choose a group you’re interested in,
record data about the group, and then use summary statistics and
graphs to describe the group’s properties. There is no uncertainty
involved because you’re just describing the people or items that you
actually measured. You’re not aiming to infer properties of a larger
population.
• Descriptive statistics involves taking a potentially sizeable number of
data points in the sample data and reducing them to a few
meaningful summary values and graphs. The process allows you to
obtain insights and visualize the data rather than simply poring
through sets of raw numbers. With descriptive statistics, you can
describe either an entire population or an individual sample.
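As a minimal sketch of reducing raw numbers to summary values, the snippet below uses Python's standard statistics module. The ten scores are hypothetical data invented for illustration.

```python
import statistics

# Hypothetical test scores for one group of ten participants.
scores = [72, 85, 90, 66, 85, 78, 94, 85, 70, 75]

summary = {
    "n": len(scores),
    "mean": statistics.mean(scores),
    "median": statistics.median(scores),
    "mode": statistics.mode(scores),
    "min": min(scores),
    "max": max(scores),
    # pstdev treats the scores as the whole group being described,
    # not as a sample from something larger.
    "stdev": statistics.pstdev(scores),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Nothing here is inferred about anyone outside the ten participants; the summary only describes the group that was measured.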
What Is Inferential Statistics?
• In inferential statistics, the focus is on making predictions about a
large group based on a representative sample of the
population. A random sample is drawn from the population
and used to describe and make inferences about that population. This
technique allows you to work with a small sample rather than the
whole population. Since inferential statistics makes predictions rather
than stating facts, the results are often expressed as probabilities.
• The accuracy of inferential statistics depends largely on the
accuracy of the sample data and how well it represents the larger population.
Representativeness is best achieved by obtaining a random sample. Results
that are based on non-random samples are usually treated with caution
because they cannot be reliably generalized. Random sampling, though not
always straightforward, is
extremely important for carrying out inferential techniques.
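The idea of estimating a population value from a random sample can be sketched as follows. The simulated population, the sample size of 100, and the use of a normal-approximation interval (mean plus or minus 1.96 standard errors) are all assumptions made for illustration, not methods stated in the lesson.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 simulated test scores.
population = [rng.gauss(75, 10) for _ in range(10_000)]

# Draw a random sample of 100 people without replacement.
sample = rng.sample(population, 100)

sample_mean = statistics.mean(sample)
standard_error = statistics.stdev(sample) / len(sample) ** 0.5

# Rough 95% interval using the normal approximation.
low = sample_mean - 1.96 * standard_error
high = sample_mean + 1.96 * standard_error

print(f"sample estimate: {sample_mean:.1f} (interval {low:.1f} to {high:.1f})")
print(f"population mean: {statistics.mean(population):.1f}")
```

The interval is a probability statement about the estimate, which is why inferential results are reported with uncertainty rather than as plain facts.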
Types of data and how to collect them
Types of variable
• The dependent variable is the variable that is being
measured or tested in an experiment. For example, in a
study looking at how tutoring impacts test scores, the
dependent variable would be the participants' test
scores, since that is what is being measured.
• This is different from the independent variable, which is the
variable the researcher manipulates and which stands on its own.
In the example above, the independent variable would
be tutoring. The independent variable (tutoring) doesn't
change based on other variables, but the dependent
variable (test scores) may.
Independent variable                        Dependent variable

• Variable being manipulated                • Variable being measured
• Doesn't change based on other variables   • May change based on other variables
• Stands on its own                         • Depends on other variables
Levels of independent variable
Qualitative and quantitative variables
QUALITATIVE VARIABLES
• Qualitative variables are those that express a
qualitative attribute such as hair color, eye color,
religion, favorite movie, gender, and so on. The values
of a qualitative variable do not imply a numerical
ordering. Values of the variable “religion” differ
qualitatively; no ordering of religions is implied.
Qualitative variables are sometimes referred to as
categorical variables.
QUANTITATIVE VARIABLES
• Quantitative variables are numerical and can be
ordered or ranked. For example, the variable age is
numerical, and people can be ranked in order according
to the value of their ages. Other examples of
quantitative variables are heights, weights, and body
temperatures.
• Quantitative variables can be further classified into two
groups: discrete and continuous.
DISCRETE AND CONTINUOUS VARIABLES
• Discrete variables assume values that can be
counted.
• Continuous variables can assume an infinite number
of values between any two specific values. They are
obtained by measuring. They often include fractions and
decimals.
CONTINUOUS VARIABLES
• Since continuous data must be measured, answers must be rounded
because of the limits of the measuring device. Usually, answers are rounded
to the nearest given unit. For example, heights might be rounded to the
nearest inch, weights to the nearest ounce, etc. Hence, a recorded height of
73 inches could mean any measure from 72.5 inches up to but not including
73.5 inches.
• Thus, the boundary of this measure is given as 72.5–73.5 inches. Boundaries
are written for convenience as 72.5–73.5 but are understood to mean all
values up to but not including 73.5. Actual data values of 73.5 would be
rounded to 74 and would be included in a class with boundaries of 73.5 up
to but not including 74.5, written as 73.5–74.5. As another example, if a
recorded weight is 86 pounds, the exact boundaries are 85.5 up to but not
including 86.5, written as 85.5–86.5 pounds. Table 1–1 helps to clarify this
concept. The boundaries of a continuous variable are given in one additional
decimal place and always end with the digit 5.
Table 1–1

Variable      Recorded value                Boundaries

Length        15 centimeters (cm)           14.5–15.5 cm

Temperature   86 degrees Fahrenheit (°F)    85.5–86.5 °F

Time          0.43 second (sec)             0.425–0.435 sec

Mass          1.6 grams (g)                 1.55–1.65 g
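The boundary rule (one additional decimal place, ending in the digit 5) can be sketched in code. The function name `boundaries` is hypothetical; values are passed as strings so that the number of recorded decimal places is preserved exactly.

```python
from decimal import Decimal

def boundaries(recorded: str) -> tuple[Decimal, Decimal]:
    """Return (lower, upper) boundaries for a recorded measurement.

    The upper boundary is understood as "up to but not including".
    """
    value = Decimal(recorded)
    # Position of the last recorded digit: 0 for "73", -2 for "0.43".
    exponent = value.as_tuple().exponent
    # Half of one unit in that last recorded place.
    half_unit = Decimal(5).scaleb(exponent - 1)
    return value - half_unit, value + half_unit

for recorded in ["73", "15", "86", "0.43", "1.6"]:
    low, high = boundaries(recorded)
    print(f"{recorded}: {low} up to but not including {high}")
```

Decimal arithmetic is used instead of floats so that results such as 0.425 are exact rather than subject to binary rounding.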


Measurement Scales
• The nominal level of measurement classifies data into mutually
exclusive (nonoverlapping) categories in which no order or ranking
can be imposed on the data.
• The ordinal level of measurement classifies data into categories
that can be ranked; however, precise differences between the ranks
do not exist.
• The interval level of measurement ranks data, and precise
differences between units of measure do exist; however, there is no
meaningful zero.
• The ratio level of measurement possesses all the characteristics
of interval measurement, and there exists a true zero. In addition,
true ratios exist when the same variable is measured on two different
members of the population.
• Before we can conduct a statistical analysis, we need to measure our
dependent variable. Exactly how the measurement is carried out depends
on the type of variable involved in the analysis. Different types are
measured differently. To measure the time taken to respond to a stimulus,
you might use a stopwatch. Stopwatches are of no use, of course, when
it comes to measuring someone's attitude towards a political candidate. A
rating scale is more appropriate in this case (with labels like “very
favorable,” “somewhat favorable,” etc.). For a dependent variable such as
“favorite color,” you can simply note the color word (like “red”) that the
subject offers. Although procedures for measurement differ in many ways,
they can be classified using a few fundamental categories. In a given
category, all of the procedures share some properties that are important
for you to know about. The categories are called “scale types,” or just
“scales,” and are described in this section.
Nominal scales
• When measuring using a nominal scale, one simply
names or categorizes responses. Gender, handedness,
favorite color, and religion are examples of variables
measured on a nominal scale. The essential point about
nominal scales is that they do not imply any ordering
among the responses. For example, when classifying
people according to their favorite color, there is no
sense in which green is placed “ahead of” blue.
Responses are merely categorized. Nominal scales
embody the lowest level of measurement.
Ordinal scales
• A researcher wishing to measure consumers' satisfaction with
their microwave ovens might ask them to specify their feelings
as either “very dissatisfied,” “somewhat dissatisfied,”
“somewhat satisfied,” or “very satisfied.” The items in this scale
are ordered, ranging from least to most satisfied. This is what
distinguishes ordinal from nominal scales. Unlike nominal scales,
ordinal scales allow comparisons of the degree to which two
subjects possess the dependent variable. For example, our
satisfaction ordering makes it meaningful to assert that one
person is more satisfied than another with their microwave
ovens. Such an assertion reflects the first person's use of a
verbal label that comes later in the list than the label chosen by
the second person.
Interval scales
• Interval scales are numerical scales in which intervals have the same
interpretation throughout. As an example, consider the Fahrenheit scale
of temperature. The difference between 30 degrees and 40 degrees
represents the same temperature difference as the difference between
80 degrees and 90 degrees. This is because each 10-degree interval has
the same physical meaning (in terms of the kinetic energy of molecules).
• Interval scales are not perfect, however. In particular, they do not have a
true zero point even if one of the scaled values happens to carry the
name “zero.” The Fahrenheit scale illustrates the issue. Zero degrees
Fahrenheit does not represent the complete absence of temperature (the
absence of any molecular kinetic energy). In reality, the label “zero” is
applied to its temperature for quite accidental reasons connected to the
history of temperature measurement. Since an interval scale has no true
zero point, it does not make sense to compute ratios of temperatures.
• For example, there is no sense in which the ratio of 40 to 20
degrees Fahrenheit is the same as the ratio of 100 to 50 degrees;
no interesting physical property is preserved across the two
ratios. After all, if the “zero” label were applied at the
temperature that Fahrenheit happens to label as 10 degrees, the
two ratios would instead be 30 to 10 and 90 to 40, no longer the
same!
• For this reason, it does not make sense to say that 80 degrees is
“twice as hot” as 40 degrees. Such a claim would depend on an
arbitrary decision about where to “start” the temperature scale,
namely, what temperature to call zero (whereas the claim is
intended to make a more fundamental assertion about the
underlying physical reality).
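The arithmetic behind the ratio argument can be checked with a short sketch. The helper name `ratio_after_shift` is hypothetical; the temperatures and the 10-degree shift are the ones used in the text.

```python
def ratio_after_shift(a: float, b: float, new_zero: float) -> float:
    """Ratio of two readings after relabeling `new_zero` as zero."""
    return (a - new_zero) / (b - new_zero)

# With the usual Fahrenheit zero, both ratios look the same.
print(ratio_after_shift(40, 20, 0), ratio_after_shift(100, 50, 0))    # 2.0 2.0

# Move "zero" to what Fahrenheit calls 10 degrees: 30 to 10 and 90 to 40.
print(ratio_after_shift(40, 20, 10), ratio_after_shift(100, 50, 10))  # 3.0 2.25

# Differences do not depend on where zero sits.
print((40 - 10) - (20 - 10), 40 - 20)  # 20 20
```

Differences survive a change of zero point while ratios do not, which is the practical distinction between an interval scale and a ratio scale.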
RATIO SCALES
• The ratio scale of measurement is the most informative scale. It is
an interval scale with the additional property that its zero position
indicates the absence of the quantity being measured. You can
think of a ratio scale as the three earlier scales rolled up in one.
Like a nominal scale, it provides a name or category for each object
(the numbers serve as labels). Like an ordinal scale, the objects are
ordered (in terms of the ordering of the numbers).
• Like an interval scale, the same difference at two places on the
scale has the same meaning. And in addition, the same ratio at two
places on the scale also carries the same meaning. The Fahrenheit
scale for temperature has an arbitrary zero point and is therefore
not a ratio scale. However, zero on the Kelvin scale is absolute zero.
This makes the Kelvin scale a ratio scale.
• For example, if one temperature is twice as high as another
as measured on the Kelvin scale, then it has twice the
kinetic energy of the other temperature.
• Another example of a ratio scale is the amount of money
you have in your pocket right now (25 cents, 55 cents, etc.).
Money is measured on a ratio scale because, in addition to
having the properties of an interval scale, it has a true zero
point: if you have zero money, this implies the absence of
money. Since money has a true zero point, it makes sense to
say that someone with 50 cents has twice as much money
as someone with 25 cents (or that Bill Gates has a million
times more money than you do).
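Returning to the temperature example, a small sketch shows why 80 degrees Fahrenheit is not “twice as hot” as 40 degrees once both are expressed on the Kelvin scale. The conversion formula is the standard one; the function name is hypothetical.

```python
def fahrenheit_to_kelvin(deg_f: float) -> float:
    """Convert degrees Fahrenheit to kelvins."""
    return (deg_f - 32) * 5 / 9 + 273.15

k40 = fahrenheit_to_kelvin(40)
k80 = fahrenheit_to_kelvin(80)

print(f"40 F = {k40:.2f} K, 80 F = {k80:.2f} K")
print(f"ratio on the Kelvin scale: {k80 / k40:.2f}")
```

On the ratio scale the two temperatures differ by roughly 8 percent, not by a factor of two.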
Sampling techniques
Probability sampling
What is probability sampling?
Probability sampling is a sampling technique that involves randomly selecting a
small group of people (a sample) from a larger population, and then predicting
the likelihood that all their responses put together will match those of the overall
population.

There are two important requirements when it comes to probability sampling:

1. Everyone in your population must have an equal, non-zero chance of being
selected. (In other words, everyone has an equal chance of receiving a survey.)
2. You must know, specifically, what that chance of being selected is for each
person. (For example, you might determine that in a population of 100 people,
each person’s chance of receiving a survey is 1 in 100. Being able to represent
each person’s chance of selection as a probability is at the core of probability
sampling.)
• Following these two rules will help you choose appropriately (i.e.
randomly) from your sampling frame, which is the list of everyone
in your entire population who can be sampled. Random selection is
key—probability sampling is all about making sure everyone has an
equal probability of being included. From picking names out of a hat
or pulling the short straw, to more complex random selection
processes, this ensures that the sample you end up creating is
representative of the population as a whole.
• With the right sample, you can achieve results that are just as
valuable as those you might get from a far bigger survey effort.
From there, you can draw valid conclusions based on the sample’s
wants, needs, or opinions and take action that makes sense for the
entire population.
Types of probability sampling
• There are several sampling methods that fall under the
umbrella of probability sampling. These methods not
only vary based on the type of research you’re doing
and the type of data you want to yield, but also the
amount of time you have to conduct your research and
the tools you have at your disposal. Here are the four
main types of probability sampling approaches that
researchers use:
Simple random sampling
• In simple random sampling, all members of the
population have an equal chance of being selected and
the selection is done randomly. To achieve this,
researchers may use tools like a random number
generator to select participants from the overall
population to be part of a sample. However, while
simple random sampling is, as the name indicates, the
simplest sampling strategy, it is also prone to sampling error. For
example, the smaller your sample size is compared to
your overall population, the less likely a purely random
draw is to yield a representative sample.
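A minimal sketch of simple random sampling with a random number generator, using Python's standard random module. The sampling frame of 1,000 ID numbers and the sample size of 50 are hypothetical.

```python
import random

rng = random.Random(2024)  # fixed seed so the draw can be reproduced

# Hypothetical sampling frame: ID numbers for a population of 1,000 people.
sampling_frame = list(range(1, 1001))

# Draw 50 people without replacement; every ID is equally likely to be picked.
sample = rng.sample(sampling_frame, k=50)

# Each person's known chance of selection: 50 out of 1,000.
selection_probability = len(sample) / len(sampling_frame)

print(sorted(sample)[:5], selection_probability)
```

Because the chance of selection is the same known value for everyone, this draw meets both requirements of probability sampling described earlier.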
Stratified random sampling
• Many populations can be divided into smaller groups based on specific
characteristics that don’t overlap but represent the entire population
when put together. With stratified random sampling, you would draw a
sample from each of these groups (or strata) separately. This allows
you to make sure that every subgroup is properly represented, which
leads to more accurate results than simple random sampling.
• It’s common to stratify by characteristics like sex, age, income
bracket, or ethnicity. The strata must be specific and mutually
exclusive, meaning every individual in the population should only be
assigned to one group. Once you’ve split your population into strata,
you would then use simple random sampling to select individuals from
each group, in proportion to the total population. Those individuals
would then be combined into a single sample.
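The procedure above (split into strata, sample each stratum in proportion to its size, then combine) can be sketched as follows. The function name, the income-bracket labels, and the population sizes are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, sample_size, seed=0):
    """Proportional stratified sample: simple random sampling within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in population:
        strata[stratum_of(person)].append(person)

    sample = []
    for members in strata.values():
        # Allocate in proportion to the stratum's share of the population.
        # Rounding can make the total differ slightly from sample_size
        # when the shares do not divide evenly.
        k = round(sample_size * len(members) / len(population))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 600 "low", 300 "middle", 100 "high" income bracket.
population = (
    [("low", i) for i in range(600)]
    + [("middle", i) for i in range(300)]
    + [("high", i) for i in range(100)]
)

sample = stratified_sample(population, stratum_of=lambda p: p[0], sample_size=50)
counts = {s: sum(1 for p in sample if p[0] == s) for s in ("low", "middle", "high")}
print(counts)  # {'low': 30, 'middle': 15, 'high': 5}
```

Each stratum appears in the sample in the same 60/30/10 proportion as in the population, which a single simple random draw would only match by chance.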
Cluster sampling
• Like stratified sampling, cluster sampling also involves separating the
population into subgroups, or clusters. But that’s where the two probability
sampling methods diverge. With cluster sampling, each cluster should have
similar characteristics to the population. Instead of selecting individuals from
each and every cluster, you would begin by randomly selecting entire
clusters. If possible, you might include every individual from each selected
cluster in your final sample. If the clusters are too large, you would need to
randomly select individuals from each cluster.
• Researchers often use pre-established and easily available groups as clusters.
This is typically based on geographic boundaries, like cities or counties, but it
can also be schools or office locations. Cluster sampling is most often used to
save costs when surveying populations that are very large or spread out
geographically. However, there is more risk of sampling error with cluster
sampling. Each cluster is supposed to represent the total population, but this
can be difficult to guarantee.
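A minimal sketch of one-stage cluster sampling, where whole clusters are selected at random and every individual in a selected cluster is included. The school names and sizes are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=0):
    """Pick whole clusters at random and keep everyone in the chosen clusters."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = [person for name in chosen for person in clusters[name]]
    return chosen, sample

# Hypothetical clusters: 20 schools with 30 students each.
clusters = {
    f"school_{s:02d}": [f"school_{s:02d}_student_{i:02d}" for i in range(30)]
    for s in range(20)
}

chosen, sample = cluster_sample(clusters, n_clusters=4)
print(chosen, len(sample))
```

If the selected clusters were too large to survey in full, a second random draw within each chosen cluster could be added, as the text describes.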
systematic sampling
• Systematic sampling is similar to simple random
sampling, though it’s usually a bit easier to conduct.
Each member of the population is assigned a number,
then selected at regular intervals to form a sample.
(Systematic sampling is also known as interval
sampling.) Or, to put it another way, every “nth”
individual in the population is selected to be part of the
sample.
• For example, in a population of 1,000, you might choose
every 9th person for your sample. This can be more
straightforward than other sampling methods, as there is
a clear and systematic approach to picking individuals that
doesn’t involve a random number generator. On the flip
side, the resulting selection may not be as random as it
would be if a generator were used. Additionally, it’s
important to ensure that there’s no hidden pattern in the
list that may affect the selection. If the ordering of the list
can be manipulated or follows a pattern, the sample will be skewed and you
may end up with over- or under-representation within your
sample.
• For instance, say you plan to survey employees within a
particular organization, and all the employees are listed
in alphabetical order. You plan to use systematic
sampling to select every 4th employee for your sample.
However, if the alphabetical list is also organized by
team and seniority, you might end up choosing too
many or too few people in senior roles, which would
introduce bias into your sample.
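Selecting every nth member of a numbered list can be sketched as below. The random starting point is an assumption added here (a common refinement so that the first person on the list is not always chosen); the text itself only specifies the fixed interval.

```python
import random

def systematic_sample(frame, interval, seed=0):
    """Select every `interval`-th member after a random starting point."""
    rng = random.Random(seed)
    start = rng.randrange(interval)  # assumed random start in 0 .. interval - 1
    return frame[start::interval]

# Population of 1,000 numbered members, taking every 9th as in the example.
frame = list(range(1, 1001))
sample = systematic_sample(frame, interval=9)

print(len(sample), sample[:5])
```

If the list has a repeating pattern whose period lines up with the interval, every selected member may share the same trait, which is the hidden-pattern risk described above.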
Advantages of probability sampling
• There are several benefits to using probability sampling. Overall, it’s a cost-
effective way to sample large audiences that represent your target buying
audience. It’s also advantageous for geographically dispersed populations.

• Each type of probability sampling provides its own advantages. For
example, simple random and systematic sampling make the
implementation process more user-friendly, stratified sampling reduces
the researcher’s bias, and cluster sampling limits the variability in a
research study. Probability sampling requires little technical expertise when
utilizing an agile experience management platform. You can also be as
detailed as you want when creating your population sample using stratified
sampling or systematic sampling. If you’re working against deadlines, then
cluster sampling and simple random sampling are the way to go.
Limitations of probability sampling
• For every advantage, some detail within that benefit might
work against your overall efforts. For instance, getting the
best possible population sample means doing a little more
research that will take more time and resources. Stratified
sampling can ensure that the strata are properly
represented, but it may not mirror all the differences within
that sample population.
• Cluster sampling separates the population into
clusters, but those clusters could have overlapping
characteristics. While simple random
sampling can provide quick results, it does not use clusters or strata,
so the sample might not be as targeted toward your intended audience.
When to use probability sampling
• Probability sampling is ideal for quantitative studies where the
goal is to use statistical analysis to draw conclusions about a
large population. When it would be too difficult or expensive to
survey the entire population, researchers can use this sampling
strategy to collect representative data.
• Probability sampling is used in a lot of market research to gain
insights into a large population. This includes projects like:
• Uncovering consumer usage to inform product development
• Understanding what factors have the greatest impact on
purchasing decisions
• Identifying emerging industry categories and players
• Even beyond industry tracking, buyer attitudes, and
competitive intelligence, probability sampling allows
companies to firm up new ideas and improve business
by tapping into data that reflects their entire target
market.
• Say, for example, a chain of coffee shops has 15,000
stores in various geographic locations in the United
States. The company is looking to expand its customer
loyalty program with additional payment options and
new ways for customers to earn rewards. Before it
makes any significant updates, however, it wants to
know if customers will respond well to the proposed changes.
• Reaching out to all customers at its 15,000 coffee shops isn’t
feasible, but the company could use a probability sampling
approach to create a sample that accurately represents that
larger population. The responses received will reveal how
customers as a whole feel about the loyalty program update. In
turn, everyone from the company’s marketing department to its
customer service representatives can use the data to get a better
understanding of what further changes need to be made or how
to effectively promote the new loyalty program. And if the
company wants to ensure that its sample reflects subgroups
within the population, like gender, age ranges or income levels, it
can use certain types of probability sampling methods like
stratified sampling or cluster sampling (both described above).
• In the example above, probability sampling is a great
way to handle a rather large population—in this case,
thousands of coffee shops. With true probability
samples, having larger samples helps reduce the
chance of sampling error, which occurs when you select
a sample that does not represent the whole population.
And, in general, random sampling can help minimize
sampling errors because it uses a systematic, rather
than subjective, approach to selecting a sample.
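The claim that larger samples reduce sampling error can be illustrated with a small simulation. Everything here is hypothetical: the population of 10,000 customers, the 60% approval rate, and the sample sizes of 25 and 400 are chosen only for illustration.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the simulation is repeatable

# Hypothetical population: 1 = customer likes the proposed update, 0 = does not.
population = [1] * 6_000 + [0] * 4_000

def spread_of_estimates(sample_size, repeats=200):
    """Standard deviation of the sample proportion across repeated random samples."""
    estimates = [
        sum(rng.sample(population, sample_size)) / sample_size
        for _ in range(repeats)
    ]
    return statistics.pstdev(estimates)

small = spread_of_estimates(25)
large = spread_of_estimates(400)

print(f"spread with n=25:  {small:.3f}")
print(f"spread with n=400: {large:.3f}")
```

The estimates from the larger samples cluster more tightly around the true share, so any single large sample is less likely to misrepresent the population.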
Give everyone a chance at being selected
• You never want to knowingly exclude someone in your population from being
selected to be part of your sample. Watch out for times when particular groups
might be unintentionally prevented from participating.

• For example, let’s say you want to understand public opinion on an expansive
new immigration law. Will you offer a Spanish language version of your survey?
You should. If you don’t, you’ll likely miss out on hearing from a lot of native
Spanish speakers who aren’t comfortable answering questions in English, but
have views on immigration that would be extremely valuable for your research. If
their participation is overlooked, your survey results won’t match up with true
public opinion.

• Remember, if you can’t give everyone in your population a chance at completing
your survey, your sample will be non-representative and, therefore, will not be
based on probability sampling.
What’s the difference between probability sampling and non-probability sampling?
Simple random sampling, stratified sampling, cluster sampling, and
systematic sampling are all types of probability sampling. But there’s
another end of the sampling technique spectrum: non-probability
sampling. Even if you’re set on using random selection for your sample, it’s
worth knowing the basics of non-probability sampling, including when and
why it’s used by researchers.

• With non-probability sampling, members of the overall population do not
have an equal chance of being part of your sample, and there’s nothing
random about how they are selected. In fact, some members will have zero
chance of being selected. Where probability sampling is concerned with
drawing conclusions about a larger population, non-probability sampling is
often used for exploratory and qualitative research that is more focused on
hearing from people with specific expertise, experiences, or insights.
• Let’s say, for example, that you’re researching local use of mobility ramps
and your population of interest is people in your city who use wheelchairs.
You don’t have a full list of these people, so probability sampling isn’t an
option. However, you meet a few people who agree to participate in your
study, and they connect you with other wheelchair users in the area. This
non-probability sampling method, called snowball sampling, may not involve
random selection, but it does have the potential to put you in contact with
more people who are relevant for your research.

• Non-probability sampling is generally easier and cheaper to conduct, but it
also has a higher risk of sampling bias than probability sampling. That’s
because the sample selection process is based on the subjective judgment
of the researcher, rather than randomization. Plus, the sample and the
end results don’t necessarily represent the entire population.
