Note-For-Exam - Market Research UHasselt


QUALITATIVE RESEARCH

1. Introduction
2. Qualitative research techniques:

2.1.Focus group vs in-depth interview


- The research objectives are similar
o To provide data for (re)defining marketing (research) problems
o To gain insight in the results of quantitative findings
o To generate new ideas about products, services or delivery methods
o To understand consumer preferences
- The relative (dis)advantages are typically related to:
o Group dynamics
o Organisation of the research
o Research topic
- Conducting the research:
o Sample size
o Selection of participants
o Guide
- NUMBER OF GROUPS
o At least two groups. It is dangerous to draw conclusions on the data from a single group.
o Continue until saturation point
- GROUP SIZE
o Small (less than 6 respondents)
 more depth and breadth in discussion (+)
 less synergy (-)
 One person may more easily dominate (-)
 Not often used, except when number of respondents is inherently small (e.g. potential
Audi R8 buyers)
o LARGE GROUPS (MORE THAN 8 RESPONDENTS)
 More difficult to control (-)
 Subgroups may emerge (-)
 Distance between moderator and respondents (-)
 Cultural preference. Not used often in Belgium / Netherlands, but in the United
Kingdom people prefer large groups.
o MEDIUM-SIZED GROUPS (N = 6 TO 8)
 Optimal level of group dynamics
 Little chance of disintegration
 Most frequently used here
 GROUP SIZE IS NOT AN INDICATOR FOR REPRESENTATIVENESS OF THE FINDINGS
- SELECTION OF PARTICIPANTS
o Specify exact selection criteria
o Make sure the selection criteria are controllable
o Beware of selection bias
o If possible incorporate randomization
o Check respondent knowledge and experience
- Advantages of focus group:
o Synergy: putting people together will produce a wider range of information, insights, and ideas
than when the participants are interviewed separately.
o Snowballing: Bandwagon effect that operates in a group discussion. One person’s comment
triggers a chain of reactions from other respondents
o Stimulation: Once the discussion gets more lively, respondents are increasingly willing to
express their opinions.
o Security: homogenous group makes people feel secure and consequently, they open up and
express their ideas.
o Spontaneity: because respondents are not restricted by structured questions, their responses
are usually more spontaneous and more closely related to their actual feelings.
o Serendipity: Unexpected ideas may arise as respondents ask questions the moderator did not
think of or was reluctant to ask
o Speed: efficiency gains (time and interviewer costs) as a group of respondents are interviewed
simultaneously.
- Disadvantages of focus group
o Misjudgment: the direction to which the discussion evolves may be biased by the moderator.
o Moderation: The quality of the focus group depends to a large extent on the quality and skills
of the moderator. This makes focus groups vulnerable.
o Messiness: Due to its unstructured nature, the coding, analysis, and interpretation of the data
is usually more difficult than with other techniques
o Misrepresentation: Generalizing focus group findings to a wider setting must be done with
great care.
o Meeting: arranging the actual meeting (time and place) can be difficult
- The number of focus groups depends on the following factors:
o The extent to which comparisons between types of participants are sought
o The type of participants to be targeted and how well they interact in a discussion
o The geographical spread of participants
o The paradigm that underpins the planning, administration, and analysis of the focus groups
(conduct additional discussions until the moderator can anticipate what will be said)
o The time and budget
2.2.In-depth interview
- DEFINITION
o “A depth interview is an unstructured, direct, personal interview in which a single
respondent is probed by an experienced interviewer to uncover underlying motivations,
beliefs, attitudes and feelings on a topic” (Malhotra and Birks 2007).
o “One-on-one interviews that probe and elicit detailed answers to questions, often using
nondirective techniques to uncover hidden motivations” (McDaniel and Gates 2004).
- In-depth interview helps to overcome:
o Hectic schedules: less constraints play a role when a single person is interviewed, than when
this person needs to participate in a group meeting at a specific time and location.
o Heterogeneity: for some products the context is too specific to form a homogenous group
o Live context: This is the case when the interviewer comes to the natural habitat of the
interviewee. The decoration of the house or office may reveal a lot of additional information.
o Interviewer reflection: in in depth interviews the interviewer has more opportunities to think
the process through and experiment a bit (if it fails one respondent is lost rather than an
entire group).
- Advantages of in-depth interview:
o Uncover greater depth as the discussion can get more focused
o Attribute: you can connect the answers to the respondent, in focus groups this may be difficult
as people continuously interact with each other.
o Free exchange: an atmosphere of confidence may stimulate the interviewee to speak freely.
For instance, when corporate sensitive information is concerned.
o Being easier to arrange than the focus group
- Challenges when using in-depth interview:
o The lack of structure makes the results susceptible to the interviewer’s influence; and the
quality and completeness of results depend heavily on interviewer’s skills.
o The length of the interview and higher costs → only a small number of in-depth interviews per project.
o Data obtained can be difficult to analyze and interpret.
- The importance of context in choosing appropriate research design and interviewing technique (in-
depth interview vs. group, mini group vs. standard group, friendship pairs vs stranger groups,
association with projective techniques, digital-based techniques) and in formulating questions.
- RELATIVE DISADVANTAGES COMPARED TO FG’S
o Expensive to conduct
o Less time efficient in terms of data collection and analysis
o No snowballing effects due to interaction
o Individuals may not reveal their opinion as they feel weaker in a one-on-one situation.
o No possible relevant questions brought up by other participants
- HOW MANY INTERVIEWS?
o On average n = 20 is sufficient; but is influenced by
 Homogeneity of the target group
 Specificity of the subject
 If multiple (k) segments exist, k*20 interviews are needed
 Continue until new insights stop emerging
- HOW MANY INTERVIEWERS?
o Using multiple interviewers reduces subjectivity in the interpretation of findings
 Both techniques have some limitations:

+ Focus group: how to organize and conduct the focus group, and whether participants have an interest in the research topic.

+ Interview: the sample size and sample design (who should be included, how many people are needed to capture the
situation), and the interview guide, i.e., how to structure the interview. The guide is simply a list of topics of
interest: write down the topics, do not try to formulate exact questions (you want people to talk about the topic in
their own words), pay attention to how questions feel when asked, and respect people's time.

LADDERING INTERVIEWS: MEANS-END CHAINS (MEC) – applies structure to qualitative interview

- Theories of consumer behavior act as the basis for the means-end chain: concrete product attributes lead to consequences of product use, which in turn connect to personal values.


- Concrete: Cognitive representation of physical characteristics of an offering. Can be directly
perceived
- Abstract: Abstract meaning representing several more concrete attributes. Subjective, not directly
measurable. Can’t be directly perceived
- Functional: Immediate, tangible consequences of product use. What does the product do? What
functions does it perform?
- Psychosocial: Psychological (how do I feel?) and social (how do others feel about me?) consequences of
product use
- Instrumental: Preferred modes of behavior, abstract consequences of product use
- End: preferred end-states of being, very abstract consequences of product use
 Zooming in and zooming out: attribute → consequence of the attribute → value
 Laddering: moving from the obvious to the less obvious (from attribute → benefits of the attribute → how
people value these benefits) → the central question in the interview is: why?
 Based on comparisons of the consumer’s choice alternatives with three basic questions along the A-
C-V chain: attributes – what is different about these alternatives?; consequences – what does this
difference mean?; values – how important is this for you?
 Another technique that applies structure to the qualitative in-depth interview is the repertory grid
technique (RGT); based on personal construct psychology, it grounds the data in the culture of participants.
 ZMET (Zaltman metaphor elicitation technique) allows the participant to define a frame of
reference for the interview → the aim is to understand the images and associations.
2.3.Projective technique
- Give access to consumers’ subconscious (feelings)
- They work as follows:
o Participants are asked to project their feelings and thoughts onto other things; participants are
asked to interpret the behavior of others rather than to describe their own behavior 
indirectly project their own motivations, beliefs, attitudes, and feelings. For example: If Coca-
Cola was an animal, which animal would it be?
o Participants are then asked to explain their answers. This ‘why’ question is the most important
part of using projective techniques, as the projective techniques are designed to release the
sub-conscious thought rather than to be, in themselves, revealing. Probing is used to try and
uncover the real explanations. For example, if Coca-Cola were seen as a cow, the explanation
may be that the respondent sees it as fat, slow moving and uninspiring.
o Projective techniques are fun. They are widely used, with clients, respondents and researchers
all finding them a welcome change from the humdrum of traditional market research
questions.
o Validity is an issue, as with other qualitative techniques
- Projective techniques are classified into association, completion, construction, and expressive:
o Association technique: participants are presented with a stimulus and are asked to respond
with the first thing that comes to mind (e.g., word association).
o Completion technique: participants are asked to complete an incomplete stimulus situation
(common techniques are sentence and story completion).
o Construction technique: participants are asked to construct a response in the form of a story,
dialogue or description (e.g., picture response techniques – participants are given a picture and
are asked to tell a story to describe it and cartoon tests – participants are asked to indicate the
dialogue that one cartoon character might make in response to the comment of another
character).
o Expressive technique: participants are presented with a verbal or visual situation and are asked
to relate the feelings and attitudes of other people in the situation (e.g., role playing, third-
person technique, personification technique)
- Brand Personification

Brand personification is a Projective Technique that asks people to think about brands as if they were
people and to describe how the brands would think and feel.

- Advantages:
o May elicit responses that participants would be unwilling or unable to give if they knew the
purpose of study.
o Frames the actions as those of “someone else”.
o Increasing the validity by disguising the purpose of study.
o Helpful when underlying motivations, beliefs, and attitudes that operate at a subconscious
level (not aware of them).
- Disadvantage:
o Unstructured direct technique to an even greater extent → requires experienced interviewers, who are
expensive.
o Risk of interpretation bias.
o Some participants may lack self-confidence or the ability to express themselves fully with
some techniques that show unusual behavior (e.g., role playing)
- Projective techniques are used less frequently than other unstructured direct techniques, except for
word association → they should be used when the required information cannot be accurately obtained by
direct questioning, when participants find it difficult to conceive of and express the issues, or when
participants need to be engaged with the subject in an interesting and novel way.

2.4.Ethnography

- Ethnography is a research approach based upon the observation of the customs, habits, and
differences between people in everyday situations. Observation may be direct or indirect (e.g. via
written material)
- Ethnography is about entering respondents’ natural life worlds (home, shopping etc.). This provides a
holistic, natural, and nuanced view.
- Ethnography is a mix of observation and interviewing, but always takes place in the object’s natural
environment (no laboratory settings)
- People are studied in their natural habitat
ADVANTAGES
o Many objects can be observed simultaneously → efficiency
o Versatile / Many behaviors can be observed
o Usually fast (can ask direct questions)
o Relatively inexpensive

DISADVANTAGES
o Limited generalizability
o Usually limited to overt behavior (need to learn techniques, done in an open way)
o Usually limited insight in the object’s attitudes, motivations etc.
o Ethical concerns
3. Online qualitative research
3.1.ONLINE FOCUS GROUPS
- Participants use the technology of the Internet to approximate the interaction of a face-to-face
focus group. Typically respondents are at different locations as well as the moderator.
- RELATIVE DISADVANTAGES ONLINE FGs
o Lack of group dynamics
o Conversation is less fluent and slower (less probing, fewer insights)
o Non-verbal communication cannot be observed
o It is difficult to ascertain who actually participates
o Respondents may be engaged in other activities
- RELATIVE ADVANTAGES ONLINE FGs
o Cheaper
o No geographical boundaries in terms of participants
o No traveling; busy respondents more likely to join
o Instant data recording and data analysis
o Openness due to anonymity (unknown to other people)
3.2.NETNOGRAPHY
- Netnography, or ethnography on the Internet, is a new qualitative research methodology that
adapts ethnographic research techniques to study the cultures and communities that are emerging
through computer-mediated communications.
- Consumers increasingly turn to computer-mediated communication (Internet) for information to
help them in their purchase decisions.
- Usefulness depends on the setting
 Requires suitable online communities
4. Qualitative research: what to ask?

- Fit
- Keep eye on ball
- Iterative
5. Wrap up
- Depending on the marketing research problem qualitative research may be preferred over
quantitative research (and vice versa, of course)
- A wide variety of tools is available. Again, there should be a match between research problem/topic
and qualitative research technique employed

Qualitative research vs. quantitative research:
- Types of questions: probing (intended to discover the truth) vs. limited probing
- Flexibility: high vs. low
- Answer options: own words vs. predetermined
- Amount of information: high vs. variable
- Administrative requirements: skilled interviewer vs. interviewer of less importance
- Types of analysis: subjective, interpretive vs. statistical
- Degree of replicability: low vs. high
- Type of research: exploratory vs. conclusive
- QUALITATIVE RESEARCH: STRENGTHS AND WEAKNESSES

- Qualitative research is based on at least two intellectual traditions: a set of ideas and associated
methods from in-depth psychology and motivational research, and another from sociology.
- “Exploratory vs. conclusive” and “qualitative vs. quantitative” are parallel but not identical distinctions.
- It is not possible to say that one technique is better or worse than another → the choice depends on
the confidence of marketing decision makers in using the technique.
MEASUREMENTS
1. Theory
- A theory is a proposed description, explanation, or model of the manner of interaction of a set of
phenomena, capable of predicting future occurrences or observations of the same kind, and capable
of being tested through experiment or otherwise falsified through empirical observation.
- Theories:
o ...in Marketing are (usually) not that theoretical (as opposed to practical)!
o …are better, the more they prohibit (following Popper).
o …are not tautological (saying the same thing twice in different words)  they explain
something.
o …ought to be empirically testable or falsifiable.
o …consist of constructs (concepts, phenomena, variables) and hypotheses (on their interactions
or relationships).
o Popper: the quality of a theory is its falsifiability, or refutability, or testability
2. Hypothesis:
- Usually consists of two parts:
o a condition
o a consequence
- Each of the parts contains a construct
o the independent variable (condition)
o the dependent variable (consequence)
- Examples:
o The higher A, the higher B.
o A leads to B.
3. Construct
- A construct:
o … is ‘‘a conceptual term used to describe a phenomenon of theoretical interest’’
(Edwards/Bagozzi, 2000, pp. 156–157).
o … is quantifiable and directly, or indirectly observable
o An indirectly observable construct is called ‘latent’
o Example: IQ
- Constructs MUST be defined in terms of:
o Object (e.g., somebody is loyal)
o Attribute (loyalty)
o Rater entity (extreme high or low value of loyalty)
- Researchers in marketing usually want to investigate relationships between constructs
o direct causal relationships
 Usually, but not necessarily, a linear effect is meant.
 A is called an exogenous variable; B is called an endogenous variable.
AB
o (fully or partially) mediated (indirect) causal relationships
 A appears statistically to have a direct effect on B. Logically, however, A influences Z, and Z
influences B.
 Z is called a mediator (variable).
 Mediation is called partial if the effect between A and B remains significant after inclusion of
the mediator.

o spurious relationships
 A third variable (here: Z) influences A as well as B.
 Example:
There is a relationship between the temperature of the pavement in Chicago and the
fertility of birds in Norway.
 Via statistics you may see a relationship between A and B, but there is none; it is because Z influences
both A and B.

o bidirectional (cyclic) causal relationships


 A leads to B, and B leads to A.
 But: not necessarily at the same time
o unanalyzed relationships
 There is a correlation between A and B.
 Sometimes it is ignored because there is no theory to explain it
o moderated causal relationships (interactions)
 The strength and/or direction of the effect of A on B depends on the level of M.
 Here, M is a moderator (variable).
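
As an illustration of the last relationship type above, the short sketch below fits a moderated (interaction) effect on simulated data; the variable names A, B, and M and the use of statsmodels are assumptions for illustration, not part of the course notes.

```python
# Hypothetical sketch: testing whether M moderates the effect of A on B.
# Data are simulated; a significant A*M coefficient means the strength of
# A's effect on B depends on the level of M.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 200
A = rng.normal(size=n)                                        # independent (exogenous) variable
M = rng.normal(size=n)                                        # moderator
B = 0.5 * A + 0.3 * M + 0.4 * A * M + rng.normal(0, 0.5, n)   # effect of A grows with M

X = sm.add_constant(np.column_stack([A, M, A * M]))           # columns: const, A, M, A*M
fit = sm.OLS(B, X).fit()
print(fit.params)   # the coefficient on the A*M column captures the moderation effect
```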
- Linking Theory and Observation
4. Measurement model and structural model

- Indicators never measure constructs exactly; good research tries to minimize errors
- Measurement model: specifies how the constructs are measured
- Structural model: shows the relations between the constructs
- Nomenclature:
o Indicators are normally represented as squares. For questionnaire-based research, each
indicator (x) would represent a particular question.
o Latent variables are normally drawn as circles or ovals. Latent variables are used to represent
phenomena that cannot be measured directly. Examples would be beliefs, intention, motivation.
o Few researchers expect their models to perfectly explain reality. They therefore explicitly model
structural error.
o In the case of error terms, for simplicity, the circle is often left off.

- Multi-item measurement:
o increases reliability and validity of measures
o allows measurement assessment
 measurement error
 reliability
 validity
o two forms of measurement models:
 formative (emerging)
 reflective (latent)
- reflective measurement model:
o direction of causality is from construct to measure
o indicators expected to be correlated
o dropping an indicator from the measurement model does not alter
the meaning of the construct
o takes measurement error into account at the item level
o similar to factor analysis
o typical for consumer research constructs (e.g. attitudes)
- formative measurement model:
o direction of causality is from measure to construct
o no reason to expect indicators to be correlated
o dropping an indicator from the measurement model may alter the
meaning of the construct
 statistical tests of reliability and validity do not make any sense
o based on multiple regression
 beware of multicollinearity!
o typical for success factor research

Selecting a Reflective or a Formative Measurement Model

Reflective indicators vs. formative indicators:
o Reflect the construct’s occurrence vs. determine the construct values
o Interchangeable vs. eliminating indicators means changing the construct meaning
o Highly correlated vs. not necessarily correlated
o Explicit consideration of measurement errors vs. no measurement errors
o Several quality criteria vs. validity almost not testable
o Connected to the construct by factor loadings vs. connected to the construct by regression coefficients
o Identical antecedents and consequences vs. possibly differing antecedents and consequences
 Primarily applied for testing complex causal relationships vs. primarily applied for estimating
consequences of single steps

MEASUREMENT
1. Introduction
Measurement
- In data collection stage
- Measurement is crucial for all research designs!
- In research, measurement consists of assigning numbers
to empirical events (that is, properties, attributes or
characteristics of an object) in compliance with a set of
rules.
- Basically, measurement is a 3-part process consisting of:
o 1) Selecting empirical events: that is decide on what
property/characteristic/attribute you want to measure
(gender, preference, perceptions, height, income)
o 2) Developing a set of mapping rules or a scheme for
assigning numbers or symbols to represent aspects of
the event being measured.
o 3) Apply the mapping rule(s) to each observation of the
empirical event
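
A minimal illustration of these three steps in code; the property (preferred supermarket) and the mapping rule below are made-up examples, not from the notes.

```python
# Step 1: property to measure = preferred supermarket (a nominal property).
# Step 2: mapping rule = assign each category label a unique number.
mapping_rule = {"Aldi": 1, "Lidl": 2, "Colruyt": 3, "Delhaize": 4}

# Step 3: apply the mapping rule to each observation of the empirical event.
observations = ["Lidl", "Aldi", "Aldi", "Colruyt"]
measured = [mapping_rule[obs] for obs in observations]
print(measured)  # [2, 1, 1, 3] -- the numbers only identify categories (nominal scale)
```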
- Illustrate with example of last meeting’s in-class assignments. To make the distinction between
research questions and the measurement questions. Also link to “detailed” figure of problem
definition.

Scaling
- Is an extension of measurement
- Creating a continuum upon which the measured object is located
- The terms measurement and scaling are often used interchangeably

2. Scaling
2.1. Primary levels of measurement scales
- Respondents’ answers can be recorded in many different ways, but one of the
four primary scaling levels always applies.
- Nominal scale (e.g., gender, nationality):
o Nominal scales do not reflect the amount of an attribute
o Few analytical possibilities: frequencies and percentages
o Classification:
 Labels for classes or categories
 Mutually exclusive and collectively exhaustive (no overlaps; together the categories cover
the whole population)
 Same class is same number
 No two classes with the same number
o Identification
 Each number is uniquely assigned to an object
 Each object has only one number assigned to it
- Ordinal scale (e.g., ranking, preference)
o Basic property is ranking
o Numbers should be interpreted relative (more than/less than)
o In practice used for assessing preferences
o More analytical possibilities than the nominal scale
- Ratio scale (e.g., money, time, weight)
o Quantity of an attribute, differences in quantities, ratios
o Fixed zero point
o Typical marketing examples: income, revenues, market share
o All statistics are possible
- Interval scale
o Degree of attribute, difference in degree
o Arbitrary zero point (as a result, ratios are not meaningful)
o Typically used in marketing research to measure attitudes, feelings, intentions,…
o Nearly all statistical tests are possible

 Thus, if you are asked to determine which level of measurement you are dealing with, start with
nominal and see whether the property of the next level is present or not!

- The measurement level/primary scale level is determined by the answering format. Not the
question itself.
2.2.Different scale types

Comparative scaling vs. non-comparative scaling:
- Direct comparison among objects: yes vs. no
- Measurement level: non-metric vs. metric
- Data properties: ordinal vs. interval
- Scales mostly used in market research

Paired comparison: with n brands, have n(n-1)/2 paired comparisons
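
A quick check of this formula (the brand counts are chosen arbitrarily):

```python
# Number of paired comparisons for n brands: n * (n - 1) / 2
def paired_comparisons(n: int) -> int:
    return n * (n - 1) // 2

print(paired_comparisons(5))   # 10 comparisons for 5 brands
print(paired_comparisons(10))  # 45 -> the task quickly becomes burdensome for respondents
```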

Constant sum

- Respondents are asked to allocate a constant sum of points (usually 100) among a set of objects or
attributes according to some criterion.
o Unimportant objects/attributes receive value 0
o Equally important objects/attributes receive equal value
o Twice as important = double amount of points
o Restricted to number of objects/attributes
o Use with restricted number of objects/attributes
- Although constant sum scaling is comparative in nature, it has some characteristics of metric scaling.
o Frequencies, percentages, and averages
o Rank order
o Distances
- Constant sum should be considered an ordinal scale because of its comparative nature and lack of
generalizability.

Itemized rating scale: Scale consists of various categories marked with a number and/or a brief
description
- Likert scale
o Respondent expresses (dis)agreement with a series of statements
o Originally 5 points (7,9,11 points also possible)
o Ease of use is an advantage
o Effort asked from respondent is a relative disadvantage
 More scale points (9, 10): more precision and detail; disadvantage: respondents need more time,
energy, and effort to think and answer
o Analysis per statement
o Summed scores
o If 5 or more scale points are used, the data is considered interval (metric)
o This opens up many possibilities for statistical analysis
- Semantic differential: Itemized rating scale in which the anchors (extreme values) are associated
with bipolar labels that have semantic meaning.
o Characteristics rather similar to Likert scale
o Originally 7 points (5,9, 11 points also possible)
o Difficult when the respondent cannot see the scale (e.g., phone)
o Not always possible to come up with adjectives
o Mostly used for promotional studies and NPD perceptions
o Analysis per statement
o Summed scores
o If 5 or more scale points are used, the data is considered interval (metric)
o This opens up many possibilities for statistical analysis

CREATING ITEMIZED RATING SCALES

- NUMBER OF CATEGORIES
o Trade-off between discrimination and required effort
o General guideline: between 5 and 9 categories
o Factors that influence the decision on number of categories:
 Involvement and knowledge about object
 Nature of object (need fine discrimination or not)
 Mode of data collection (e.g., telephone interviews restrict number of categories)
 Type of statistical analysis
o Although it is possible to use an even number of categories, this is typically not
recommended
- (UN)BALANCED:
o (Un)balanced scale: (un)equal number of favourable and unfavorable scale categories
o In general, balanced scales are used
- (NON)FORCED
o Forced scale: respondent is forced to express his opinion; there is no “no opinion”/”no
knowledge” alternative provided
o Nonforced scale does include such a “no opinion” alternative
o If it is to be expected that a significant proportion of the sample will have no opinion or is
unwilling to give their opinion, a non-forced scale will yield better data
o When using a nonforced scale be careful when analyzing the data
- DESCRIPTIONS
o Many possibilities exist for describing the scale points
o Typically we use numbers and a description for the anchors (and midpoint)
o Relative to all-verbal descriptions this does not influence data accuracy
o The description of the anchors does influence the data distribution
o Use strong anchors like “Strongly (dis)agree” or “Very (un)likely”
3. Measurement
- Measurement is about asking the right questions

3.1.Measuring abstract properties


- These are all relatively objective, easy to measure properties
- It’s the subjective properties that are particularly relevant for marketeers
- Besides measuring an object’s objective properties (e.g., age, gender, horsepower, color),
marketing research is also often interested in an object’s subjective properties (e.g. attitudes,
intentions, feelings, perceptions).
- Subjective properties are abstract, intangible characteristics that cannot directly be measured
because they are mental aspects a person attaches to an object. These subjective properties are
referred to as latent constructs.
o Service quality
o Brand/Customer loyalty
o Brand image
o Attitudes (e.g., satisfaction, trust)
o Repurchase intentions
o Intentions to recommend company to others
o Personality traits of consumers (e.g., risk taking, self efficacy)
o Perceptions regarding an employee’s customer-oriented behavior
- Measuring constructs: asking the right questions (verbal or written format) to take a look into
someone’s mind.
- Constructs are measured indirectly via developing a so-called set of observables or indicators. These
are identifiable and measurable components which are associated with the particular construct.
o Indicators: Questions we ask the respondents
- In fact, there are two construct types (measurement models): reflective and formative constructs
- Relation between construct and indicator:
o Reflective: the direction runs from construct to indicator; a change in an indicator does not lead to a
change in the construct; a change in the construct leads to a change in the indicators; the indicator is a
manifestation of the construct.
o Formative: the direction runs from indicator to construct; a change in an indicator does lead to a
change in the construct; a change in the construct does not lead to a change in the indicators; the
indicators define the construct.
- Relation between indicators:
o Reflective: indicators are interchangeable; high correlations among indicators; domain sampling.
o Formative: indicators are not necessarily interchangeable; no requirements regarding the correlations
among indicators; census of items.

3.2. Characteristics of sound measurement


- The development of multiple item construct measurement instruments differs as a function of
construct type
- In practice
o Developing multi item scales to measure constructs is not easy and requires a lot of statistics
(especially for reflective scales).
o Luckily, for many (reflective) constructs, scales are readily available in articles or books
- True score model
o XO = XT + XS + XR
 XO = Observed score
 XT = True score
 XS = Systematic error (a consistent bias, e.g., a flaw in the instrument or the data)
 XR = Random error (uncontrollable fluctuations, e.g., the respondent’s mood or attention)
o Ideally, XO = XT!
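
A minimal simulation of the true score model, assuming made-up values for the true score and both error components:

```python
# True score model X_O = X_T + X_S + X_R with assumed (made-up) values.
import numpy as np

rng = np.random.default_rng(0)
x_true = 7.0                        # X_T: the respondent's true score
x_sys = 0.5                         # X_S: systematic error (e.g., a biased question wording)
x_rand = rng.normal(0, 1, size=5)   # X_R: random error, different on every measurement occasion

x_obs = x_true + x_sys + x_rand     # X_O: what we actually observe
print(x_obs)                        # ideally every observed value would equal 7.0
```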
Reliability and validity

- Reliability and validity are important for all measurements (both objective and subjective properties)
- The specific types of reliability and validity differ per type of measurement instrument

Reliability

- Extent to which a scale produces consistent results if repeated measurements are made (you
compare multiple measurement on a single object)

- It is the degree to which the measurement results are free of random error (XR)
- A measure is perfectly reliable when XR=0
- Systematic error (XS) does not have an impact on the reliability
- XO = Observed score; XT = True score; XS = Systematic error; XR = Random error
- Reliability is assessed by determining the (co)variation present across the different measurement
occasions
- Thus, data needs to be collected and analyzed to draw conclusions about a measurement
instrument’s reliability
- Different types of reliability
o Test-retest reliability
 Each respondent is measured at two different times under conditions that are as equivalent as
possible
 the degree of correlation between the two measurements reflects the reliability of the
measurement instrument: Higher correlation = higher reliability
 DRAWBACKS:
 Sensitive to time interval
 Not always possible
 Carry-over effects (where the evaluation of a particular scaled item significantly
affects the participant’s judgement of subsequent scaled items)
 Influence of external sources
o Alternative forms reliability
 Two equivalent measurement instruments (content wise) of a certain characteristic are
designed
 Every respondent is measured by the two instruments
 The correlation between the respondent scores obtained by the two instruments is
indicative of the degree of reliability
 DRAWBACKS:
 Difficult and/or time consuming to construct two equivalent instruments
 Low correlation: low reliability or non-equivalent instruments?
o Internal consistency / reliability
 Only applicable for reflective multi-item measurement instruments
 Analysis per construct
 As the indicators all reflect the same underlying construct, they should produce
consistent results
 IC reliability assesses the extent to which the various indicators lead to consistent results
 Using the collected data, two IC reliability measures can be computed (split-half and
coefficient alpha)
 split-half reliability
 item scores are split into two groups (e.g., a construct measured by 4 items →
split into 2 groups of 2 items each)
 For each group of items the summed score is calculated
 The correlation between the two summed scores is the split half reliability
coefficient for internal consistency
 DRAWBACKS: Correlation coefficient is influenced by the way the items are split into
groups
 Even numbered items vs odd numbered items
 First x items vs last x items
 Much more possibilities (to split into groups)
 Coefficient alpha
 Average of all possible split-half correlation coefficients
 Thus, only for multiple item measurements
 Cut-off value 0.60 (possible range 0-1)
 Done with statistical software
 Also often referred to as Cronbach’s alpha
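
A sketch of both internal consistency measures for one reflective construct, using a tiny made-up data set (5 respondents, 4 Likert items); the coefficient alpha formula used is the standard one.

```python
# Split-half reliability and coefficient (Cronbach's) alpha for a 4-item construct.
import numpy as np

items = np.array([        # rows = respondents, columns = items (Likert 1-5), made-up scores
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [4, 4, 4, 5],
])

# Split-half: correlate the summed scores of two item groups (here: first two vs last two items)
half1, half2 = items[:, :2].sum(axis=1), items[:, 2:].sum(axis=1)
split_half_r = np.corrcoef(half1, half2)[0, 1]

# Coefficient alpha = k/(k-1) * (1 - sum of item variances / variance of the summed score)
k = items.shape[1]
alpha = k / (k - 1) * (1 - items.var(axis=0, ddof=1).sum() / items.sum(axis=1).var(ddof=1))

print(round(split_half_r, 2), round(alpha, 2))  # compare alpha against the 0.60 cut-off
```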

Different types of validity


- VALIDITY OF A MEASUREMENT INSTRUMENT
o Are you really measuring what you intend to measure (so
nothing else but the characteristic you are interested in)
o The differences in scores reflect true differences
o Perfect validity: the measurement is free of error. Thus:
XO = XT, with XR = 0 and XS = 0 (unlike reliability, which only requires XR = 0; XS does not
significantly impact reliability)
 XO = Observed score; XT = True score; XS =
Systematic error; XR = Random error
o Various types of validity (content, criterion, and construct
validity)
- CONTENT VALIDITY
o Subjective yet systematic evaluation of how well the content
of the measurement instrument covers the phenomenon it
intends to cover
- CRITERION VALIDITY (outcome variable)
o Extent to which the measurement instrument performs as expected in relation to other variables
(criterion variables) as meaningful criteria
o Relevant in cases when you want to use a measure as a proxy (“predictor”) for something that is
very difficult to measure (or impossible to measure at the current time)
o Two forms: concurrent and predictive validity
 Concurrent validity: measurement instrument should be able to predict a criterion indicative
of a current situation (not in the future)
 Predictive validity: measurement instrument should be able to well explain a
criterion/outcome in the future
- CONSTRUCT VALIDITY
o Relevant in the measurement of abstract properties (“constructs”)
o Does the measurement instrument (i.e., the items) indeed measure the construct it should
measure?
o It can be derived by examining a construct’s
 Convergent validity
 Discriminant validity
 Nomological validity
o Construct validity is tested by looking at the collected data by means of statistical analyses
o Assessing construct validity
 We measure constructs for a reason
 Typically a construct is linked to other constructs to understand a phenomenon (based on
theory or results exploratory research)
 This set of interconstruct relationships is referred to as a nomological network
 Data is collected on each relevant construct in the network
 Statistically analyzing the relationships between the constructs provides insight into the
construct’s validity
o Nomological and discriminant validity
 All constructs are measured using multiple items rated on a Likert scale
 Nomological validity: all the relationships as expected (significance/sign)
 Discriminant validity: less than perfect correlation between constructs
 For all constructs, measurement instruments need to be developed, which are subsequently used
to collect data. This data can be used to estimate the relationships among the constructs. The
most simple way would be to calculate correlation coefficients. If these correlation coefficients are
significant and with the expected sign we find support for nomological validity. But there are also
some restrictions that apply to the magnitude of these correlations. We do not like perfect
correlations, because a perfect correlation means that the measurement instruments lack
discriminant validity.
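
A minimal sketch of this correlation check on simulated summed construct scores; the construct names (quality, satisfaction) and the use of scipy are assumptions for illustration.

```python
# Correlation between two construct scores as a first look at nomological validity
# (expected sign, significant) and discriminant validity (clearly below a perfect 1.0).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
quality = rng.normal(5, 1, 100)                        # summed score for construct 1
satisfaction = 0.6 * quality + rng.normal(0, 1, 100)   # summed score for construct 2

r, p = stats.pearsonr(quality, satisfaction)
print(f"r = {r:.2f}, p = {p:.3f}")  # positive and significant, yet far from a perfect correlation
```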
o WITHIN-METHOD CONVERGENT VALIDITY
 A particular form of convergent validity that is relevant in the measurement of constructs
 Only applies to reflective multiple item measurement instruments
 Convergent validity: item scores are highly correlated and together explain more than 50%
of the construct’s variance
 You can consider the different items as multiple measurement attempts of the same underlying
construct. If all the items indeed adequately measure the underlying construct
 They should be highly correlated. These high inter item correlations are a sign of within-method
convergent validity. Note that it is closely related to internal consistency reliability.
 Is perfect correlation a problem here in terms of discriminant validity? No, it just means you could
delete an item. (Correlation between items ≠ correlation between constructs, which is what discriminant validity concerns.)
RELIABILITY AND VALIDITY
Reliability: XR = 0; thus XO = XT + XS
Validity: XS = 0 and XR = 0; thus XO = XT
 If a measurement instrument is reliable, it is not necessarily valid
 If a measurement instrument is valid, it is always reliable
XO = Observed score
XT = True score
XS = Systematic error
XR = Random error
To choose a scale technique, should consider:
- The level of info desired
- The willingness and capabilities of participants
- Characteristics of stimulus objects
- Method of administration
- The context and cost
Wrap up:
- “Scire est mensurare” (to know is to measure) (Johannes Kepler). He was 100% right.
- Or more tailored to business economics “you can’t manage, what you can’t measure”
(Peter Drucker).
- In marketing, we need to measure objective and subjective properties (Similar to Kepler
who did pure mathematics and astrology).
- Measurement is asking the respondent the right questions
- Regardless of the properties being measured, the measurement instruments should be
reliable and valid to be useful
- Closely related to measurement is scaling, which provides us with answering formats for
the measurement instrument we use
- We discern between nominal, ordinal, interval, and ratio measurement. In the above
order, properties, information, and analytical possibilities increase.
- In practice, itemized rating scales (Likert and Semantic differential) with 5 or 7
categories are used  Interval data

SURVEYING AND SURVEY DESIGN


1. Introduction
- Qualitative vs. quantitative research
- At various levels, marketing research is about asking questions and getting answers
2. Surveying
- Standardized data collection process
- Structured questionnaire
- Sample of respondents
- Advantages and disadvantages: Note that these (dis)advantages also depend on the
type of survey method employed.
o Advantages:
 Large samples
 Distinguish between small differences
 Easy to administer
 Consistent data
 Tap abstract properties
 Advanced stats
o Disadvantages:
 Questionnaire design is difficult
 Level of depth / Richness information
 Low response rates
 Entire process costs much time
- Survey methods:
- Survey method selection:

 Trade-off pros and cons


 Select what best fits your needs
 Combining methods may be an option
3. Questionnaire design
- WHAT MAKES AN EXCELLENT QUESTIONNAIRE?
o Translates the needed info into questions respondent can and will answer
o Motivates respondent to cooperate
o Minimizes response error
- OVERVIEW QUESTIONNAIRE DESIGN PROCESS
o Specify information needed:

o Specify survey method: telephone, personal interview, technology,…


o Content individual questions
 If the question is relevant, keep it
 If the question is not relevant, get rid of it
 A few exceptions to the above rule include
 Demographics
 Neutral questions to build rapport
 Questions to disguise purpose or sponsorship
 Make it as easy as possible for the respondent
 A respondent must be willing and able to answer the questions
 Some issues to pay attention to:
 Filter questions
 Aided recall
 Do the maths yourself
o Question wording
 Define the issue (who, what, when, where) (e.g., Which brand of soft drink did you personally
drink most at the university during last month?)
 Use unambiguous words
 Avoid the use of leading/biasing questions (e.g. What is the most beautiful SUV at the
moment? I/O Is the BMW X5 the most beautiful SUV at the moment?)
 Avoid double-barreled questions (e.g., Do you think Aldi is cheap and well-equipped?)
 Avoid implicit assumptions (Are you in favor of green energy if it would lead to an energy bill
that is approximately 10% higher? I/O Are you in favor of green energy?)
 QUESTION WORDING-RESPONDENT’S ATTENTION
Use positive and negative statements
 A tool to check whether respondents filled out a questionnaire seriously
 Often done with multiple-item scales tapping constructs (i.e., abstract properties)
 Drawback: validity issues

Original “role ambiguity” scale, e.g.:

 I do not receive sufficient information from my supervisor concerning what I am


supposed to do in my job
 I often feel that I do not understand what is expected from me in my job
 Questions are answered on 9-point Likert scale, with 1 = totally disagree and 9 = totally
agree.
 Better solution: JUST ASK THEM AND YOU WILL FIND OUT “This is a question to check whether you
are paying attention. Please enter a 5 if you read this”  Hide this question somewhere among the
other questions
 Other ways: check the time of response, and check whether they answered the same for all
questions
o QUESTION STRUCTURE / MEASUREMENT SCALE
 Unstructured questions / Open questions
 Respondents are asked to answer in their own words
 Susceptible to interviewer bias
 Difficult to code
 Particularly suited for more exploratory oriented research
 Often-used open-questions
 Simple questions like “what is your occupation”
 Possibility for the respondent to make some remark on the topic (usually at the very end)
 Structured questions: Question in which the set of response alternatives and response format are
pre-specified. Structured questions may be
 Multiple choice
 Dichotomous (only two response alternatives)
 Comparative
 Non-comparative (= itemized rating scales lecture III)
 Bad/good choice: data requirements for your analysis (statistical techniques involve data
requirements in terms of nominal, ordinal, interval, and ratio)
 Data requirements for answering your research questions
 Within the data requirements, the effort placed on the respondent should be as minimal as possible.
 Likert scales with 5 to 9 points are most commonly used
 Scale points should be logical, meaningful and mutually exclusive.
 For sensitive issues, like income, ordinal scales are preferred to ratio scales
o ORDER OF QUESTIONS: flowerpot approach

ORDER OF QUESTIONS-ALTERNATIVE APPROACH

ORDER OF QUESTIONS-BRANCHING QUESTIONS

 Branching question: “if yes, go to question 4; if no, go to question 9”


 Minimize in paper-and-pencil self-administered questionnaire
 If a lot of branching questions are unavoidable opt for another survey approach
 Always make sure that the questions are ordered in a logical manner
o THE FINAL STAGES
 The entire package = questionnaire + letter + explanation
 Everything should look great (font-type, colors etc.)
 Pre-test the questionnaire (and the remainder of survey package) using a small sample and ask
for comments (e.g., unclarities, difficulties)
 Make adjustments if necessary
 If the pre-tested version is judged adequate, you arrive at the final questionnaire

Wrap up:

- Surveying is probably the most used marketing research technique


- Various survey methods exist, each with their pros and cons.
- Which survey method to use?  Matter of fit between research situation and survey method
characteristics
- KEY PRINCIPLES IN DESIGNING A SURVEY
o Effort
o Structure
o Remember to cover all relevant pieces of info
o Reliability and validity
o Keep it simple for the respondents

SAMPLING AND HYPOTHESIS TESTING


1. Introduction
- Sampling: choosing a sample that will allow you to generalize your findings to a population
- Sampling is part of step 4 in the marketing research process (Data collection), and it precedes the
actual data collection.
- Hypothesis testing is part of step 5 (Data analysis). You can only do it, of course, after you have
collected and described your data.
2. Sampling
- Some basic terminology:
o The target population (this is the set of individuals you would like to make a statement about).
For example: ‘male adults between the ages of 20 and 25’; Belgian citizens; luxury consumers;
etc. The customer insight you develop in your study is always about this group of individuals.
o The elements in your target population, are the individuals that fit with your description.
o The sample is the part of the population that you will investigate in your study. You may, for
example, have a sample containing 25 elements from your total population. Sample size in that
case is: 25
o If you use a census sampling strategy, all elements of the population must be included in your
sample. Sample = population.
- The idea behind sampling is that it is not necessary to investigate every single member of the
population to know how the population behaves. If you select a representative sample, and
investigate that sample, you may infer features of the population based on the sample.
- The idea is to generalize findings about your (representative!) sample to the population.
- Overview sampling

o All sampling starts with a definition of the target population: I am going to develop insight in
the attitude, behavior of... (Belgian citizens; girls between the ages of 10-15; male adults; Ford
drivers; etc.)
o Then you need to either describe the population, or list all of its members, which helps you to
determine the ‘sampling frame’.
o Then you choose how you will select your sample from the population.
o You need to determine how many elements your sample needs to contain (sample size).
o Then you collect your data from the sample.
2.1.Defining target population
- A good definition of the target population should contain information about
o The sampling elements (person)
o The sampling units (individual, couple, group)
o The area of coverage (extent, geographical, period, etc.)
o Time (when)
2.2.Sampling frame
- Representation of the elements of the target population
o List
o Set of directions (description)
- Sampling frame error: when the sampling frame contains more or less of a specific type of
individual than its proportion in the target population (more males, more females, more elderly
people, more Ford drivers etc.)
o Ignore  We can choose to ignore the problem (if we think it is not affecting the results)
o Screen  We can also screen the sample (post hoc determination of the sample composition:
report frequencies)
o Correct  Sometimes we can also correct it (delete the disproportionate parts – too large or too
small - so that we end up with a perfectly representative sample)
 The sampling frame allows you to select your sample.
 This can be a listing of individual elements representative of the target population, or a description.
2.3.Sampling techniques

- Sampling strategies can be divided into two major techniques: probability and non-probability
sampling.
o Non-probability sampling means that elements are selected on the basis of convenience or the
researcher’s judgment, so the probability of any given element being included is unknown (e.g., I only
use a list of students from Hasselt to sample university students).
o Probability sampling means that each element has a known, nonzero chance of being included in
the sample (I use a list of all students to sample from).
- Probability sampling is generally considered to lead to superior results, but non-probability
sampling is more common (for reasons of efficiency):
o Convenience sampling: I select from elements that are in my neighborhood, that I can reach
easily.
o Judgmental sampling: I pick the elements that I judge most appropriate.
o Quota: I only use a limited number (or percentage), and stop when I have reached that
number (2 stages: the first stage consists of developing control categories or quotas of
population elements, the second stage selects sample elements by convenience or
judgement).
o Snowball sampling: I ask respondents to invite their friends to respond to my survey
- PROBABILITY SAMPLING
o Objective is to select a representative, unbiased sample
o Sampling procedure wherein each element has a known, fixed, (but not necessarily equal)
probabilistic, chance of being selected for the sample.
o Researcher specifies an objective procedure; selection of elements is independent of
researcher
 Precise definition of population
 Precise definition of sampling frame
o Simple random sampling: I ‘pick without watching’. As you can see, this may lead to a
misrepresentation of one element
o COMPARING PROBABILITY SAMPLING TECHNIQUES
 Probability sampling techniques vary in terms of sampling efficiency (tradeoff between
precision and costs).
 Optimize sampling efficiency; maximize level of precision subject to the budget
constraints
 Efficiency of probability sampling techniques is assessed by comparing it with that of
Simple Random Sampling (SRS)
o Systematic sampling is not simple random sampling, but sampling according to a specific rule, e.g.,
selecting every i-th element after a random start (in the illustration: I select two columns, and all
members of a column).
o Stratified sampling is an attempt to make the sample random and representative. Make sure that
all categories are included in the ’right proportion’ (but members from each category have the
same chance to be included).
 This happens when we really want to make reliable predictions about a complex
population.
 Increasing efficiency by increasing precision
 Two stages
 Mutually exclusive and collectively exhaustive strata
 From each stratum, respondents by SRS
 The number of strata should not be larger than 6 (increasing precision comes with increased cost)
 Ensure homogeneity between elements within one stratum and heterogeneity between elements
of different strata; stratification variables should be closely related to the characteristics of interest
and easy to measure (to keep costs down)
o Cluster sampling is probabilistic, but more efficient than random sampling.
 Increasing efficiency by decreasing cost
 Stages:
 Mutually exclusive and collectively exhaustive clusters (each cluster contains diversity
of participants in the target population)
 Randomly select clusters
 From selected clusters, sample all elements (one-stage cluster sampling) or randomly
sample subset of elements (two-stage cluster)
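
A small sketch of proportionate stratified sampling (SRS within each stratum) using pandas; the population, strata, and sizes are made up for illustration.

```python
# Stratified sampling: draw a simple random sample within every stratum,
# proportional to the stratum's share of the (made-up) population.
import pandas as pd

population = pd.DataFrame({
    "id": range(1000),
    "stratum": ["student"] * 600 + ["employee"] * 300 + ["retired"] * 100,
})

sample = population.groupby("stratum").sample(frac=0.1, random_state=1)  # 10% from each stratum
print(sample["stratum"].value_counts())  # 60 students, 30 employees, 10 retired
```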
2.4.Sampling size
- SAMPLING SIZE DETERMINATION-THEORETICALLY
 We can theoretically determine the necessary size of a sample by the following formula.
However, in practice this doesn’t work, because we generally do not know the variance in the
population (we first need to establish it in a random sample) and it works only in the case of
Simple Random Sampling.
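
The formula itself is not reproduced in these notes; the standard simple random sampling version, which is presumably what is meant, determines the sample size n from the chosen confidence level and the acceptable margin of error:

```latex
n = \frac{z^{2}\,\sigma^{2}}{e^{2}}
```

where z is the z-value of the chosen confidence level (1.96 at 95%), σ the (usually unknown) population standard deviation, and e the acceptable margin of error.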
- Qualitative factors should be taken into account when deciding sample size:
o The importance of decision
o The nature of research
o The number of variables
o The nature of analysis
o Sample sizes of similar researches
o Incidence rate (occurrences of behavior or characteristics in population)
o Completion rate
o Resource constraints.
- SAMPLING SIZE DETERMINATION-PRACTICALLY
o The formula looks very handy, but
 Only applies to SRS
 Population parameter (σ) not readily available.
 Conduct a pilot study
 Use secondary data (e.g., previous studies on the topic)
 Judgment of an expert
 Are based on normal distributions
o Often rules of thumb or experience is used to determine sample size.
 for regression: 20 observations per variable
 for an ANOVA or t-test: 20 observations per group
o For the same reliability as simple random sampling, the required sample size is the same for
systematic sampling, smaller for stratified sampling, and larger for cluster sampling.
2.5.Handling non-response bias
- The more the people-who-do-not-respond differ from the people-who-respond the bigger the
problem
o Non response bias (the fact that not every invited respondent fills out your questionnaire) can
be a problem. Especially when specific groups of respondents refuse to answer. For example
‘highly frustrated customers’ refuse to fill in a satisfaction survey: you get a very positive bias.
o Or when lazy respondents do not respond... etc. So, in general we do our best to make sure
that as many respondents from as many as possible categories reply.
- What you can do to increase response rates:
o Incentives
o Reminder
o Designing a motivating research instrument
- Assessing the extent of the non-response problem
o Compare profile of respondents and non-respondents
o Contact the non-respondents in a different way
o Use the people who responded late as a proxy for the non-respondents
 Often we try to compare the respondents to non-respondents. How: well, we know that late
respondents resemble non-respondents, so we compare the last 10% of respondents with the
first 10% (t-test, compare the means: if you find significant difference, there is a problem...).
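
A sketch of that late-vs-early comparison with made-up satisfaction scores; scipy's independent-samples t-test stands in for whatever software the course uses (e.g., SPSS).

```python
# Compare the earliest 10% of respondents with the latest 10% (proxy for non-respondents).
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
first_10pct = rng.normal(4.1, 0.6, 30)   # satisfaction scores of the earliest respondents (simulated)
last_10pct = rng.normal(3.9, 0.6, 30)    # latest respondents, used as a proxy for non-respondents

t, p = stats.ttest_ind(first_10pct, last_10pct)
print(f"t = {t:.2f}, p = {p:.3f}")  # a significant difference signals a potential non-response problem
```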
3. Hypothesis testing
- The idea behind hypothesis testing is simple. You have an idea about a population and you test
whether the data you collected from a sample of that population statistically supports that idea.

- Overview hypothesis testing procedure

3.1.Formulate hypothesis
- RESEARCH QUESTION -> Content-based hypotheses -> Statistical hypotheses
 A hypothesis is the expected answer to a research question → look back at your research questions, and
provide the ‘answer’ that is most logical, and for which you have good arguments.
For example: RQ: What is the effect of quality on satisfaction? Hypothesis: There is a positive
relationship between quality and satisfaction → then translate it into a statistically testable
formulation: there is a positive correlation between quality and satisfaction.
WHAT ARE YOUR HYPOTHESES?

- Formulate null and alternative hypotheses:


o A good set of hypotheses is:
 Stated in a declarative form (i.e., not a question)
 Posits a relationship between variables
 Reflects an underlying research problem/theory
 Brief and to the point
 Testable
o Set of hypotheses: H(0) + H(a)
 We generally test the null-hypothesis (the hypothesis stating that there is no effect) and not the
alternative hypothesis (the hypothesis stating that there is an effect).
o THE ALTERNATIVE HYPOTHESIS: H(a)
 The alternative hypothesis assumes an effect/inequality
o THE NULL HYPOTHESIS: H(0)
 For every H(a) there is an H(0)
 Assumes status quo / no effect
 Should contain a statement that things are equal and/or unrelated to each other
 H(0) and H(a) are opposites, but together they cover all possible outcomes
 Statistically seen, this is the hypothesis that you test
- ONE-SIDED AND TWO-SIDED HYPOTHESES
o Under the assumption that you use a test statistic with a symmetric distribution, you have
three possibilities
o Two sided hypotheses
 H0: no difference
 H1: difference
o One sided hypotheses (RSH)
 H0: equal to or smaller than (=<)
 H1: larger than
o One sided hypotheses (LSH)
 H0: equal to or larger than (>=)
 H1: smaller than
 The simplest (symmetrically distributed) test statistics (e.g., the often used t-statistic) can
decide about: there is a difference between values, or there is no difference between values.
 We can also use single sided hypotheses: (larger than, or smaller than)

 These are the distributions of test statistics we use a lot: T-test (normally distributed), Chi-
square test, F-test.
3.2.Select appropriate statistical test
Choose the right STATISTICAL TEST

- The test you choose depends on the purpose you use it for:
o differences in value (t-test, ANOVA, etc.)
o differences in frequency, fit (Chi-square)
o differences in effect size (regression), etc.
 Keep in mind that your data need to fit the test-statistic (or the other way round, of course)
- Regardless of the statistical test used, the remainder of the hypothesis testing
procedure stays the same.
- The value for the test statistic is based on your sample data
- This value will be compared to some theoretical standard (distribution)
3.3.Choose level of significance
- Usually, we use α = 0.05
- It represents the probability of committing a Type I error
- Type I error: rejecting H0 when it is actually true
- Confidence level is 1 - α: the probability that you fail to reject H0 when you indeed should not
reject it (i.e. H0 is true)
- Thus, the significance level represents the degree of risk that you are willing to take in rejecting H0
when it is actually true
 Choose a level of significance (before you do any test!!!). This represents the probability of
making an error that you will accept. It depends on how certain you need to be about the
acceptance or rejection of your hypothesis.
- BECAUSE THERE IS ALSO THE PROBABILITY OF MAKING A TYPE II ERROR

 Sometimes it is worse to accept an untrue hypothesis than to reject a true one. In that case we
need to decide conscientiously about the acceptable error levels
We do not want to send an innocent person to jail!
We do not want to set a guilty person free.
 Although beta is unknown, it is related to alpha. An extremely low value of alpha (e.g. 0.001)
will result in intolerably high beta errors  So it is necessary to balance the two types of
errors.  As a compromise, α is often set at 0.05; sometimes at 0.01
3.4.Calculate statistic and its p-value
- Hypothesis testing involves comparing our data to some known or given standard.
- This standard is typically a so-called distribution (t-distribution, F-distribution etc.)
- To achieve this, the data needs to be transformed into a value that is related to this distribution
(data analysis)
o Conducting a t-test -> gives you a sample t-value
o Conducting an F-test -> gives you a sample F-value
o And many more possibilities
- SPSS does all the calculations, but it is not ‘idiotproof’!
o So, based on the appropriate test I need to compare my sample value of the test statistic to
the known distribution
 And it will tell you how 'far' your sample value of the test statistic lies from the value expected under the null hypothesis.

EXECUTE THE STATISTICAL TEST


- To test a hypothesis you need to know the critical value of the relevant distribution
- So from the t-distribution, F-distribution, Chi-square distribution, etc.
o The critical value is the value of the test statistic that corresponds to the 5% (or 10%) error
probability.
o For example, in a t-test the critical value (two-sided hypothesis, 5%) of t is 1.96. If the value
for your sample is lower than this critical value (p > 5%), you do not reject H0 (the null hypothesis).
o If it is higher (p < 5%), Ha (the alternative hypothesis) is accepted, and there is a significant
difference between the two tested values (e.g., the satisfaction mean of group A and the
satisfaction mean of group B).
3.5.Testing null hypothesis
- Most statistical tests provide a lot of output
- In terms of testing hypotheses one value is essential: the p-value  The p-value indicates whether
the test statistic is significantly higher or lower than the reference value; whether it counts as
significant depends on how you chose alpha (0.05, 0.10, 0.001, etc.)
- Compare the p-value of your test statistic with the significance level
- Based on the outcome of this comparison
o Reject Ho
o Do not reject H0
- The appropriate p-value depends on whether you have one-sided or two-sided tests
- One-sided/two-sided testing is only relevant for test-statistics based on a symmetric distribution.
- The p-value says something about the credibility of your null hypothesis!
- Significance level:
o Standard 5%
o Means that there is (at most) a 5% chance of rejecting the null hypothesis when it is actually true
o Exceptionally 10%, for example when it is really difficult to obtain a large enough sample
(your population is CEOs of Cola producers).
- Statistical software packages only provide 2-sided p-values.
- To obtain the one-sided p-value
o p/2 (where p = the two sided p-value)
o 1 – (p/2) (where p = the two sided p-value)
- Which one-sided p-value is appropriate depends on the direction assumed under the hypotheses and
the value of the sample statistic
o If the sample statistic points in the direction of H1, then use p/2
o If it conflicts with H1 (points in the direction of H0), then use 1 - (p/2)
 Exactly: p<0,05 (and alpha = 0,05) then we reject the Ho. Your Ha is accepted (supported by the
data!).
 EXAMPLE: THE APPROPRIATE ONE-SIDED P-VALUE:

Suppose we want to test


H0: Mean =< 4
H1: Mean > 4
SPSS shows the following results: Mean = 3.46; t = 2.202 (p = 0.029)
IN PRACTICE:
Act as if you are testing a two-sided hypothesis
H(0): equality
H(1): no equality
If we reject H(0), we subsequently interpret it as if it were a one-sided test
H0: Mean equal to 4
H1: Mean not equal to 4
SPSS shows the following results: Mean = 3.46; t = 2.202 (p = 0.029) -> Reject H0  We find that the
buying intentions are significantly different from the scale midpoint. In fact, we find that the buying
intentions are smaller than 4, namely 3,46
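 A tiny Python sketch of the p/2 versus 1 - (p/2) rule, using the numbers from the example above (illustration only, not SPSS output):

    # H0: mean <= 4, H1: mean > 4 (right-sided test)
    p_two_sided = 0.029
    sample_mean, test_value = 3.46, 4

    if sample_mean > test_value:            # sample points in the direction of H1
        p_one_sided = p_two_sided / 2
    else:                                   # sample points in the direction of H0
        p_one_sided = 1 - p_two_sided / 2
    print(p_one_sided)                      # 0.9855 -> do not reject H0: mean <= 4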

Wrap up
- Sampling
o From whom do you get the needed information
o Sampling is important, as a bad sample yields bad population estimates (especially in the case of
quantitative research)
- Hypothesis testing
o Putting your thoughts to the test – what do the data say about them?
o Hypothesis testing is done using statistical tests
o Regardless of the statistical test used, the procedure stays the same: comparing the p-value of
the sample statistic to the alpha level
- Measures of central tendency, depending on the measurement level:
o Nominal: mode
o Ordinal: mode, median
o Interval, ratio: mode, median, mean.

DATA ANALYSIS FOR MARKETING RESEARCH


1. Introduction
- Data analysis…
o Means statistics
o Is important for decision-makers
o Is useful for decision-makers
o Is a means to an end
 In Data Analysis we use several statistical techniques, most of which are available in Excel, SPSS,
Stata, etc. These packages differ somewhat in the user interface, but the results are very
similar.
- Data analysis or number crunching
o Data can be virtually anything
o Typical marketing data: attitudes, preferences, intentions, sales, income,
consumer lifestyle values, demographics
o For the data analysis techniques discussed in this course, the only requirement
is that you can put a number on it
o Data means variables that have been measured (through your questionnaire,
or other types of observations). In a table, these are the columns.
o Data means observations (respondents answering the questionnaire). In a table,
these are the rows.
2. Data preparation
- Data often needs to be prepared for the analysis
- Furthermore, data sometimes needs to be cleaned: unusable observations need to be
removed, and missing data (missing responses) may need to be filled in with a substitute
(the mean of all other observations, etc.).
- the different ways in which the data set should be prepared for the analysis:
o check
o correct
o clean
o adjust
o codebook
- coding:
o Assigning a code to represent a specific response to a specific questions (Label,
Values, Missing values)
o Codebook (“variable view”) in SPSS
o In SPSS in 'Variable View' mode, you can see (and adjust) a number of
characteristics of the variables.
o There is information about the variable: its Name, the type, the width, the
number of decimals, Label (the item on your scale), and what SPSS should do
with missing values and how they are coded
3. Data analysis strategy
- Once your data has been prepared (I always visually screen the data, so that it becomes
easy to see if there is something wrong with an observation) you must decide which
types of analyses you will execute, and in which order
- The appropriate data analysis strategy
o Is about finding solutions
o The research problem
o Data properties
o Statistical technique properties
 this depends on what problem you are trying to solve. So keep the research questions in mind
when you design a data analysis strategy
- Statistical methods:
o Descriptive techniques
o Univariate techniques (single variable)
o Multivariate dependence techniques (multiple variables)
o Multivariate interdependence techniques (multiple variables, structural
equation modelling)
4. Descriptive techniques
- In addition to describing the data structure and the sample, we also need descriptive statistics to
establish whether the data are fit for the most frequent types of analysis
- function:
o Describe basic features of the data
o Summary of sample and measurement characteristics
- Distribution:
o Every variable has a distribution
o Frequencies
o Always possible
- Describe distribution:
o Central tendency
o Measures of dispersion
- FREQUENCY DISTRIBUTIONS
o Objective: obtain count of number of responses associated with different values
of the variable
o Counts, percentages, and cumulative percentages
o Table and graphical format possible (normal curve)
o OTHER REASONS FREQUENCY DISTRIBUTIONS ARE USEFUL: you may notice a strange
problem, and you also see how many people did not fill in a question.
o SPSS: Analyze > Descriptive statistics > Frequencies (enter all the variables you
wish to include in the analysis in the right window… Then indicate which tests
you want to do, and how you wish to receive the output (visual, text etc.))
- EVERYTHING HAS A DISTRIBUTION
Distributions can be summarized in a couple of numbers:
o Measures of central tendency
o Measures of dispersion
 Important characteristic of each variable: its distribution (of values, mostly around the mean)
 Normal distribution (the Bell Curve), which (in the case of most variables) is approximately reached
once you have 50 observations or more. If you have fewer, the distribution is usually not 'normal'
and in that case you cannot use the variable for most (parametric) statistics
 WHICH MEASURES APPLY DEPENDS ON THE DATA PROPERTIES
- MEASURES OF CENTRAL TENDENCY:

 Thus if you have:


o Metric data: mean, median, mode
o Ordinal data: median, mode
o Nominal data: mode
- MEASURES OF DISPERSION

 Thus, if you have


o Metric data: variance, standard deviation, (interquartile) range
o Ordinal data: (interquartile) range
o Nominal data: measures of variability do not apply
 WHAT DO MEASURES OF VARIABILITY TELL YOU?
o In isolation not much
o Considered relatively, they indicate the relative “(dis)agreement” in terms of data.
 The lower the measure of variability, the more “agreement”
 The higher the measure of variability, the more “disagreement”
a variable with no variance cannot be used in any multivariate analysis, because it will never
show any correlation with other variables
- DESCRIPTIVE STATISTICS IN SPSS- 2 OPTIONS
o OPTION 1
Analyze > Descriptive Statistics > Frequencies (dialog box “statistics”)
o OPTION 2
Analyze > Descriptive Statistics > Descriptives (dialog box “options”)
NOTE:
o Under option 1 you can request the "range"; under option 2 it needs to be calculated by hand
o The interquartile range needs to be calculated by hand under both options
5. Univariate techniques
- THE RIGHT TEST AT THE RIGHT TIME AND PLACE

 Which tests we can conduct depends on the type of data, and the samples (groups) we want to
compare.
 If the data are non-metric, the number of tests is very limited.
 When they are metric, it depends on the number of groups we want to compare
- NATURE OF THE TEST VARIABLE: The test variable is the variable we investigate
o Variable on which you want to draw conclusions
o Variable on which you compare (the different) sample(s)
o When your test variable is:
 Metric (interval and ratio) -> parametric tests
 Nonmetric (nominal and ordinal) -> nonparametric tests
- ASSUMPTIONS PARAMETRIC TESTS:

Before we can use the parametric tests we need to check a few things:

o Is our test variable metric?


o Is it 'normally distributed' (does it have a symmetric distribution around the mean, in the
form of a bell)? You may want to inspect the distribution of values visually, but SPSS can also
do a pretty simple 'normality check' via the descriptives (test the kurtosis and skewness of the
variable: if the absolute values of both are below 3, the variable may be considered 'normally
distributed').
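 A quick Python (SciPy) equivalent of this rule-of-thumb check; the |value| < 3 cut-off is the one used in these notes, and the data are simulated:

    import numpy as np
    from scipy import stats

    x = np.random.default_rng(1).normal(loc=5, scale=1, size=100)  # example test variable

    skewness = stats.skew(x)
    excess_kurtosis = stats.kurtosis(x)    # Fisher definition: 0 for a perfect normal
    normal_enough = abs(skewness) < 3 and abs(excess_kurtosis) < 3
    print(skewness, excess_kurtosis, normal_enough)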
- NUMBER OF SAMPLES INVOLVED
o Samples refers to the groups you want to compare
o Your data may consist of many samples (often differs per analysis)
o The variable that helps you identify the relevant groups is the grouping variable
o In selecting a statistical test it is relevant to distinguish between
 One sample
 Two or more samples (2,....,k samples)
 ONE DATA SET, DIFFERENT NUMBERS OF SAMPLES  Number of groups (or samples) all
depends on how you ‘group’ your subjects.
- IN CASE OF TWO SAMPLES OR MORE:
o Unrelated or independent samples
 The answers given in one sample are not influenced by the answers in
the other sample
o Related samples
 The answers in one sample are influenced by the other sample-usually,
the same respondent is assessed twice
 Measuring the same people at different time points
 Comparing the same people on multiple variables (e.g. brand
preferences)
- metric univariate tests:
o The one sample t-test compares the mean in a sample with a fixed number. (the
mean grade is higher than X, lower than X)
o The paired samples t-test compares values between related samples. (wine A is
generally liked better than wine B, by the same person)
o The independent samples t-test (or one-way ANOVA) compares means
between two independent groups (e.g., on average, men live shorter lives than
women)
 SOME COMMUNALITIES OF THESE TESTS:
o They all deal with mean-differences
o The underlying principle is the same
GENERAL PROCEDURE UNIVARIATE STATISTICAL TEST

5.1.One-sample t test:
- Compare whether the sample mean is different from some test value.

- H0: μ = μ0 versus Ha: μ ≠ μ0
- μ0 is a predetermined number; it is the test value to which you want to compare your sample
mean (the sample estimate of μ).
- ANALYZE > COMPARE MEANS > ONE SAMPLE T-TEST
o TEST VARIABLE: the variable you wish to investigate
o TEST Value: the value you wish to compare with
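 The same test outside SPSS, as a minimal SciPy sketch (the grades and the test value are made up):

    from scipy import stats

    grades = [6.5, 7.0, 5.5, 8.0, 6.0, 7.5, 6.5, 7.0]   # test variable
    test_value = 6.0                                     # value to compare the mean with

    t_stat, p_value = stats.ttest_1samp(grades, popmean=test_value)
    print(t_stat, p_value)    # reject H0: mean = 6.0 if p_value < 0.05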
5.2.Independent samples t-test
- Compare whether two independent groups have a different mean score on a test
variable
- Samples are independent when the scores on the test variable in one sample are not
influenced by (“independent of”) the scores on the test variable in another sample

- H0: μ1 = μ2 versus Ha: μ1 ≠ μ2
- μ1 is the mean of group 1 and μ2 is the mean of group 2
- ANALYZE > COMPARE MEANS > INDEPENDENT SAMPLES T-TEST
o Enter the test variable (the variable of which you want to compare the means
between the groups (exam scores, IQ, etc.).
o Define the groups with a new variable: E.g., GENDER (0,1)  group variables
- SELECTING THE APPROPRIATE IND. SAMPLES T-TEST
o How the appropriate independent samples t-test is calculated depends on
whether or not the variance in the two samples is equal.
o Different formulas apply
o Again, SPSS computes both and leaves the decision which t-value is the
appropriate one to you
o To choose the appropriate t-value use Levene's test
o Hypotheses Levene's test (F-test):
 H0: the variances in the two groups are equal
 H1: the variances in the two groups are not equal
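 A minimal SciPy sketch of the whole procedure, Levene's test first and then the appropriate t-test (made-up scores for two independent groups):

    from scipy import stats

    group_men   = [4.2, 5.1, 3.8, 4.9, 5.3, 4.4]
    group_women = [5.6, 6.1, 5.9, 6.4, 5.2, 6.0]

    # Levene's test: H0 = equal variances
    lev_stat, lev_p = stats.levene(group_men, group_women)
    equal_var = lev_p >= 0.05                      # keep H0 of equal variances?

    # Independent samples t-test, using the appropriate variant
    t_stat, p_value = stats.ttest_ind(group_men, group_women, equal_var=equal_var)
    print(lev_p, t_stat, p_value)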
5.3.Paired samples t-test


- Compare the mean score for two related/dependent samples
- Dependent samples means that we analyze two measurements from each subject.
- These two measurements can be:
o Through time (e.g., height of child j in 1999 and height of child j in 2005)
o Two different variables (e.g., person j´s perceived image of Brand A and person
j’s perceived image of Brand B)

- H0: μD = 0 versus Ha: μD ≠ 0, where μD is the mean of the differences between the paired observations
o Indicates the difference between two related sample means. Other forms are
possible (e.g., μ0 – μ1 = 0).
- ANALYZE > COMPARE MEANS > PAIRED SAMPLES T-TEST
o Enter the two test variables (there are always two!) that you wish to compare
(e.g., wine A score, wine B score)
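 A minimal SciPy sketch (made-up wine scores; each position in the two lists is the same respondent rating both wines):

    from scipy import stats

    wine_a = [7, 6, 8, 5, 7, 6, 8, 7]   # score for wine A per respondent
    wine_b = [6, 6, 7, 4, 6, 5, 7, 6]   # score for wine B, same respondents

    t_stat, p_value = stats.ttest_rel(wine_a, wine_b)
    print(t_stat, p_value)    # reject H0 of equal means if p_value < 0.05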
5.4.One-way ANOVA
- Compare whether the sample means differ between two or more independent samples
o Dependent variable: metric variable
o Independent variable: must be a categorical variable (non-metric); if you include both
categorical and metric variables  ANCOVA (analysis of covariance)
o If all independent variables are metric  regression.
- In case of:
o two unrelated samples it does not matter whether you use an independent
samples t-test or a one-way ANOVA
o More than two unrelated samples you must use a one-way ANOVA (a series of
independent t-tests may yield erroneous results)  can compare many
different samples

- H0: μ1 = μ2 = … = μk versus Ha: at least one group mean differs
o This is the only appropriate form of hypotheses for one-way ANOVA
- ANALYZE > COMPARE MEANS > ONE-WAY ANOVA
o Enter the test variable(s) (you can do ANOVA between the same groups on
several variables: preference, attitude, satisfaction, etc.)
o Define and enter the grouping variable (e.g., Nationality (0,1,2,3))
o Tick the post-hoc tests in the options box (always tick all of them!)
- POST HOC COMPARISON TESTS
o If the null hypothesis is rejected, we conclude that at least one group is different.
o But which group? Post hoc comparisons are used to determine which groups differ
from each other.
o They have the following format:

o If you open the “post hoc” dialogue box in the ANOVA menu you see that there
are a lot of different post hoc tests
 Which one to pick?
 Homogeneity of variance test
H0: all groups have equal variances
H1: groups do not have equal variances (at least one differs)
 Equal variances: Tukey
 Unequal variances: Dunnett’s C
 To avoid double work always click the homogeneity of variance test (“options”) and both post
hoc tests
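 Outside SPSS, a compact SciPy sketch of the same logic (made-up data; scipy.stats.tukey_hsd needs SciPy 1.8 or newer and assumes equal variances, like Tukey; I am not aware of a direct SciPy equivalent of Dunnett's C):

    from scipy import stats

    group_a = [5.1, 6.0, 5.5, 6.2, 5.8]
    group_b = [6.5, 7.1, 6.8, 7.4, 6.9]
    group_c = [5.0, 5.4, 5.2, 5.9, 5.6]

    # Homogeneity of variances (Levene), then the omnibus F-test
    print(stats.levene(group_a, group_b, group_c))
    f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
    print(f_stat, p_value)

    # Post hoc comparisons: which groups differ from each other?
    if p_value < 0.05:
        print(stats.tukey_hsd(group_a, group_b, group_c))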
6. Translation into statistical hypotheses

-
- Always translate your hypothesis (the answer to your research question) into a
statistical hypothesis (one that can be tested!)
7. Wrap up
- Data analysis
o helps you to make sense out of your data
o formulates answers to your research questions
- Data analysis techniques
o Can be viewed as necessary evil, but are crucial
o And will continue to gain importance
o Are numerous and the trick is to know
 What techniques are available
 When to use what technique

DATA ANALYSIS FOR MARKETING RESEARCH II


1. Introduction

-
- in the domain of ‘dependence’ and ‘interdependence’ between variables, there are two
techniques:
o Correlation is bi-directional, and therefore called an ‘interdependence’
technique (Variable A ‘varies along’ with variable B and variable B ‘varies along’
with variable A)
o Regression is directional (For example Variable A (independent variable, e.g.
Satisfaction) has a positive effect on Variable B (dependent variable, e.g.,
Loyalty), but not the other way around)
2. Pearson product moment correlation
- ASSUMPTIONS / WHEN TO USE
o Two metric variables  If you do not have metric variables, you may need
different techniques (Chi-square, etc.)
o Normally distributed variables
 SPSS offers options for non-normally distributed variables, but Pearson
Correlation is pretty ‘robust’ (even if your variables are not ‘perfectly
normally distributed’ you will still get a reasonable estimate)
 By definition, a Pearson product moment correlation takes into account the relation between
only two variables (i.e., it ignores the impact of other possible factors)
- EXAMPLES OF HYPOTHESES
o How strongly are sales related to advertising expenditures?
o Is customer satisfaction positively related to customer loyalty?
o Is attitude towards a technology positively related to usage intentions of that
technology?
- STATISTICAL HYPOTHESES
o H0: Variables X and Y are not related (r=0)
o H1: Variables X and Y are related (r≠ 0)
 Note: the statistical hypotheses as formulated here are two-sided; they do not specify a direction for the effect.
- CHARACTERISTICS
o The Pearson product moment correlation (r) summarizes the strength of the
linear relationship between two variables.
 The correlation coefficient is ‘r’.
o r ranges from –1 to +1
 It varies between 1 (full positive linear relationship) and -1 (full,
negative linear relationship).
o r is a symmetric measure of association. This means that rxy = ryx
o r = 0 means that there is no linear relationship. It does not mean that X and Y are
not related. There might well be a nonlinear relation, which is not captured by
r. To examine this, use scatterplots.
 Scatterplots (one variable on X-axis and the other on the Y-axis: plotting
the relationships (pairs, (x,y)) are very useful to explore the nature of
the relationship between variables
o A PEARSON PRODUCT MOMENT CORRELATION OF 0? (r=0)

- Analyze>Correlate>Bivariate
o Pearson is selected by default
o Select the relevant variables
- SPSS OUTPUT CORRELATION ANALYSIS
o Provided are the correlation coefficient, the accompanying p-value, and the
sample size
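 A minimal SciPy sketch of the same analysis (made-up satisfaction and loyalty scores):

    from scipy import stats

    satisfaction = [5, 6, 7, 4, 6, 7, 5, 6, 7, 4]
    loyalty      = [4, 6, 7, 3, 5, 7, 4, 6, 6, 3]

    r, p_value = stats.pearsonr(satisfaction, loyalty)
    print(r, p_value)   # H0: r = 0; reject if p_value < 0.05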
3. REGRESSION ANALYSIS
3.1.Bivariate regression
- GENERAL BIVARIATE REGRESSION MODEL: Y = α + β·X + ε
(Y = dependent variable, X = independent variable, α = intercept, β = regression coefficient, ε = error term)
- Regression models have the same formula as a straight line. In this case the formula expresses a
linear relationship between two variables: the dependent (explained) variable and the explaining
variable (independent variable).
- The software will try to find a straight line through your observations.
- The straight line may not go through the origin (0,0), but it may hit the Y-axis at a positive or
negative value (this is called the Intercept).
- The regression coefficient says something about the strength of the effect of the independent
variable on the dependent variable. The standardized coefficient (Beta) is a value between -1 and +1.
It is visible in the slope of the line: Beta = 0 means a flat, horizontal line; Beta = 1 means a 45° line.

3.2. Multivariate regression


- GENERAL MULTIVARIATE REGRESSION MODEL: Y = α + β1·X1 + β2·X2 + … + βk·Xk + ε
 The same holds for a multivariate (multiple) regression. The only difference is that this
represents a (hyper)plane in a multidimensional space, with as many dimensions as variables.
Coordinates look like vectors! Calculations in SPSS take the form of matrix calculations
(matrix algebra).
- SOME MAIN ASSUMPTIONS
o Dependent variable is metric
o Independent variable(s) are metric or dummy-coded
o Model is correctly specified: form and variable(s)
o No exact linear relationship between Xi’s (only for multiple regression)
- The discussion on least squares estimation and model evaluation is the same for bivariate and
multivariate regression analysis! For the sake of clarity the least squares procedure is explained in a
bivariate context.
3.3.Regression analysis
- MY DATA POINTS (COME DIRECTLY FROM QUESTIONNAIRE)

 Look at the dots: every dot represents a data pair (x, y), where y is the value of the dependent
variable for a specific x value of the independent variable: every dot is one observation. For
example ‘my dot’ is: (7,6), if I have answered 7 on the satisfaction measurement and 6 on the
loyalty measurement.
- GRAPHICAL PRESENTATION OF REGRESSION PARAMETERS

- WHAT IS THE BEST SET OF PARAMETERS?


- THE OLS ESTIMATION PROCEDURE
o There are many estimation methods (i.e. ways the coefficients α and β’s are determined)
o We focus on Ordinary Least Squares OLS estimation  OLS: ordinary least squares = an
algorithm used to estimate the best line through the data points.
o The goal of OLS is to choose coefficients in such a way that it minimizes the squared distance
between the predicted Y value and the real Y-value (i.e. we minimize the error sum of squares
(ESS))
o OLS ensures optimal prediction accuracy
o ORDINARY LEAST SQUARES PROCEDURE

o Error = distance between the observation (dot) and the optimal line.
o You can see that the algorithm tries to globally minimize the error (difference
between the line and the dots)
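 As an illustration of what the software does, a tiny NumPy sketch that fits the least-squares line through made-up (satisfaction, loyalty) pairs and shows the error sum of squares that OLS minimizes:

    import numpy as np

    satisfaction = np.array([3, 4, 5, 6, 7, 7, 8])   # independent variable (X)
    loyalty      = np.array([2, 4, 4, 5, 6, 7, 7])   # dependent variable (Y)

    b, a = np.polyfit(satisfaction, loyalty, deg=1)  # slope and intercept of the OLS line
    predicted = a + b * satisfaction
    ess = np.sum((loyalty - predicted) ** 2)         # error sum of squares minimized by OLS
    print(a, b, ess)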
- THREE SOURCES OF VARIATION:

Total Variation = Regression Variation + Error Variation

 This shows in the ANOVA output of your regression in SPSS. (error = residual).
 The Anova says something about the ‘quality of your model’: is the straight line really a good
representation of the underlying pattern in the data.
3.4.Interpretation (2 stages)

-
 Only if the overall model performance (stage 1) is good (i.e., significant) we proceed with stage 2.
 The ANOVA represents the quality of your model (does your model ‘fit’ the data structure?) NB the
fit does need to be good (your Anova needs to be significant!) to be able to say something
meaningful about the relationships between the variables!
 The regression coefficients say something about the effects of each variable on the dependent
variable.
3.4.1. STAGE 1: OVERALL MODEL PERFORMANCE
- COEFFICIENT OF DETERMINATION (R2)
o Expresses the relative amount of variance (RSS) of the dependent variable that
is explained by the independent variable(s) (how much of the variance in your
dependent variable is explained by the variance in the independent variables)

 The higher the better? Yes, but it comes at a price


o R-square = 1 : all variance in the dependent variable is explained by the independent
variables. Wow! (this never happens..., but in Marketing we’re very happy with an r-
square of 0.7)
- R2: THE HIGHER, THE BETTER?
o The total sum of squares (TSS) present in Y remains constant
o The regression sum of squares (RSS) can be increased by adding more and more
independent variables.
o As a consequence, R2 increases (yippie, more variance explained!)
o Principle of parsimony (efficient modeling)
o Adjusted coefficient of determination (R2 adj)
o R2-adj makes a tradeoff between including additional variables and their
explanatory power (i.e. increase in RSS).
 Adding more and more independent variables (even unrelated ones) will increase the r-square.
 Therefore, always report the adjusted r-square, because it corrects for the number of
independent variables
- RELATIONSHIP R2-ADJ AND R2: R2-adj = 1 - (1 - R2)·(n - 1)/(n - k - 1),
with n the number of observations and k the number of independent variables
 WHAT IS A HIGH ENOUGH VALUE OF R2 OR R2-ADJ?  THE F-DISTRIBUTION APPLIES!


- HYPOTHESES OVERALL MODEL FIT
H0: R2 = 0 versus H1: R2 > 0
Or equivalently
H0: β1 = β2 = … = βk = 0 versus H1: at least one βi ≠ 0
 The overall model fit hypothesis (stating that the (adjusted) r-square is bigger than 0) should be
significant. Once we have established this (in the ANOVA reported in the regression output: the
F-value should be significant), we can start testing the model hypotheses.

3.4.2. STAGE 2: INTERPRETATION OF INDIVIDUAL COEFFICIENTS


- Significance of individual coefficients (t-test)
o SIGNIFICANCE INDIVIDUAL COEFFICIENTS
H0: βi = 0 versus H1: βi ≠ 0 (a t-test per coefficient)
 If the coefficient is significant, we can reject the null hypothesis (no effect)


- Relative importance of the coefficients (beta-weights) (individual coefficients)
o Two types of coefficients b’s (unstandardized) and β’s (standardized)
o Coefficient b denotes the change in Y when X changes with one unit.
o Coefficient β denote the relative importance of the X’s
o β’s range from -1 to 1; and they are directly comparable
o Higher β = Higher importance of X variable in explaining Y.
 In most Marketing research we only focus on the standardized Betas, because the absolute
relationship between the values of the variables has no meaning.
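 A compact Python (statsmodels) sketch of both stages; the variable names and data are made up, and the standardized Betas are obtained here by refitting the model on z-scored variables:

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    df = pd.DataFrame({"quality": rng.normal(5, 1, 100), "price": rng.normal(3, 1, 100)})
    df["satisfaction"] = 1 + 0.6 * df["quality"] - 0.3 * df["price"] + rng.normal(0, 1, 100)

    # Stage 1: overall model performance (F-test, R2, adjusted R2)
    X = sm.add_constant(df[["quality", "price"]])
    model = sm.OLS(df["satisfaction"], X).fit()
    print(model.fvalue, model.f_pvalue, model.rsquared, model.rsquared_adj)

    # Stage 2: individual coefficients (b's) and their p-values
    print(model.params, model.pvalues)

    # Standardized Betas: refit on z-scored variables
    z = (df - df.mean()) / df.std()
    beta_model = sm.OLS(z["satisfaction"], sm.add_constant(z[["quality", "price"]])).fit()
    print(beta_model.params)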
3.5.REGRESSION AND MULTIPLE ITEM VARS
- We need to decompose it into two parts:
o 1) The effect of the three independent variables on the intermediate variable
(Attitude)
o 2) The effect of the intermediate variable on the dependent variable (Intent to use).
- So, we get a ‘system’ of regression equations.
- In (standard versions of) SPSS it is not possible to simultaneously estimate the two
regressions, so we need to do it sequentially! First 1, then 2.
- In other software packages (and expensive modules you can buy for SPSS) it is possible!
(SmartPLS, Lisrel etc.)
- Analyze>Regression>Linear

 The model ‘fits’ your data


 And two of your independent variables have a significant effect on the dependent variable,
together explaining 30% of its variance (adjusted r-square).  only take significant
independent variables into account.
 The intercept (Constant) is also significant, so the line crosses the Y-axis at a value of 2,570.
3.6.Dummy variables
- Purpose: inclusion of nonmetric independent variables (e.g. gender; social class; age categories)
- Number of dummies needed: (number of categories – 1)
- INTERPRETATION DUMMY VARIABLES
o Suppose we want to examine whether age (<25; 25-55; 55+) and quality are related to
satisfaction.
o Coding scheme:

o We estimate the following equation:

 In some cases we would like to see if a categorical variable has an effect on a dependent
variable. To investigate this, we need to code dummy variables for the categories.
- In SPSS: TRANSFORM > RECODE INTO DIFFERENT VARIABLES
o Select the variable you want to recode into a dummy (or dummies)
o Specify the new variable (for example dummy01)
o Specify which values of the original variable (old values) will correspond with
the dummy (new values)
 R-square does not equal zero and F is significant  So the regression is valid.

 the dummies do not have a significant effect on the dependent variable...


 So it is only quality, and not age, that determines satisfaction.
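 For illustration, a pandas/statsmodels sketch of the dummy coding idea; the age-group labels and data are made up and simplified compared to the example above:

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(2)
    df = pd.DataFrame({
        "quality": rng.normal(5, 1, 90),
        "age_group": rng.choice(["young", "middle", "older"], size=90),
    })
    df["satisfaction"] = 2 + 0.5 * df["quality"] + rng.normal(0, 1, 90)

    # 3 categories -> 2 dummies (number of categories - 1); one category is the reference
    print(pd.get_dummies(df["age_group"], drop_first=True).head())

    # C() lets statsmodels create the dummies for you
    model = smf.ols("satisfaction ~ quality + C(age_group)", data=df).fit()
    print(model.params, model.pvalues)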

3.7.Multigroup analysis
- We assumed that the relationship between the dependent and independent variables
was equal for the entire sample
- BUT WHAT IF……..
We are interested in comparing this relationship across several groups?
 To address this question we need to do a group comparison (or a multigroup analysis) 
compare Betas between groups
- Comparing regression equation: 4 possibilities
o Option 1: same Beta, but different constant
o Option 2: same constant, different Betas
o Option 3: Betas are different, and Constants are different
o Option 4: there is no difference between the groups.
- TEST SEQUENCE

3.7.1. EQUALITY OF EQUATIONS (CHOW TEST)


- Estimate the following models
o Model on pooled data (all groups together)
o Model separately for each group
- Based on the statistical output perform a Chow test (by hand!)
- Hypotheses Chow test
o H0: Regression equations for the different groups are equal (coincide)
o H1: At least one equation is different
 Only when we can reject the H0 we proceed by testing for equality of slopes and intercepts!
- CHOW TEST (2 groups)

o F = [ (ESSp - (ESS1 + ESS2)) / K ] / [ (ESS1 + ESS2) / (N1 + N2 - 2K) ], with df = (K, N1 + N2 - 2K)
o ESS = error sum of squares of the different models (pooled and per group)
o K = number of parameters in the underlying model
o N = sample sizes of the different groups
 Can easily be extended to accommodate more than two groups
 You need to do the regressions for each group and the ‘pooled’ groups (full sample)
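 A Python sketch of this 'by hand' Chow test, using the error sums of squares of the pooled and the per-group regressions (model.ssr in statsmodels); the data and group variable are made up:

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf
    from scipy import stats

    rng = np.random.default_rng(3)
    df = pd.DataFrame({"x": rng.normal(0, 1, 120), "group": rng.integers(0, 2, 120)})
    df["y"] = 1 + 0.5 * df["x"] + 0.8 * df["group"] + rng.normal(0, 1, 120)

    pooled = smf.ols("y ~ x", data=df).fit()
    m1 = smf.ols("y ~ x", data=df[df["group"] == 0]).fit()
    m2 = smf.ols("y ~ x", data=df[df["group"] == 1]).fit()

    k = 2                                   # parameters per model: intercept + slope
    n1, n2 = m1.nobs, m2.nobs
    ess_sep = m1.ssr + m2.ssr               # ESS1 + ESS2
    chow_f = ((pooled.ssr - ess_sep) / k) / (ess_sep / (n1 + n2 - 2 * k))
    p_value = stats.f.sf(chow_f, k, n1 + n2 - 2 * k)
    print(chow_f, p_value)                  # H0: the two group regressions coincide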
3.7.2. Equality of intercepts
- In a regression context this is a fairly complicated test to perform
- As a proxy perform an independent samples mean-differences test (independent
sample t-test) on the outcome variable.
- Results independent samples t-test (groupvar = gender; testvariable = irritation ad)
t(67) = 10,385 (p < 0,001)
- We conclude that the intercepts are different
3.7.3. Equality of slopes (“Paralellism”)
- To be sure that Betas are the same (between groups), we need to examine this by
means of a t-test.
- MULTIGROUP HYPOTHESES-EQUALITY OF SLOPES
o Estimate the regression equation separately for each group (you did this already
for the Chow test)
o Subsequently, apply the test below to assess whether the coefficients are
significantly different

t = (b1 - b2) / sqrt(SEb1² + SEb2²)
 b1 and b2 are the regression coefficients you want to compare, of
respectively subsample 1 and 2
 SEb1 and SEb2 are the standard errors of the coefficients b1 and b2
 m is the size of subsample 1
 n is the size of subsample 2
 The degrees of freedom for the test is m+n-2
o For example:
H0: The relationship between age and irritation by ad does not differ
between men and women (bmen = bwomen)
H1: The relationship between age and irritation by ad differs between men
and women (bmen ≠bwomen)
You need the following output

- If you need to do analysis for a part of your sample, you can use this command in SPSS to ‘split’ the
file  Data > Split File
o Do not forget to turn the split file procedure off when you are finished with your analysis

WRAP UP:
- Suppose we have the following model:

-
- must do the multivariate regression, and NOT two correlations
- WHY? ONLY MULTIPLE REGRESSION IS VALID!
o Takes into account relationships among the different IV’s
o As such each coefficient reflect the unique contribution of each independent
variable on Y.
 THUS: If your model tells you that there are several variables “causing” an outcome (i.e.,
dependent variable); estimate a model that analyzes these multiple variables simultaneously.
- BIVARIATE REGRESSION AND CORRELATION COEFFICIENTS
o In case of a single dependent and independent variable, the Pearson product moment
correlation coefficient (r) equals the standardized regression coefficient (Beta).
o Even though correlation coefficients and standardized bivariate regression analysis provide
you with the same information, bivariate regression allows you to make predictions.

OBSERVATION
1. Introduction
- We are going to discuss the stages of research design (i.e., the answer to the question:
How should we set up the experiment?) and the collection of the data.
- Data analysis techniques that are used in most experiments have been discussed before:
for most experiments, we use t-tests, ANOVA, or MANOVA.
- In many so-called field experiments, we simply observe. The result of the analysis of the
data is a description of behavior (observation  descriptive design).
2. Observation
2.1.What is observation?
o Recording and quantification of behavioral patterns
o In a systematic way
o To obtain information about a phenomenon
o Without direct communication with subjects
 All experiments rely on observation. In so-called field experiments, researchers try to observe
the phenomena of interest as ‘flies on the wall’: The observed subjects do not notice that they
are being observed, so as to be able to observe their ‘natural behavior’.
2.2.Observation methods
There are several ways to collect and analyze observational data:
- Personal observation: the researcher him- or herself observes the object of interest. This can
take an active form (mystery shopper) or a more passive form.
o Mystery shopping
 Monitoring the competition
 Assessing employees` performance
o Shopping pattern studies
 Way through aisles
 Time spent at places
o Shopper behavior studies
 Effect of shelf height, positioning of products
 Reading of labels
- MECHANICAL (DIGITAL) OBSERVATION: high tech devices (Smart watches, Mobile phones, smart
cameras, microphones, etc.) are used, instead of personal observation, to ‘mechanically’ observe
the phenomena of interest
o Eye-tracking monitors
 Reading of ads
 Viewing of commercials
o Skin response
 Evaluation of commercials
 Package design
o Voice pitch analysis
 Analysis of emotional responses
 Product preferences
- Audit: use the digital traces that customers leave behind as data
o Pantry audit: possible because many new fridges are equipped
with detectors and the products carry RFID tags
o Loyalty cards
o Commercial scanning data
- Content analysis: is a way to try and understand the ‘meaning’ of the observed phenomena
o Skin color of actors in food commercials
o Use of words (and frequency) in ads
o Role of women in commercials
o Trending topics on twitter
- Trace analysis: is the explicit use of traces left by the consumer during their customer journey.
Digital and visual data can be combined to get a more complete picture.
o Finger prints
o Parking lots
o Internet cookies
o Location-based Services
2.3.Necessary conditions

Conditions for behaviors to be observed:


- Observable: Behavior of interest must be observable to be useful. If you cannot see, hear, feel or
otherwise observe it, it cannot be used in research
- Relevant: Behavior must be relevant to the phenomenon of interest. Otherwise it cannot be used to
study the phenomenon
- Setting
- Time
o it must be possible to observe: researchers need to have access to place, setting, and time
where the behavior takes place
 Is observation used in reality? Many companies use field observations to better understand
how consumers use their products and services. They use these observations to develop ideas
for improvement and innovation.
3. Wrap up:
- Advantages
o Accuracy
o Actual behavior
o Sometimes the only possibility
 In observations, you get what you ‘see’. It reflects ‘the reality’ of consumer behavior in a
natural environment.
- Disadvantages
o No insight in motives
 it does not necessarily help us understand the underlying ‘reasons’ of
observed behavior
o Lack of generalizability
 In general, we cannot base any conclusions (on field observations)
about why people behave the way they do.
 As a consequence, we also do not know if the observed behavior can be
generalized to the population or to other contexts.
 In marketing research we often need to know why consumers behave
the way they behave, and to which extent we can generalize.
o Not always possible

 If we simply wish to ‘describe’ or explore behavior we can use either ‘observation’ or surveying
(using a questionnaire)
 Surveying is used when it is difficult to get access to the ‘situation’ (place-wise, time-wise,
situation-wise) we want to observe.
 Surveying is cheaper…

CAUSAL RESEARCH: EXPERIMENTS


1. Introduction
- Experimentation is the best (purest) way to test causal relationships between variables
(for example between attitudes and behavior (-al intentions))
- In general, on the left side of the equation we have the explained variable (the dependent
variable), and on the right side the 'explanatory' (or independent) variables.
- WHAT CAUSED THE EFFECT?
o Use experimentation to isolate the effect of a strategic action
o Often, of course, the dependent variable is explained by many different things
(independent variables). If we want to establish the effect of a single (or a
specific combination of) variable(s), we need to isolate these effects, in an
experimental design.
2. What is experimentation?
- THE BASIC IDEA EXPERIMENT-DUMBED DOWN VERSION
o A researcher
o Selects multiple comparable groups
o One of the groups undergoes a treatment
o All groups are measured on some outcome of interest
o The difference in outcome is believed to be caused by the treatment
 An example of a ‘treatment’ could be: exposure to time pressure, or
uncertainty, or friendliness of a salesperson…
 An example of an outcome could be: purchase intention, attitude
towards a brand, attitude towards a channel
- THE BASIC IDEA OF AN EXPERIMENT-REAL LIFE EXAMPLE
o Look at this experiment that was executed by 'Rite Aid', a large pharmacy chain in
the US. They wanted to figure out if playing ads in the stores would affect sales
of the advertised product.
o They selected 20 similar stores. They played ads in ten of their stores and did
not play the ad in the other 10.
o And they counted sales of the advertised product in all 20 stores. Then they
compared sales in the 10 stores that were exposed to the ‘treatment’ with sales
in the 10 other stores.
o They tested the hypothesis: In-store radio advertisement causes an increase in
sales of the advertised products
o Comparison of the results showed that the sales of the products doubled
- CAUSALITY
o Causality: something causes something else. IF A HAPPENS, B MUST HAPPEN
TOO, if there is a causal relation between A & B.
- Experiment: "The manipulation of one or more independent variables (X's) to examine
whether there is a functional relationship in one or more dependent variables (Y's) as a
result of this manipulation, while controlling for extraneous variables."
 In experiments we manipulate the independent variables (causes) to see the effect on the
dependent variable.
 We keep all other variables constant, so we can focus on the effect of only the manipulated
variables!
3. KEY VARIABLES IN EXPERIMENTAL DESIGNS
- INDEPENDENT VARIABLES (X)
o Values of these variables are directly manipulated by the researcher. Also called
treatment variable or causal factor.
o Categorical variable
o Examples include: price (high vs. low), package design (old vs. new), advertising
(theme 1 vs theme 2 vs theme 3).
o It is possible to have more than one independent variable in an experiment
 In experiments, each independent variable is treated as a categorical variable with a small
number of levels (often just two, e.g. high and low).
- DEPENDENT VARIABLES (Y)
o Outcome variable which is measured for each respondent regardless the
treatment he was subjected to.
o Thus, even for the control group (i.e., no manipulation), the same outcome
measure applies for the respondent.
o Measured after the treatment OR before and after the treatment
o Behavioral (e.g., sales) and perceptual outcomes (e.g., awareness, buying
intentions, attitude, etc.)
- EXTRANEOUS VARIABLES
o Extraneous variables are variables that may also affect the dependent variable
(e.g., the mood of the respondent), but that we are not focusing on. This is why
they need to be either kept ‘constant’, or ‘controlled for’, so that they do not
interfere with the experiment.
- CONTROLLING FOR EXTRANEOUS VARIABLES

We have several ways to design experiments in such a way that the effect of extraneous variables is
neutralized, or controlled for

o Randomization
o Matching
o Statistical control
o Design control
4. EXPERIMENTAL DESIGNS

- In every experiment we need to measure the (effects on the) dependent variable.


- We also need a group of subjects, the experimental group, that will be exposed to the
treatment (the manipulation of the independent variable).
- We also need a group of subjects that is NOT exposed to the treatment (we need them
to compare our experimental group with) (control group)
- Furthermore, we need ‘randomization’ to address the problem of the extraneous
variables.
DIFFERENT EXPERIMENTAL DESIGN
 If we test a manipulation on one group, there is no real experiment. It is called a ‘pre-
experimental design’. In such a design we do not control for the effects of extraneous (or
confounding) variables.
4.1.Pre-experimental design
- ONE-SHOT CASE STUDY
o Symbolically
X -> O1
NB: X = the (manipulation of the) independent variable.
O1 = the Observation (of the effect on the Outcome (dependent variable)) in an
experimental group at time 1.
o No randomization – possible self-selection or arbitrary selection by researcher
o No comparison possible
o The level of O1 might be affected by many extraneous variables
- ONE-GROUP PRE-TEST POST-TEST DESIGN
o Symbolically: O1 X O2
 In this experiment we measure (make an observation before the
exposure to the treatment) at time 1, then we expose the group to the
treatment (X). Then we do another observation (of the output) at time
2.
 So O1 X O2 reads like: Observation before treatment -> Treatment -> Observation after
treatment
o No randomization – possible self-selection or arbitrary selection by researcher
o The treatment effect is O2 - O1
o Extraneous variables remain uncontrolled for (even under randomization)
o Only one group is observed.
- STATIC GROUP DESIGN:
o Symbolically
EG X O1
CG O2
 In this design we have two groups: the experimental group (EG) and the
control group (CG)
 We expose EG to Treatment X and then do an Observation after the treatment.
 We do not expose CG and then make an observation.
o No randomization – possible self-selection or arbitrary selection (convenience)
by researcher
o Mortality effects (especially with unpleasant treatment)
o The treatment effect is O1 - O2
o Some self-selection effects can be reduced by matching the EG and CG

Important warnings:

- Pre-experimental designs stink (unpleasant)


- No control over extraneous variables
- Low internal and external validity
- Appreciate true experimental designs
4.2.True experimental designs
- POST-TEST ONLY CONTROL GROUP DESIGN
o Symbolically
EG: R X O1
CG: R O2
 We expose a randomized EG (X)
 We do not expose a randomized CG
 We make observations of the EG after treatment and of the CG and
compare these observations (t-test, ANOVA).
o The treatment effect equals O1 - O2
o No pre-measurement:
 Not possible to assess possible self-selection
 Testing-effect is eliminated
o Individual change cannot be assessed
o Advantages: time, cost, sample size requirements
- PRE-TEST-POST-TEST CONTROL GROUP DESIGN
o Symbolically
EG: R O1 X O2
CG: R O3 O4
- This design controls for many extraneous variables
- Thus, (O2 – O1)-(O4 – O3) = effect + IT
 Even better: We have randomized EG and CG  then we make an observation ‘before
treatment’ of both groups  Then we expose the EG to the treatment  Then we make
observations of both groups again.
 The bias remaining is IT, which results either from imperfect randomization (sometimes
perfect randomization is simply not possible), or because we do not know to what extent the
first observation may have influenced the results of the experiment (e.g., by priming the
subjects, i.e., making them aware of what the experiment is focusing on!).
- FOUR-GROUP-SIX-STUDY DESIGN (SOLOMON)
o This design helps you to even rule out the IT effects (priming bias)
o In symbols you carry out the following design
EG1: R O1 X O2
CG1: R O3 O4
EG2: R X O5
CG2: R O6
- Drawback: time-consuming and expensive to implement


 The distinguishing features of true experimental designs are
o Randomization
o Control group
o However, not always possible or feasible in practice!
 Statistical designs offer an alternative way to control for the effect of extraneous variables
 The factorial design is a much-used statistical design
 Especially with multiple X’s, factorial designs are useful
4.3.Factorial designs
- Still applies with 3 types of variables:
o Independent variables (may be multiple)
o Dependent variables
o Extraneous variables
 Number of treatments = number of categories of independent variable 1 x … x number of
categories of independent variable k

 Here the 2*2 design is clearly visible. It leads to a matrix with four combinations of levels: LL,
LH, HL, and HH (H = high, L=low).
- By also measuring the extraneous (confounding) variables that could also have
influenced the dependent variable, we can ‘control’ for them
- FACTORIAL DESIGNS: MAIN EFFECTS
o Main effect (per factor!) is the effect of the factor on the outcome while
ignoring the effect of the other experimental factors
o For every experimental factor there is a main effect
- FACTORIAL DESIGNS: INTERACTION EFFECTS
o An interaction effect is present when the effect of a factor is not constant over
all levels of the other factor
o An interaction effect means that the main effect of a factor varies with the
different levels of the other factor (can be attenuated or reinforced).
o In contrast, if there is no interaction between the factors that means that the
difference in cell means caused by a factor is constant over all levels of the other
factor
 The presence of an interaction effects means that you have to be careful when interpreting the
main effects (“depends on”)
(Figures: example plots of cell means without interaction effects and with interaction effects.)
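 For illustration, a minimal 2x2 factorial (two-way) ANOVA sketch in Python (statsmodels), showing the two main effects and the interaction effect; factor names and data are made up:

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(4)
    df = pd.DataFrame({
        "price":   np.repeat(["low", "high"], 40),
        "package": np.tile(["old", "new"], 40),
    })
    effect = ((df["price"] == "low") * 0.8
              + (df["package"] == "new") * 0.5
              + ((df["price"] == "low") & (df["package"] == "new")) * 0.6)
    df["purchase_intention"] = 4 + effect + rng.normal(0, 1, len(df))

    # price * package expands to the two main effects plus their interaction
    model = smf.ols("purchase_intention ~ C(price) * C(package)", data=df).fit()
    print(sm.stats.anova_lm(model, typ=2))   # F-tests for main and interaction effects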
WRAP UP
- Causal research is about establishing cause-effect relationships
- Descriptive studies offer too little control, experiments are needed
- In order to have valid results, one needs to rule out the effects of so-called extraneous
variables
- In practice, factorial designs are most used
- In general, experiments are considered a double-edged sword
- Expensive
- But it's the only game in town
