Data Analysis - Print Formatted

The document discusses data analysis techniques for user experience research and design. It covers topics like descriptive statistics, inferential statistics, parametric vs nonparametric tests, and measures of central tendency. The document explains how data analysis provides an objective basis for decision making and helps understand user behavior, measure satisfaction, validate designs, and prioritize improvements.

Uploaded by

Armando Arratia

3/31/24

Data Analysis
Dr. Robert Atkinson


Objectives

By the end of this lecture series, you will be able to:
• Understand how data analysis serves as backbone for evidence-based decision-making, driving design improvements and enhancements in user experience.
• Explain importance of descriptive statistics in summarizing data sets, facilitating communication, and laying groundwork for further analysis.
• Describe concept of inferential statistics and its role in extrapolating sample data findings to broader populations, guiding strategic design decisions.
• Distinguish between parametric tests for comparing means and their implications for design impacts, and nonparametric tests for analyzing ordinal or non-normal data, understanding their contextual application in design research.
• Explain how correlation and regression analyses help examine relationships between continuous variables and design implications of these relationships.
• Describe use of Chi-square and binomial tests for investigating associations and proportions in categorical data.

Importance of Data Analysis

• Evidence-based Decision Making: Provides objective, quantitative evidence to support decision-making processes. Instead of relying on assumptions or subjective opinions, usability professionals use data to make informed decisions about product design and improvements.
• Understanding User Behavior: Researchers can uncover patterns and trends in how users interact with a system. This helps identify areas where users may be experiencing difficulties, allowing for targeted improvements to enhance overall UX.


Importance of Data Analysis

• Measuring User Satisfaction and Performance: Enables measurement of key metrics such as user satisfaction, task completion time, and error rates. Metrics are essential for evaluating effectiveness of a design and understanding user experience.
• Validating Design Choices: Can validate whether design changes lead to improvements in user experience. Crucial for justifying design decisions and demonstrating the value of usability work.
• Prioritizing Improvements: Helps in prioritizing design and usability improvements based on their impact on user experience. By identifying most significant usability issues, resources can be allocated more effectively.
• Benchmarking and Comparative Analysis: Allows for benchmarking against industry standards or competitors. Comparative analysis can reveal strengths and weaknesses in a product's usability, guiding strategic improvements.

Importance of Data Analysis

• Risk Mitigation: By identifying usability issues early through data analysis, potential risks and problems can be addressed before they escalate, reducing cost and effort of making changes later in development process.
• Enhancing User-Centric Design: Data-driven insights foster a user-centric approach to design, ensuring that user needs and preferences are central to development process. This leads to more user-friendly and successful products.

Descriptive Statistics

• Data Summarization: Help in summarizing large volumes of data into meaningful information, using measures of central tendency (e.g., mean, median) and variability (e.g., variance, standard deviation). Makes it easier to understand and communicate essential features of data.
• Descriptive Statistics: Summarize and describe characteristics of a dataset, providing clear overview of its properties without making conclusions beyond data. Simplifies large amounts of data for easy understanding, provides basis for further analysis, helps in identifying patterns, trends.

Inferential Statistics

• Inferential Analysis: Statistics enable researchers to infer or generalize about a population based on sample data. Through hypothesis testing, statistics help in determining likelihood of certain outcomes and the reliability of data.
• Inferential Statistics: Use a sample of data taken from population to make inferences or predictions about population. Used to make predictions and generalizations about population based on sample data.

Parametric Statistics

• Assumes that underlying distribution of population data is normal (Gaussian) distribution.
• Used to estimate key parameters of population, such as mean or variance, based on sample data.
• Assumption of Normality: Assumes data are drawn from normally distributed population.
• Interval or Ratio Data: Often require data on interval or ratio scale, where precise and equal distances between measurements are meaningful.
• Homogeneity of Variances: Assumes that variance is same across the groups being compared.


Nonparametric Statistics

• Do not require assumptions about population's distribution. Often used when assumptions cannot be met.
• Distribution-Free: Do not require data to follow any specific parametric distribution.
• Ordinal Data and Non-normal Data: Used with ordinal or skewed data that might violate assumptions of parametric tests.
• Robustness: More robust to violations of data assumptions.

Descriptive Statistics

Measures of Central Tendency

• Central tendency is a statistical measure to determine a single score that defines center of a distribution.
• Goal is to find single score that is most typical or most representative of entire group.
• But there is no single, standard procedure for determining central tendency.
• Instead, statisticians have developed three different methods for measuring central tendency: mean, median, and mode.
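The three measures above can be computed directly with Python's standard `statistics` module; the scores below are hypothetical, chosen only to illustrate the three values:

```python
import statistics

# Hypothetical task completion times (seconds) from a usability session
scores = [12, 15, 15, 18, 20, 22, 22, 22, 30]

print("mean:", statistics.mean(scores))      # arithmetic average of all scores
print("median:", statistics.median(scores))  # middle score when sorted
print("mode:", statistics.mode(scores))      # most frequently occurring score
```

Note that the three measures need not agree: here the mode (22) sits above the median (20), which sits above the mean, because the distribution is not symmetric.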

Measures of Variability

• Variability is a measure of differences between scores in a distribution and describes the degree to which scores are spread out or clustered together.
• Variability measures how well an individual score (or group of scores) represents the entire distribution.
• Deviation is distance from the mean: deviation score = X - µ
• Variance is mean of the squared deviations.
• Standard deviation is square root of variance and provides a measure of the standard, or average, distance from mean.

Measures of Distribution Shape

• These measures give an idea of the symmetry and tail of distribution.
• Skewness: A measure of the asymmetry of probability distribution. Positive skewness indicates a distribution with a longer tail to the right, and negative skewness indicates a longer tail to left.
• Kurtosis: A measure of the "tailedness" of distribution. It indicates how much of data are in the tails and peak, compared to a normal distribution.
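The definitions above (deviation, variance, standard deviation, skewness, kurtosis) can be verified on a small hypothetical dataset; the 9 at the end is included to produce a visible right tail:

```python
import statistics
from scipy import stats

# Hypothetical satisfaction scores; the 9 gives the distribution a longer right tail
scores = [2, 3, 3, 4, 4, 4, 5, 5, 6, 9]

mean = statistics.fmean(scores)
deviations = [x - mean for x in scores]  # deviation score = X - mean; deviations sum to zero
variance = statistics.pvariance(scores)  # mean of the squared deviations (population variance)
sd = statistics.pstdev(scores)           # standard deviation = square root of variance

print(f"mean = {mean}, variance = {variance:.2f}, SD = {sd:.2f}")
print(f"skewness = {stats.skew(scores):.2f}, kurtosis = {stats.kurtosis(scores):.2f}")
```

Positive skewness here confirms the longer right tail described above.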


Measures of Frequency Distribution

• Frequency Counts: Simplest form of frequency distribution, which tallies the number of times each distinct value or category appears in dataset. This is often presented in a frequency table, listing each unique value alongside its count.
• Relative Frequency: Shows proportion of total number of observations represented by each category, calculated by dividing frequency of each category by total number of observations.

Measuring Variables

• To establish relationships between variables, researchers must observe the variables and record their observations.
• Requires that variables be measured.
• Process of measuring a variable requires a set of categories called a scale of measurement.
• Process classifies each individual into one category.
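Frequency counts and relative frequencies, as described above, can be tallied with the standard library's `collections.Counter`; the survey responses below are hypothetical:

```python
from collections import Counter

# Hypothetical categorical responses to a task-difficulty survey question
responses = ["easy", "easy", "neutral", "hard", "easy", "neutral", "easy"]

counts = Counter(responses)                       # frequency counts per category
n = len(responses)
relative = {k: v / n for k, v in counts.items()}  # frequency divided by total observations

for category, count in counts.most_common():
    print(f"{category}: {count} ({relative[category]:.0%})")
```

The relative frequencies always sum to 1, which is a quick sanity check on any frequency table.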

Measurement Scales

• A nominal scale consists of a set of categories that have different names.
• Label and categorize observations, but do not make any quantitative distinctions between observations.
• An ordinal scale consists of a set of categories that are organized in an ordered sequence.
• Ranks observations in terms of size or magnitude.
• An interval scale consists of ordered categories that are all intervals of exactly the same size.
• Equal differences between numbers on scale reflect equal differences in magnitude.
• However, the zero point on an interval scale is arbitrary and does not indicate a zero amount of the variable being measured.
• A ratio scale is an interval scale with the additional feature of an absolute zero point.
• Ratios of numbers reflect ratios of magnitude.

Inferential Statistics


Inferential Statistics

• These methods enable us to infer trends, make predictions, and draw conclusions about population parameters based on sample statistics.
• Core idea is to determine probability that observed difference or relationship in the sample data occurs by chance or represents a true effect in population.
• Include tests for: (a) comparing means, (b) comparing medians, and (c) relationships, categorical differences, and proportions.

Tests for Comparing Means (Parametric)

• Independent-Sample t-test: Compares means of two independent groups to determine if there is a statistically significant difference between them.
• Dependent-Sample t-test (Paired or Related-Sample): Compares means of two related groups, such as same group measured at two different times or under two different conditions.
• Analysis of Variance (ANOVA): Used when comparing means of three or more groups or conditions. ANOVA tests whether there is a significant difference among group means, indicating that at least one group differs from the others.

Tests for Comparing Medians (Nonparametric)

• Mann-Whitney U: Compares medians of two independent groups. It is used when data are not normally distributed.
• Wilcoxon Signed-Rank: Compares medians of two related groups, suitable for paired data where the normality assumption is not met.
• Kruskal-Wallis: Extension of Mann-Whitney U test for comparing medians of three or more independent groups.

Tests for Relationships, Categorical Differences, and Proportions

• Relationships: Tests designed to explore and quantify the relationships between continuous variables, whether predicting outcomes or understanding the degree of association.
• Categorical Differences: Focus on understanding how different categories relate to each other or differ significantly. Implies analyzing differences not based on numerical averages or medians but on distribution and association within categorical data.
• Proportions: Analysis of binary or categorical outcomes to see whether observed proportion of category differs significantly from what was expected or predicted.

Tests for Relationships, Categorical Differences, and Proportions

• Correlation and Regression: Explores relationship between continuous variables. Correlation assesses strength and direction of association, while regression models relationship, allowing for prediction of one variable based on another.
• Chi-square: Non-parametric test that assesses association between two categorical variables. Determines if observed frequencies in categories differ significantly from expected frequencies.
• Binomial: Evaluates whether observed proportion of successes in binary data significantly deviates from hypothesized proportion with dichotomous variables.


Independent-Sample t-Test

Definition and Purpose

• Definition: Independent-sample t-test, also known as two-sample t-test, is used to determine if there is a significant difference in mean values of two independent groups. This test is appropriate when comparing means of two distinct groups to see if their population means differ.
• Purpose: Utilized to assess equality of means between two groups, helping researchers infer whether observed differences in sample means reflect a true difference in population, or if they are likely due to random chance.

Use Case Example 1

• Study Title: Comparing Usability Between Two Interface Designs
• Objective: To evaluate which of two interface designs provides a better user experience in terms of usability.
• Application: An independent sample t-test could be applied to compare usability scores obtained from two groups of users, each interacting with a different interface design. This test would help determine if there is a statistically significant difference in usability between the two designs.

Use Case Example 2

• Study Title: Testing Efficiency of Different Search Algorithms
• Objective: To investigate whether a new search algorithm enhances the efficiency of search results compared to traditional algorithm.
• Application: An independent sample t-test can be applied to measure and compare the time users spend finding correct information using each algorithm, helping to conclude if new algorithm significantly improves search efficiency.

Example Scenario

• Imagine a UX researcher is investigating whether a new mobile app interface (Interface B) improves user satisfaction compared to the current interface (Interface A).
• Users are randomly assigned to use either Interface A or Interface B, and their satisfaction levels are measured on a Likert scale.
• After collecting the data, researcher performs an independent-sample t-test to compare mean satisfaction scores between two groups.

Example Results

• Suppose test results show that Interface B users have statistically significantly higher satisfaction scores, with following statistics:
• Interface A Mean: 3.8 (SD = 0.6)
• Interface B Mean: 4.5 (SD = 0.5)
• t(98) = -6.25, p < 0.001
• Results would be reported in a research paper or presentation as follows:

6
3/31/24

Example Reporting

• In examining impact of new app interface on user satisfaction, an independent-sample t-test was conducted.
• Analysis revealed a significant difference in user satisfaction scores between old interface (Interface A, M = 3.8, SD = 0.6) and new interface (Interface B, M = 4.5, SD = 0.5); t(98) = -6.25, p < 0.001.
• Findings suggest that Interface B significantly enhances user satisfaction compared to Interface A.
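The scenario above can be run as a sketch with SciPy's `ttest_ind`; the scores below are simulated to roughly match the reported means and SDs, not data from the actual study:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Simulated satisfaction scores (hypothetical, approximating the slide's summary stats)
interface_a = rng.normal(3.8, 0.6, 50)  # 50 users on Interface A
interface_b = rng.normal(4.5, 0.5, 50)  # 50 users on Interface B

# Independent-sample t-test: df = n1 + n2 - 2 = 98
t_stat, p_value = stats.ttest_ind(interface_a, interface_b)
print(f"t(98) = {t_stat:.2f}, p = {p_value:.4f}")
```

The negative t-statistic simply reflects that the first group's mean is below the second's; the sign depends on argument order.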

Dependent-Sample t-Test (Paired or Related-Samples)

Definition and Purpose

• Definition: Paired-sample t-test, also known as dependent-sample t-test or related-sample t-test, is used to compare means of two related groups. Groups are typically 'paired' because they are related in some way, such as same subjects measured at two different times or under two different conditions.
• Purpose: Determines whether mean difference between paired observations is statistically significant. Useful for before-and-after studies, repeated measures on same subjects, or matched-pairs experiments.

Use Case Example 1

• Title: Before-and-After Design Evaluation: Usability Impact of Color Scheme Changes
• Objective: To understand effect of changing color scheme on usability of an application.
• Application: Usability metrics (like error rates or satisfaction levels) are collected from users before and after implementing a new color scheme. Dependent sample t-test assesses significance of changes in usability, quantifying impact of color scheme revision.

Use Case Example 2

• Title: A/B Testing with Same Participants: Navigation Schema
• Objective: To evaluate two different navigation schemas on a website to determine which is more intuitive.
• Application: Users navigate website using Schema A and later with Schema B. Dependent sample t-test compares number of navigation errors or time taken to find information across schemas, identifying more user-friendly option.


Example Scenario

• Suppose a UX researcher is conducting a study to evaluate impact of a new user interface design on task completion time in a software application.
• Users are timed while completing a set of tasks using old interface, and then timed again completing same tasks after new interface is implemented.
• After conducting dependent samples t-test, researcher finds a significant reduction in task completion time, with a t-value of -4.50 and a p-value of 0.001.

Example Results

• Assume average task completion times are as follows:
• Before new interface implementation: 10 minutes (SD = 2 minutes)
• After new interface implementation: 8 minutes (SD = 1.5 minutes)

Example Reporting

• In our study assessing impact of the new interface design on task completion efficiency, we recorded task completion times before and after implementing new design.
• Average completion time decreased from 10 minutes (SD = 2) before redesign to 8 minutes (SD = 1.5) afterward.
• A dependent samples t-test revealed a significant reduction in task completion time (t = -4.50, p = 0.001), indicating that new interface design significantly improved user efficiency in completing the tasks.
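The paired design can be sketched with SciPy's `ttest_rel`, which keys on the per-user differences rather than group means; the timings below are simulated, not study data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
# Hypothetical per-user completion times (minutes): same 30 users, before and after redesign
before = rng.normal(10, 2, 30)
after = before - rng.normal(2, 1, 30)  # each user improves by roughly 2 minutes

# Paired t-test on the same users measured twice
t_stat, p_value = stats.ttest_rel(before, after)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```

Because the same users appear in both conditions, between-user variation cancels out of the differences, which is why the paired test is more sensitive than an independent-sample test on the same numbers.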

Analysis of Variance (ANOVA)

Definition and Purpose

• Definition: A statistical procedure that typically compares means of three or more independent groups to determine if there are significant differences among them. Assesses effect of single independent variable (categorical) on continuous dependent variable across different groups.
• Purpose: Aims to identify whether variations in group means are larger than what would be expected by chance, suggesting that independent variable has a significant effect on dependent variable.


Use Case Example 1

• Title: Comparing User Satisfaction Across Multiple Website Designs
• Objective: To determine which of several website designs yields highest user satisfaction.
• Application: Conduct a one-way ANOVA to compare user satisfaction scores obtained from surveys after participants interact with each of different website designs. This can help identify design that best meets users' preferences and needs.

Use Case Example 2

• Title: Evaluating Task Completion Times for Different Software Tools
• Objective: To ascertain which software tool allows users to complete tasks most efficiently.
• Application: Use one-way ANOVA to analyze time taken by users to complete specific tasks using different software tools. This approach can highlight tool that facilitates fastest task completion, indicating its efficiency and potential for productivity improvement.

Example Scenario

• Imagine a UX researcher wants to compare efficiency of three different app interface designs, A, B, and C, in terms of task completion time.
• Users are randomly assigned to use one of three designs to complete specific tasks, and time taken to complete these tasks is recorded.
• After performing one-way ANOVA, researcher finds an F-statistic of 5.67 with a p-value of 0.004.

Example Results

• Assume average task completion times (in seconds) for three designs are as follows:
• Design A: Mean = 120, SD = 15
• Design B: Mean = 105, SD = 10
• Design C: Mean = 130, SD = 20

Example Reporting

• In our study assessing efficiency of three app interface designs, we measured time users took to complete certain tasks.
• A one-way ANOVA was conducted to compare these times across the designs, resulting in an F-statistic of 5.67 and a p-value of 0.004.
• This indicates a statistically significant difference in task completion time among three designs.
• Post-hoc analyses revealed that Design B led to significantly faster task completion compared to Designs A and C, suggesting that B is most efficient design among three evaluated.
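A one-way ANOVA like the one above can be sketched with SciPy's `f_oneway`; the timings below are simulated around the slide's group means, not actual study data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
# Hypothetical task completion times (seconds) for three interface designs, 30 users each
design_a = rng.normal(120, 15, 30)
design_b = rng.normal(105, 10, 30)
design_c = rng.normal(130, 20, 30)

# One-way ANOVA across the three independent groups
f_stat, p_value = stats.f_oneway(design_a, design_b, design_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```

A significant F only says that at least one group differs; recent SciPy versions also provide `scipy.stats.tukey_hsd` for the pairwise post-hoc comparisons mentioned above.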


Mann-Whitney U

Definition and Purpose

• Definition: Mann-Whitney U test, also known as Wilcoxon rank-sum test, is a non-parametric statistical test used to compare ranks of two independent groups. It assesses whether one group tends to have higher or lower values than other on a particular measure, without assuming a normal distribution of data.
• Purpose: This test is particularly useful for comparing ordinal data or continuous data that do not meet assumptions required for parametric tests. It determines if there is a statistically significant difference in central tendency between two groups.

Use Case Example 1

• Title: Comparing User Satisfaction Ratings Between Two App Versions
• Objective: To determine if there is a significant difference in user satisfaction ratings between two versions of a mobile application.
• Application: Employ Mann-Whitney U test to compare ordinal satisfaction ratings (e.g., on a scale from 'very dissatisfied' to 'very satisfied') collected from users of two different app versions.

Use Case Example 2

• Title: Comparing Task Difficulty Perception Between Two Software Tools
• Objective: To investigate whether users perceive one software tool as more difficult to use than another.
• Application: Analyze ordinal data on task difficulty (rated from 'very easy' to 'very difficult') for users performing same task using two different software tools.

Example Scenario

• Imagine a scenario where a UX researcher wants to determine if a new interface design (Design B) improves user satisfaction compared to existing design (Design A).
• Users are randomly assigned to interact with either Design A or Design B, and their satisfaction levels are measured on a Likert scale.

Example Results

• Let's assume that satisfaction scores are as follows:
• Design A: Median = 3, with range from 1 to 5.
• Design B: Median = 4, with range from 2 to 5.
• After applying the Mann-Whitney U test, the researcher finds a U value of 45 with a p-value of 0.04.


Example Reporting

• In our comparative study of user satisfaction between two interface designs, we conducted Mann-Whitney U test to analyze satisfaction scores obtained from users.
• Median satisfaction score for Design A was 3 (range 1–5), while for Design B, it was 4 (range 2–5).
• Mann-Whitney U test yielded a U value of 45 and a p-value of 0.04, indicating a statistically significant difference in user satisfaction between two designs.
• This suggests that users generally found Design B to be more satisfying than Design A.
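A sketch of this analysis with SciPy's `mannwhitneyu` on hypothetical Likert ratings (not the study's data) looks like this:

```python
from scipy import stats

# Hypothetical ordinal satisfaction ratings (1-5 Likert) from two independent groups
design_a = [3, 2, 3, 1, 4, 3, 2, 5, 3, 3, 4, 2]
design_b = [4, 5, 4, 3, 5, 4, 4, 2, 5, 4, 3, 4]

# Rank-based comparison; no normality assumption is made about the ratings
u_stat, p_value = stats.mannwhitneyu(design_a, design_b, alternative="two-sided")
print(f"U = {u_stat}, p = {p_value:.3f}")
```

Because the test works on ranks, tied Likert values are handled automatically; U ranges from 0 to n1 × n2, with extreme values indicating one group consistently outranks the other.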

Wilcoxon Signed-Rank Test

Definition and Purpose

• Definition: Wilcoxon Signed-Rank test is a non-parametric statistical test used to compare two related or paired samples to determine whether their population mean ranks differ. It's non-parametric alternative to paired-sample t-test and is used when data do not meet normal distribution requirement.
• Purpose: Assesses whether there is a significant difference between the median values of two paired groups. It's particularly useful for ordinal data or when assumptions of parametric tests (like normality) are not met.

Use Case Example 1

• Title: Evaluating Changes in User Perceived Ease of Use
• Objective: To determine if a UI redesign has improved perceived ease of use.
• Application: Use Wilcoxon Signed-Rank test to compare ordinal ease-of-use ratings from users before and after the redesign. Users rate their experience on a Likert scale (e.g., 1 to 5), and the test assesses if redesign led to a significant improvement.

Use Case Example 2

• Title: Analyzing Impact of Navigation Schemes on Perceived Efficiency
• Objective: To evaluate if changes in website navigation schemes result in differences in users' perceived efficiency.
• Application: Employ Wilcoxon Signed-Rank test to compare users' ordinal ratings of perceived efficiency before and after implementing a new navigation scheme, identifying whether the change is perceived as beneficial.


Example Scenario

• Imagine a scenario where a UX researcher aims to evaluate impact of a new feature in a mobile app on user satisfaction.
• Users rate their satisfaction on an ordinal scale (1 to 5) before and after feature is implemented.

Example Results

• Assume the following hypothetical satisfaction scores:
• Pre-update: Median rating = 3, range 2 to 4.
• Post-update: Median rating = 4, range 3 to 5.
• After conducting Wilcoxon Signed-Rank test, researcher obtains a test statistic value (e.g., W) of 23 with a p-value of 0.05.

Example Reporting

• In our study assessing impact of new app feature on user satisfaction, we utilized Wilcoxon Signed-Rank test to compare user satisfaction ratings before and after feature's implementation.
• Pre-update median satisfaction was 3, with ratings spanning from 2 to 4.
• Post-update, median satisfaction increased to 4, with a range from 3 to 5.
• Wilcoxon Signed-Rank test yielded a statistic of W = 23 and a p-value of 0.05, indicating a statistically significant improvement in user satisfaction following the feature update.
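The paired, rank-based comparison above can be sketched with SciPy's `wilcoxon`; the ratings below are hypothetical, with every user improving after the update:

```python
from scipy import stats

# Hypothetical paired ratings (1-5) from the same 10 users, pre- and post-update
pre  = [3, 2, 3, 4, 3, 2, 3, 4, 3, 2]
post = [4, 3, 4, 5, 4, 3, 4, 5, 4, 4]

# Signed-rank test on the paired differences (post - pre)
w_stat, p_value = stats.wilcoxon(pre, post)
print(f"W = {w_stat}, p = {p_value:.3f}")
```

The reported statistic is the smaller of the two signed-rank sums; when every difference points the same way, as in this toy data, it is 0.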

Kruskal-Wallis H

Definition and Purpose

• Definition: Kruskal-Wallis test is a non-parametric statistical test used to compare medians across three or more independent groups. It is non-parametric equivalent of one-way ANOVA and is used when the data do not meet the normality assumption required for parametric tests.
• Purpose: Assesses whether there is a statistically significant difference in central tendency (typically the median) among three or more groups, without assuming a normal distribution of data.


Use Case Example 1

• Title: Studying the Impact of Different Content Types on User Engagement
• Objective: To investigate how different types of content (text, video, interactive) affect user engagement on a learning platform.
• Application: Apply Kruskal-Wallis test to compare engagement levels (measured on an ordinal scale like low, medium, high) across small groups exposed to different content types, effectively handling non-normal data distributions.

Use Case Example 2

• Title: Analyzing User Satisfaction with Different Interaction Modalities
• Objective: To evaluate user satisfaction across different interaction modalities (e.g., touch, voice, gesture).
• Application: Utilize Kruskal-Wallis test to compare ordinal satisfaction data from users interacting with each modality, suitable for small sample sizes and ordinal data, identifying which modality yields the highest satisfaction.

Example Scenario and Results

• Suppose a UX researcher is evaluating user satisfaction of three different website layouts (A, B, and C).
• Users interact with each layout and then rate their satisfaction on an ordinal scale (1 to 5).
• Assume median satisfaction scores are as follows:
• Layout A: Median = 3, IQR = 1-4
• Layout B: Median = 4, IQR = 2-5
• Layout C: Median = 2, IQR = 1-3
• After performing Kruskal-Wallis test, researcher finds H-statistic value of 6.83 with p-value of 0.033.

Example Reporting

• In our comparative study assessing user satisfaction across three website layouts, we conducted Kruskal-Wallis test to analyze ordinal satisfaction ratings.
• Median satisfaction scores were 3 (IQR = 1-4) for Layout A, 4 (IQR = 2-5) for Layout B, and 2 (IQR = 1-3) for Layout C.
• Kruskal-Wallis test resulted in an H-statistic of 6.83 with p-value of 0.033, indicating a statistically significant difference in user satisfaction among the layouts.
• Post-hoc analyses were conducted to identify which specific layouts differ significantly from each other.
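SciPy's `kruskal` covers the three-group rank comparison above; the ratings below are hypothetical illustration data, not the study's:

```python
from scipy import stats

# Hypothetical ordinal satisfaction ratings (1-5) for three independent layout groups
layout_a = [3, 2, 4, 3, 1, 3, 4, 2]
layout_b = [4, 5, 3, 4, 2, 5, 4, 4]
layout_c = [2, 1, 3, 2, 2, 1, 3, 2]

# Kruskal-Wallis H test: rank-based analogue of one-way ANOVA
h_stat, p_value = stats.kruskal(layout_a, layout_b, layout_c)
print(f"H = {h_stat:.2f}, p = {p_value:.3f}")
```

As with ANOVA, a significant H only flags that some group differs; pairwise follow-ups (e.g., Mann-Whitney U tests with a multiple-comparison correction) are needed to say which.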

Correlation


Definition and Purpose

• Definition: Used to measure and describe strength and direction of relationship between two continuous variables. It quantifies degree to which variables change together.
• Purpose: To determine whether a relationship exists between two variables and how strong that relationship is. This can help in understanding how changes in one variable are associated with changes in another.

Use Case Example 1

• Title: Correlating Time on Site with User Satisfaction
• Objective: To determine if there is a relationship between amount of time users spend on a website and their overall satisfaction.
• Application: Perform a correlation analysis between the time users spend on various pages of website and their satisfaction ratings, helping to understand if longer engagement correlates with higher satisfaction.

Use Case Example 2

• Title: Studying Correlation between Navigation Complexity and Task Completion Rates
• Objective: To explore how complexity of website navigation impacts ability of users to complete tasks successfully.
• Application: Correlate metrics of navigation complexity (such as the number of clicks required to find information) with task completion rates to see if more complex navigation systems negatively affect user performance.

Example Scenario

• Imagine a scenario where a UX researcher wants to investigate the relationship between page load times and user satisfaction ratings on a website.
• Users are asked to rate their satisfaction after using the website, and their ratings are then correlated with page load times they experienced.

Example Results

• Suppose analysis yields following data:
• Average page load time: 3.2 seconds (SD = 0.8)
• Average user satisfaction rating: 4.5 out of 5 (SD = 0.6)
• Correlation coefficient (Pearson's r) calculated between page load times and user satisfaction ratings is -0.75, indicating a strong negative relationship.

Example Reporting

• In our analysis of relationship between website page load times and user satisfaction, we observed that average page load time was 3.2 seconds (SD = 0.8), with an average satisfaction rating of 4.5 (SD = 0.6).
• A correlation analysis revealed a Pearson's r of -0.75, suggesting a strong negative relationship between page load times and user satisfaction.
• This indicates that longer page load times are associated with lower user satisfaction levels.
• These results highlight the critical impact of website performance on user experience, underscoring need for optimization efforts to enhance user satisfaction.
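A Pearson correlation like the one reported can be sketched with SciPy's `pearsonr`; the data below are simulated with a negative relationship built in, purely for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical data with a built-in negative relationship: slower pages, lower satisfaction
load_times = rng.uniform(1.0, 5.0, 40)                          # seconds
satisfaction = 6.0 - 0.8 * load_times + rng.normal(0, 0.4, 40)  # rating with noise

r, p_value = stats.pearsonr(load_times, satisfaction)
print(f"Pearson's r = {r:.2f}, p = {p_value:.4f}")
```

r runs from -1 to +1; values near -1, as here, mean the points fall close to a downward-sloping line. Correlation alone does not establish that slow pages cause dissatisfaction.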


Simple Linear Regression


Definition and Purpose

• Definition: Used to model the relationship between a single independent variable and a dependent variable. It estimates how changes in independent variable are associated with changes in dependent variable, assuming linear relationship between the two.
• Purpose: To predict value of dependent variable based on value of independent variable and to quantify strength and nature of their relationship. It helps in understanding how independent variable affects dependent variable and in making informed decisions based on this relationship.

Use Case Example 1

• Title: Predicting User Engagement from Page Layout Features
• Objective: To determine how specific features of a page layout, such as amount of text or number of images, predict user engagement levels.
• Application: Use simple regression to analyze how variations in layout features (independent variable) impact metrics of user engagement, like time spent on the page or interaction rates (dependent variable).

Use Case Example 2

• Title: Evaluating the Effect of Notification Frequency on App Usage
• Objective: To explore how the frequency of notifications influences app usage behavior.
• Application: Conduct a simple regression analysis to predict changes in app usage time or session frequency (dependent variable) based on the number of notifications sent to the user (independent variable).

Example Scenario

• Imagine a scenario where a UX researcher wants to investigate how the number of clicks required to complete a task affects user satisfaction on a website.
• In this case, the number of clicks is the independent variable, and user satisfaction (measured on a scale, such as 1 to 10) is the dependent variable.


Example Results

• Assume the following hypothetical data:
  • Average number of clicks: 5 (SD = 2)
  • Average user satisfaction rating: 7.5 (SD = 1.5)
• Simple regression analysis might yield a regression equation with a slope (beta coefficient) of -0.3 and a significant p-value (e.g., 0.01), indicating that an increase in clicks leads to a decrease in satisfaction.

Example Reporting

• In our study examining the impact of task complexity, measured by number of clicks, on user satisfaction, we found the average number of clicks was 5 (SD = 2), with an average satisfaction rating of 7.5 (SD = 1.5).
• Simple regression analysis revealed a significant negative relationship between number of clicks and user satisfaction, with a beta coefficient of -0.3 (p = 0.01).
• This suggests that for each additional click required to complete a task, user satisfaction decreases by 0.3 points.
• This highlights the importance of streamlined task design in enhancing user satisfaction on the website.
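The slope and intercept of a simple linear regression have a closed-form least-squares solution. The Python sketch below uses only the standard library and hypothetical, perfectly linear clicks/satisfaction data chosen so the fitted slope matches the illustrative -0.3; a real analysis would also test the slope's significance, which is omitted here:

```python
def fit_simple_regression(x, y):
    """Least-squares intercept a and slope b for y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    # Slope: covariance of x and y divided by variance of x
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    # Intercept: the line passes through the point of means
    a = my - b * mx
    return a, b

# Hypothetical data: clicks to complete a task vs. satisfaction (1-10 scale)
clicks = [2, 4, 6, 8]
satisfaction = [9.0, 8.4, 7.8, 7.2]

intercept, slope = fit_simple_regression(clicks, satisfaction)
print(f"satisfaction = {intercept:.1f} + ({slope:.1f} * clicks)")
```

For these values the fit is satisfaction = 9.6 - 0.3 × clicks: each additional click is associated with a 0.3-point drop in satisfaction, mirroring the interpretation given above.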


Chi-Square Test


Definition and Purpose

• Definition: The Chi-square (χ²) test is a non-parametric statistical test used to assess whether observed frequencies for categorical variables differ from expected frequencies. It's often used to test the independence of two variables or the goodness of fit between observed data and a theoretical distribution.
• Purpose: To determine if there is a significant association between two categorical variables or if a single categorical variable differs significantly from expected patterns.

Use Case Example 1

• Title: Examining User Preferences Across Different Design Elements
• Objective: To determine if there are significant differences in user preferences for various design elements (e.g., color schemes, typography, layout styles).
• Application: Use the Chi-square test to analyze survey data where users select their preferred design elements from multiple categories, identifying which elements are favored across different user demographics.
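The χ² statistic compares each observed cell count with the count expected under independence (row total × column total ÷ grand total). A standard-library Python sketch on a small hypothetical 2×3 preference table, invented for illustration (for df = 2, the χ² survival function happens to reduce to exp(-x/2), which keeps the p-value computation simple):

```python
import math

def chi_square_independence(table):
    """Chi-square statistic of independence for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under the independence hypothesis
            expected = row_totals[i] * col_totals[j] / grand
            chi2 += (observed - expected) ** 2 / expected
    return chi2

# Hypothetical counts: two user groups (rows) x three design styles (columns)
observed = [[40, 20, 10],
            [20, 30, 30]]

chi2 = chi_square_independence(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)  # (2-1)*(3-1) = 2
p = math.exp(-chi2 / 2)  # exact chi-square survival function only when df == 2
print(f"chi2 = {chi2:.2f}, df = {df}, p = {p:.5f}")
```

Here the statistic is large (≈18.1) and p is well below 0.05, so the independence hypothesis would be rejected for this hypothetical table.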


Use Case Example 2

• Title: Assessing User Interaction with Notification Modals
• Objective: To analyze whether user interactions with notification modals (e.g., closing, ignoring, engaging) vary based on the modal's content type (informative, warning, promotional).
• Application: The Chi-square test is used to evaluate interaction data with different modal types to identify if user responses are dependent on the content type, aiding in the optimization of notification strategies.

Example Scenario

• Imagine a UX researcher conducting a study to determine if there is a preference pattern for three different design styles (Minimalist, Modern, Traditional) across two user groups (novices and experienced users).
• Users are asked to choose their preferred design style in a survey.


Example Results

• Assume the survey results are as follows:
  • Novice users: 30 prefer Minimalist, 25 prefer Modern, and 15 prefer Traditional.
  • Experienced users: 20 prefer Minimalist, 35 prefer Modern, and 25 prefer Traditional.
• After conducting the Chi-square test, the researcher calculates a χ² value of 6.72 with a p-value of 0.035.

Example Reporting

• In a study assessing design style preferences among novice and experienced users, we observed: 30 novice users preferred Minimalist, 25 Modern, and 15 Traditional; among experienced users, 20 preferred Minimalist, 35 Modern, and 25 Traditional.
• A Chi-square test examined the association between user experience level and design preference, yielding a χ² value of 6.72 with a p-value of 0.035.
• This indicates a significant association between user experience level and design style preference, suggesting that experienced users have different design preferences compared to novice users.


Binomial Test


Definition and Purpose

• Definition: The binomial test is a non-parametric statistical test used to determine whether the proportion of successes in a dataset significantly differs from a hypothesized proportion. It's applied when data consist of binary outcomes (success/failure, yes/no, etc.) and is used to analyze situations where there are two possible outcomes for each observation.
• Purpose: Assesses whether the observed frequency of success in a sample is consistent with a specified probability of success, providing a way to evaluate outcomes against a specific benchmark or expected proportion.

Use Case Example 1

• Title: Evaluating the Success Rate of a User Authentication Process
• Objective: To assess whether the success rate of a new user authentication process (e.g., biometric vs. password) meets or exceeds an acceptable threshold.
• Application: Apply the binomial test to compare the actual success rate of users completing the authentication process against the hypothesized success rate, determining the process's effectiveness.


Use Case Example 2

• Title: Analyzing User Preference in A/B Testing
• Objective: To ascertain if there is a significant user preference for one version of a product feature over another in A/B testing scenarios.
• Application: Conduct a binomial test to see if the preference for one version significantly deviates from a 50/50 split, indicating a clear user preference for one of the tested versions.

Example Scenario and Results

• Consider a scenario where a UX researcher wants to determine if the success rate of completing a certain task using a new interface design is above a hypothesized success rate of 70%.
• Users attempt the task, and their success or failure is recorded.
• Let's say that out of 100 users, 80 successfully complete the task.
• The researcher uses a binomial test to assess whether this success rate of 80% significantly exceeds the expected rate of 70%.


Example Reporting

• In our evaluation of the new interface design, 80 out of 100 users (an 80% success rate) successfully completed the designated task.
• To determine if this observed success rate significantly surpasses our hypothesized benchmark of 70%, we conducted a binomial test.
• Test results indicated a p-value of 0.025, suggesting that the success rate with the new interface is significantly higher than the expected 70%.
• This finding underscores the effectiveness of the new design in facilitating task completion among users.
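This binomial test can be computed exactly with the standard library: the one-sided p-value is the probability, under the 70% benchmark rate, of observing at least as many successes as were seen. A minimal Python sketch using the counts from the example (80 successes out of 100); the exact tail probability may differ a little from the rounded illustrative p-value quoted above, but it supports the same conclusion (p < 0.05):

```python
from math import comb

def binomial_test_one_sided(successes, n, p0):
    """Exact one-sided p-value: P(X >= successes) for X ~ Binomial(n, p0)."""
    return sum(comb(n, k) * p0 ** k * (1 - p0) ** (n - k)
               for k in range(successes, n + 1))

# 80 of 100 users succeeded; benchmark success rate is 70%
p_value = binomial_test_one_sided(80, 100, 0.70)
print(f"one-sided exact p = {p_value:.4f}")
if p_value < 0.05:
    print("Success rate significantly exceeds the 70% benchmark.")
```

Summing the exact binomial tail avoids the normal approximation, which matters for small samples or proportions near 0 or 1.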
