Facebook Status Colour Change A/B Testing Case Study

Facebook status background colour change case study.


Product Sense -

Product/Feature Design & Launch Recommendations

Lecture Objective:
● Building business sense for A/B testing.
● How to launch A/B testing to get “valid” and “reliable” results.
○ Developing hypothesis
○ Designing A/B tests
○ Evaluating test results
○ Making decisions
● Launch recommendations and pitfalls with A/B testing setups.

Example Cases: A/B Testing Use Cases

● Multivariate Testing
● Split Testing
● Conversion Rate Optimization
● Landing Page Optimization
● Online Experimentation

Judgment Criteria & General Framework -

Keep this in mind when addressing business acumen questions.

● Design an experiment:
○ Structure
○ Right Flow of Steps
○ Develop the right Hypothesis
○ Sample Size Calculation
○ Duration
○ Pitfalls
● Identifying the right metric for testing significance - North Star metric
● Sample Segmentation for valid results - How would you take care of nuances
and sample bias?
● How conclusively will the candidate be able to make a decision based on A/B
test results?

Case Study:

How would you test whether Facebook incorporating colored backgrounds into
statuses improves user engagement?
1. Clarify the goal of the feature / idea conception:

a. What do you hope to achieve with this feature incorporation/update?
b. Why this specific feature update for that goal and not any other feature?
c. Has this been experimented with before? Are other product lines also
following suit?
d. Is the hypothesis supported by previous experiment data, industry
insights, reports, or other evidence?
e. Is this feature for a specific user group or for all user groups?

2. Discuss the metrics the feature is expected to impact and the data
available.

a. List candidate metrics and finalize the metric we want to test significance
on.
i. Metrics - In this case study we can focus on the primary impact of
the feature: user engagement, daily active users, etc.
ii. User Engagement can be defined as the % of active users who have
engaged with Facebook in some way (likes, comments, saves,
reactions).
iii. Daily Active Users - # of unique users who have logged on to
Facebook each day. We expect this metric to increase with the
new feature.
b. Come up with a north star metric, supporting metrics (if applicable) and
guardrail metrics. (A computation sketch follows this section.)
i. North Star Metric - % of active users with engagement
ii. Supporting Metric - Daily Active Users
iii. Guardrail Metric (a metric that should not degrade in the
pursuit of a new feature) - % of media content: assuming media
content provides more value, we don't want this share to decrease
because of the feature. Alternatively, because this feature takes up
more space, are users on average interacting with fewer posts?
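
A minimal sketch of how these metrics could be computed, assuming a
hypothetical event log with columns user_id, event_date, and event_type.
The schema and engagement event labels are illustrative assumptions, not
Facebook's actual data model.

import pandas as pd

events = pd.read_csv("events.csv", parse_dates=["event_date"])

# Event types we count as "engagement" (assumed labels).
ENGAGEMENT_EVENTS = {"like", "comment", "save", "reaction"}
events["day"] = events["event_date"].dt.date

# Daily Active Users: unique users with any logged event that day.
dau = events.groupby("day")["user_id"].nunique()

# North star: % of active users with at least one engagement action that day.
eng = events[events["event_type"].isin(ENGAGEMENT_EVENTS)]
engaged = eng.groupby("day")["user_id"].nunique()
engagement_rate = engaged.reindex(dau.index, fill_value=0) / dau * 100

print(pd.DataFrame({"dau": dau, "engagement_pct": engagement_rate.round(1)}))
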
3. Experimentation - How to design an experiment?

a. Set up the hypothesis (state the null hypothesis & alternate hypothesis) -
what would the null and alternative hypotheses be in this case?
i. Null Hypothesis - There is no significant difference in user
engagement between the treatment and control groups.
ii. Alternative Hypothesis - There is a significant difference in user
engagement between the treatment and control groups.

b. Choice of test - Since we are comparing two proportions, we can use the
two-proportion z-test (see the sketch at the end of this section).

c. Choosing experiment control & treatment subjects
i. Who is the experiment being run on?
ii. Are we targeting all users on the platform, or should we pick a
proper segment of users for whom we feel this test is particularly
well suited?

d. Sample Size Calculation
i. Baseline metrics
1. Assume that before this feature launch, user engagement
is around 45%.
ii. Minimum detectable effect - what change is considered
meaningful enough for you to take action?
1. Assume that the business stakeholders are hoping for a 1%
(absolute) increase in user engagement in the treatment group.
iii. Significance level (usually 5%, i.e., 95% confidence)
iv. Power (usually 80%)
v. With the above numbers, you would need roughly 40K users in
each group to run this experiment with the desired significance
and power (see the sketch at the end of this section).

e. Experiment Duration - Based on the estimated sample size and the
approximate traffic -
i. Divide the total required sample size by the daily traffic entering
the experiment.
1. We need a total sample of 80K (40K in each group) based
on the above calculation.
2. Assume the experiment receives traffic of 5K users every day.
3. Experiment duration = 80K / 5K = 16 days.
f. Run the significance test on the north star metric once the required
sample size has been reached.
g. Continue monitoring the supporting metrics and guardrail metrics.
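
A minimal sketch of the sample-size, duration, and significance-testing steps
above, using Python's statsmodels (a tooling assumption; the notes don't
prescribe a library). The inputs are the case study's assumed numbers (45%
baseline, 1% absolute lift, 5% significance, 80% power, 5K users/day); the
final test counts are illustrative, not real results.

from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize, proportions_ztest

baseline, mde = 0.45, 0.01                  # 45% engagement, 1% absolute MDE
effect = proportion_effectsize(baseline + mde, baseline)   # Cohen's h

# Users needed per group for a two-sided test at 5% significance, 80% power.
n_per_group = NormalIndPower().solve_power(
    effect_size=effect, alpha=0.05, power=0.80, ratio=1.0)
print(f"Per group: {n_per_group:,.0f}")     # ~38.7K; the notes round to 40K

# Duration = total required sample / daily traffic entering the experiment.
daily_traffic = 5_000
print(f"Duration: {2 * n_per_group / daily_traffic:.1f} days")
# ~15.5 days; with the rounded 40K per group, 80K / 5K = 16 days as above.

# After collection, test the north star metric with a two-proportion z-test.
# Counts below are illustrative: 46% vs 45% engagement.
engaged_users = [18_400, 18_000]            # treatment, control
group_sizes = [40_000, 40_000]
z_stat, p_value = proportions_ztest(engaged_users, group_sizes)
print(f"z = {z_stat:.2f}, p = {p_value:.4f}")   # p < 0.05 -> significant
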

4. Testing Pitfalls - How to avoid common challenges / experiment bias?


a. Experimental Design Bias
i. Novelty / Primacy Effect -
1. Primacy Effect - When changes happen, some people who
got used to how things worked may be reluctant to change.
a. Some users in the treatment group are reluctant to
try the new feature because they were used to the
older status UI, so they post statuses less often.
b. So user engagement for the first 2 weeks is low: Wk 1 =
45% and Wk 2 = 48%.
c. But as these reluctant users see more users engaging
with the colored status button, they slowly start
using the feature more.
d. So from Wk 3 onwards, user engagement
stabilizes at 62%.
e. It's important not to take the first 2 weeks of low user
engagement, caused by the primacy effect, into
consideration when comparing with control.
i. Here the first 2 weeks alone would have shown
no significant difference between the two groups,
even though we subsequently see that the feature
actually gets users more engaged.
2. Novelty Effect - Some users resonate with the new change
and initially use it more frequently.
a. Some users in the treatment group get excited about
the new feature.
b. The excited users use the feature and engage more
in the first two weeks, after which the excitement
dies down.
c. So user engagement for the first 2 weeks is high: Wk 1
= 65% and Wk 2 = 68%.
d. But from Wk 3 onwards, user engagement
stabilizes at 52%.
e. It's important not to take the first 2 weeks of high
user engagement, caused by the novelty effect, into
consideration when comparing with control.

3. Both of these are short-term effects, so it's important
that results are not biased by them: treatment results may
be exaggerated or understated initially due to these effects.

4. Solutions (see the sketch at the end of this section):
a. Run the experiment for longer than required, if
possible, to observe any novelty or primacy effect.
b. Conduct the test only on first-time users (who have
no prior habit to unlearn).
c. Compare first-time users with experienced users in
the treatment group (to get an estimated impact of
the primacy / novelty effect).

ii. Group Interference -
Interference between variants happens often. It's important to
select your sample in such a way that this interaction doesn't
cause biased results.
1. E.g., suppose the treatment group is seeing a positive effect
because of the new FB status feature.
2. This effect can spill over to the control group (who do not
see the new feature but make new posts after seeing a
friend who is affected by the new feature in the treatment
group). This is called a network effect.
3. In this case, the measured difference underestimates the
treatment effect.
4. In reality the lift may actually be more than 1%, but due
to the network effect, Actual Effect > Measured Effect.
5. This can give the incorrect result that the new feature did
not significantly impact the north star metric.

b. Outcome Bias
i. Look out for other design or system issues that cause the measured
treatment effect to understate or overstate the actual effect.
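
A minimal sketch of the burn-in check suggested above, using made-up weekly
numbers that mirror the primacy-effect example; none of this is real
experiment data.

import pandas as pd

# Hypothetical weekly engagement rates per group (mirrors the primacy example).
weekly = pd.DataFrame({
    "week":      [1, 2, 3, 4, 5, 6],
    "treatment": [0.45, 0.48, 0.62, 0.62, 0.61, 0.62],
    "control":   [0.45, 0.45, 0.45, 0.46, 0.45, 0.45],
})

BURN_IN_WEEKS = 2   # discard early weeks distorted by novelty/primacy effects

stable = weekly[weekly["week"] > BURN_IN_WEEKS]
lift = stable["treatment"].mean() - stable["control"].mean()
print(f"Post burn-in lift: {lift:.1%}")

# If the early weeks differ sharply from the later weeks, extend the run and
# re-check before drawing conclusions.
early = weekly.loc[weekly["week"] <= BURN_IN_WEEKS, "treatment"].mean()
late = stable["treatment"].mean()
print(f"Treatment engagement, early vs stabilized: {early:.0%} vs {late:.0%}")
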

5. Recommendations based on experiment results - Launch or not?

a. Link results to the goal and business impact
i. Example: What does a 1% lift in engagement rate translate to in
revenue?
1. If the 1% lift increases ad revenue by $20M, launching
might be worth it; however, if it only increases revenue by
$50K, it might not be (based on effort estimation).
ii. Is it worth it to launch the product given all the costs?
iii. While the ideal scenario is that the increase in the success metrics
is significant and we see no degradation in the guardrail
metrics, give recommendations on what to do in case of
conflicting situations.
iv. Example: there's an increase in % user engagement among active
users, but daily active users have decreased.
v. Translate this into impact on users and the business -
1. Is the increased engagement among existing users bringing
in enough revenue to balance out the loss of some daily
active users?
2. For example:
a. Say daily active users were 5K earlier but have now
come down to 3K.
b. However, user engagement has increased from
45% to 65%.
c. Note that the count of engaged users actually falls
(5K × 45% = 2,250 vs 3K × 65% = 1,950), so the
launch is justified only if revenue per engaged user
rises enough to offset the loss.
d. If the increase in user engagement has led to a
revenue increase despite the loss of daily active
users, the feature might be worth considering.
e. It's good to think through a strategy to retain the
daily active users as a next step.
b. Consider the short-term and long-term impact of the launch -
i. Sometimes a short-term impression increase can conflict with the
brand image or the company's mission in the long run.
ii. One reasonable suggestion: even with the decrease in daily
active users, launching colored status backgrounds could
potentially bring more engaged users to the platform, and in the
long term the benefits may outweigh the drawbacks.

Use Case of A/B Testing across various domains:

Instructor Note:
● Next we have some quizzes based on different use cases of A/B Testing.
● Launch the quiz and then ask the learners to share their views on the goals of
each of these, to make it more engaging and to help them understand better.
______________________________________________________________________________

E-commerce Checkout Page Button Color:

Q. An e-commerce website wants to optimize its checkout page. They run an A/B test
where half of the visitors see a green "Buy Now" button, while the other half sees a red
"Buy Now" button.

What could be the goal of this experiment? Choose the most appropriate answer.

A. The goal is to determine which color increases the click-through rate and
leads to more purchases.
B. The goal is to assess whether changing the button color impacts customer
satisfaction with the website design.
C. The goal is to analyze the impact of button color on the average time spent on
the checkout page.
______________________________________________________________________________

Email Subject Lines for a Newsletter:

Q. A media company is testing different subject lines for its weekly newsletter. They
send two versions of the newsletter, each with a different subject line, to a subset of
their subscribers.

What would be the most appropriate action item for the media company as per the
ongoing experiment?

A. They conduct a survey to gather feedback from subscribers about the subject
line for their respective group.
B. They change the font style & color among the two versions of the newsletter to
see if it has any impact on subscriber interaction.
C. They increase the frequency of newsletter distribution for one group to see if
more frequent newsletters result in higher engagement.
D. They measure open rates to determine which subject line is more effective at
capturing reader attention.
______________________________________________________________________________

E-commerce Product Page Layout:

Q. An online retailer is experimenting with the layout of its product pages. They test
two variations: one with the product description at the top and one with customer
reviews at the top.

What would be the most appropriate choice of next step in the currently ongoing
experiment?

A. They analyze the number of social media shares for each product to understand
the impact of the layout variations on brand popularity.
B. The company measures the click-through rate and conversion rate to
determine which layout leads to more sales.
C. They conduct a survey to gather customer preferences on the more appealing
layout of the product pages of the website.
______________________________________________________________________________

Streaming Service Content Recommendation Algorithm:

Q. A streaming service wants to enhance its content recommendation algorithm. They
split their user base into two groups: one group receives recommendations from the
current algorithm, and the other group experiences recommendations from a new
algorithm.

What would be the most appropriate choice of next step in the currently ongoing
experiment?

A. They analyze the number of titles recommended during the testing period to
assess the impact of the algorithm changes.
B. They monitor the recommendations generated by the new algorithm and
compare them with the recommendations from rival platforms.
C. They compare user engagement metrics, such as time spent watching
content and user ratings, between the two algorithms to decide which one
should be kept as the default for the entire user base.
D. They analyze the variety of categories of titles recommended during the
testing period to assess the impact of the algorithm changes.
______________________________________________________________________________
