Facebook Status Colour Change A/B Testing Case Study
Lecture Objective:
● Business sense towards A/B testing.
● How to launch A/B testing to get “valid” and “reliable” results.
○ Developing hypothesis
○ Designing A/B tests
○ Evaluating test results
○ Making decisions
● Launch recommendations and pitfalls with A/B testing setups.
● Design an experiment:
○ Structure
○ Right Flow of Steps
○ Develop the right Hypothesis
○ Sample Size Calculation
○ Duration
○ Pitfalls
● Identifying the right metric for testing significance - North Star metric
● Sample Segmentation for valid results - How would you take care of nuances
and sample bias?
● How conclusively will the candidate be able to make a decision based on A/B
test results?
Case Study:
2. Discuss the metrics that the feature is expected to impact and the data
available.
a. Give a list of metrics and finalize the metric we want to test significance
on.
i. Metrics - In this case study we can focus on the feature's primary
impact: higher user engagement, more daily active users, etc.
ii. User Engagement - % of active users who have engaged with
Facebook in some way (likes, comments, saves, reactions).
iii. Daily Active Users - # of unique users who have logged on to
Facebook each day. We expect this metric to increase with this
new feature.
b. Come up with north star metrics, supporting metrics (if applicable) and
guard rail metrics.
i. North Star Metric - % of users with engagement
ii. Supporting Metric - Daily Active Users
iii. Guard Rail Metric (a metric that should not degrade in
pursuit of a new feature) - % of media content (assuming media
content provides more value, we don't want this % to decrease
because of this feature). Alternatively, because this feature takes
up so much space, check whether people interact with fewer posts
on average.
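The metric definitions above can be sketched in code. This is a minimal illustration, assuming a hypothetical event log of (date, user_id, action) tuples in which any event marks the user as active and the actions like, comment, save and reaction count as engagement; the field names and sample rows are invented for illustration and are not from the case study.

```python
from collections import defaultdict

# Assumption: these action names stand in for "engaged in some way".
ENGAGEMENT_ACTIONS = {"like", "comment", "save", "reaction"}

# Hypothetical event log: (date, user_id, action).
events = [
    ("2024-01-01", "u1", "login"),
    ("2024-01-01", "u1", "like"),
    ("2024-01-01", "u2", "login"),
    ("2024-01-01", "u3", "login"),
    ("2024-01-01", "u3", "comment"),
    ("2024-01-02", "u1", "login"),
    ("2024-01-02", "u2", "login"),
]

def daily_metrics(events):
    """Return DAU and engagement rate (engaged users / active users) per day."""
    active = defaultdict(set)   # day -> users with any event that day
    engaged = defaultdict(set)  # day -> users with at least one engagement action
    for day, user, action in events:
        active[day].add(user)
        if action in ENGAGEMENT_ACTIONS:
            engaged[day].add(user)
    return {
        day: {
            "dau": len(users),
            "engagement_rate": len(engaged[day]) / len(users),
        }
        for day, users in active.items()
    }

metrics = daily_metrics(events)
for day, values in sorted(metrics.items()):
    print(day, values)
```

In an experiment, the same computation would be run separately for the control and treatment groups so that the north star metric can be compared between them.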
3. Experimentation - How to design an experiment?
b. Choice of test - Since we are comparing two proportions, we can use the
two-proportion Z-test.
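A minimal sketch of a two-proportion Z-test follows, using the pooled standard error and a two-sided p-value. The function name and the user counts are hypothetical and chosen only to illustrate the calculation; they are not results from the case study.

```python
import math

def two_proportion_z_test(x_control, n_control, x_treatment, n_treatment):
    """Two-sided Z-test for the difference between two proportions.

    x_* is the number of engaged users and n_* the number of users in each group.
    Returns (z statistic, two-sided p-value).
    """
    p_control = x_control / n_control
    p_treatment = x_treatment / n_treatment
    # Pooled proportion under the null hypothesis that both groups share one rate.
    p_pool = (x_control + x_treatment) / (n_control + n_treatment)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_control + 1 / n_treatment))
    z = (p_treatment - p_control) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: 2,000 of 10,000 control users engaged,
# 2,150 of 10,000 treatment users engaged.
z, p_value = two_proportion_z_test(2000, 10000, 2150, 10000)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

If the p-value falls below the significance level chosen before the test (commonly 0.05), the difference in engagement rate is treated as statistically significant. The sketch assumes the samples are large enough for the normal approximation to hold.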
4. Solutions:
a. Run the experiment for longer than required, if
possible, to observe any novelty or primacy effect.
b. The test can be conducted on first-time users
only.
c. Compare first-time users with experienced users in
the treatment group (this gives an estimate of the
impact of the primacy / novelty effect).
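The cohort comparison in the last solution can be sketched as follows. The counts are hypothetical, and the interpretation is a rough heuristic rather than a formal test: first-time users have no prior experience of the old design, so their engagement rate serves as a baseline that is free of novelty and primacy effects.

```python
# Hypothetical treatment-group counts per cohort: (engaged users, total users).
treatment = {
    "first_time": (300, 1000),
    "experienced": (450, 1000),
}

# Engagement rate per cohort within the treatment group.
rates = {
    cohort: engaged / total
    for cohort, (engaged, total) in treatment.items()
}

# Gap between experienced and first-time users in the treatment group.
gap = rates["experienced"] - rates["first_time"]

print(f"first-time rate:  {rates['first_time']:.2%}")
print(f"experienced rate: {rates['experienced']:.2%}")
print(f"gap (experienced - first-time): {gap:+.2%}")
```

A clearly positive gap hints at a novelty effect and a clearly negative gap hints at a primacy effect. Because the two cohorts may also differ in baseline engagement, it is safer to compare this gap against the same gap in the control group before drawing a conclusion.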
b. Outcome Bias
i. Look out for other design or system issues that could cause the
treatment effect to be underestimated or overestimated.
Instructor Note:
● Next we have some quizzes based on different use cases of A/B Testing.
● Launch the quiz, then ask the learners to share their views on the goal of
each use case, to make it more engaging and help them understand better.
______________________________________________________________________________
What could be the goal of this experiment? Choose the most appropriate answer.
A. The goal is to determine which color increases the click-through rate and
leads to more purchases.
B. The goal is to assess whether changing the button color impacts customer
satisfaction with the website design.
C. The goal is to analyze the impact of button color on the average time spent on
the checkout page.
______________________________________________________________________________
Q. A media company is testing different subject lines for its weekly newsletter. They
send two versions of the newsletter, each with a different subject line, to a subset of
their subscribers.
What would be the most appropriate action item for the media company as per the
ongoing experiment?
A. They conduct a survey to gather feedback from subscribers about the subject
line for their respective group.
B. They change the font style & color between the two versions of the newsletter
to see if it has any impact on subscriber interaction.
C. They increase the frequency of newsletter distribution for one group to see if
more frequent newsletters result in higher engagement.
D. They measure open rates to determine which subject line is more effective at
capturing reader attention.
______________________________________________________________________________
What would be the most appropriate choice of next step in the currently ongoing
experiment?
A. They analyze the number of social media shares for each product to understand
the impact of the layout variations on brand popularity.
B. The company measures the click-through rate and conversion rate to
determine which layout leads to more sales.
C. They conduct a survey to gather customer preferences on the more appealing
layout of the product pages of the website.
______________________________________________________________________________
What would be the most appropriate choice of next step in the currently ongoing
experiment?
A. They analyze the number of products recommended during the testing period to
assess the impact of the algorithm changes.
B. They monitor the recommendations generated by the new algorithm and
compare them with the recommendations from their rival platforms.
C. They compare user engagement metrics, such as time spent watching
content and user ratings, between the two algorithms to decide which one
should be kept as the default for the entire user base.
D. They analyze the variety of categories of products recommended during the
testing period to assess the impact of the algorithm changes.
______________________________________________________________________________