HBR Working Paper
Michael Luca
Harvard Business School
Experimental Evidence*
October 1, 2016
Abstract
Paid search has become an increasingly common form of advertising, comprising about half
of all online advertising expenditures. To shed light on the effectiveness of paid search, we design
and analyze a large-scale field experiment on the review platform Yelp.com. The experiment
consists of roughly 18,000 restaurants and 24 million advertising exposures – randomly assigning
paid search advertising packages to more than 7,000 restaurants for a three-month period, with
randomization done at the restaurant level to assess the overall impact of advertisements. We
find that advertising increases a restaurant’s Yelp page views by 25% on average. Advertising
also increases the number of purchase intentions – including getting directions, browsing the
restaurant’s website, and calling the restaurant – by 18%, 9%, and 13% respectively, and raises
the number of reviews by 5%, suggesting that advertising also affects the number of
restaurant-goers. All advertising effects drop to zero immediately after the advertising period.
A back-of-the-envelope calculation suggests that advertising would produce a positive return on
average for restaurants in our sample.
* We thank Susan Athey, Garrett Johnson, Randall Lewis and participants at the NBER Summer Institute for
valuable comments. We thank Yelp, and especially Geoff Donaker, Matt Halprin, Luther Lowe, Travis Brooks, Brian
Dean, and Stephen Lyons for providing support for this experiment. Please contact Daisy Dai ([email protected]) or
Mike Luca ([email protected]) for correspondence.
1 Introduction
Internet advertising has been the fastest-growing marketing channel in recent years, accounting
for roughly $60 billion of spending in the United States alone in 2015. The rise of digital advertising
has been dramatic – more than doubling over the past five years alone. Paid search, in which
advertisements are placed alongside search results, comprises the largest share of online advertising
expenditures.
In the offline world, advertising has historically resembled a credence good, where the effective-
ness of a product is taken largely on faith. Even after an advertising campaign is implemented,
limited access to outcome data and exogenous variation in exposure have often prevented credible
estimates of the impact of advertising.
In principle, the digital age provides new opportunities to evaluate the effectiveness of adver-
tisements, enabled by granular data about users and increased feasibility of experimentation. Yet
estimating the effectiveness of digital advertising remains challenging. Correlations in observational
data can yield biased estimates because advertisements are more likely to be shown to people who
are interested and therefore prone to make a purchase even without an advertisement (Johnson,
Lewis and Reiley 2014; Blake et al. 2015; Gordon et al. 2016). Moreover, large sample sizes are
needed to estimate what are often modest impacts of individual advertising strategies with sufficient
precision (Lewis and Reiley 2014; Lewis and Rao 2015; Athey and Imbens 2016).
To explore the impact of paid search advertising across a broad set of small businesses, we
design and analyze a large-scale field experiment on the popular review website Yelp.com. On Yelp,
a business can purchase standard advertising packages for a fixed rate, which guarantees a minimum
number of advertising impressions to be shown each month. Taking these packages as given, we
experimentally assign standard advertising packages to more than 7,000 restaurants during a three-
month period, sending out roughly 24 million total advertising exposures. We focus on restaurants
that had not actively advertised on Yelp prior to the experiment. We then monitor
business-level outcomes, including page views of the business’s Yelp page (a standard measurement of
advertising effectiveness) as well as three measures that are designed to measure customer intentions
to go to the restaurant – requests for directions, phone calls to the restaurant from Yelp’s mobile
page or mobile app, and clicks on the restaurant’s own URL on its Yelp page.
Overall, we find that advertising increases a restaurant’s Yelp page views by 25%. Within the
industry, the number of clicks on a page is widely considered an indication of consumer preferences.
In fact, companies including Yelp, Bing, and Google design advertisements with this objective in
mind. Nonetheless, in principle, users might click on a page that is ultimately a mismatch. To
reinforce our interpretation of clicks as a proxy for demand, we selected a sample size large
enough to detect effects on our three indicators of purchase intention.
The standard advertising package leads to 10-20% more purchase intentions. This reinforces
our interpretation that Yelp advertisements lead to increases in demand. However, the increase in
conversions is smaller than the increase in page visits. This implies that the marginal visitors
arriving through advertisements differ from the average visitor arriving through an organic click.
As a final indicator of demand, we consider the number of reviews that are left during the
advertising period. In principle, if more Yelp users are actually going to the restaurant, then one
would expect to see an increase in the number of reviews being left. Consistent with this, we find
that advertising leads to a 5.4% increase in the number of reviews left for a restaurant in a given
month. However, the effects of advertising drop to zero immediately after the advertising period,
suggesting that advertising temporarily raises awareness of businesses that users would otherwise
not discover. A back-of-the-envelope calculation suggests that advertising leads to an 8% increase
in revenue and would produce a positive return, on average, within our sample.
Our findings contribute to the literature on the effectiveness of online advertising. Most online
advertising studies have focused on experiments run by a single large company, enabled by the large
consumer base that each of these companies has access to. For example, Blake et al. (2015) and
Avery et al. (2016) show that digital advertisements (on Bing and Facebook, respectively) have very
little effect on demand for eBay and the College Board. In contrast, we show large average effects
across a broad swath of potential advertisers on Yelp. These restaurants are much smaller and
less well known than eBay or the College Board and hence may benefit more from advertisements.
More generally, our results demonstrate the potential of sponsored search to drive outcomes – even
among businesses that have opted not to advertise. Whereas Blake et al. (2015) conclude that many
well-branded advertisers might be better off by not advertising, our results raise the possibility that
many less-branded non-advertisers might be better off by advertising.
2 Experiment and Data
The experiment was conducted on the popular review platform Yelp.com. Yelp hosts user-
generated reviews of local businesses around the world and provides search functions in browsers as
well as its own mobile app. As of 2015, Yelp had roughly 163 million unique visitors monthly. We
focus on the US restaurant industry in our experiment.
The goal of the experiment is to examine the effect of the search advertising package currently
provided by Yelp. Businesses pay a fixed monthly fee for the advertising package and are guaranteed
a minimum number of search appearances. Yelp manages and automates the paid search bidding
for the advertised businesses. Figure 1 shows examples of paid search appearances.
To evaluate advertising effectiveness, we randomly select businesses to be given advertising
packages for three months from August 1, 2015, to October 31, 2015. In choosing the sample, we
only include businesses that are eligible to purchase search advertising and are actively operating
businesses with basic verified listing information, including photos and a minimum number of trusted
reviews with good ratings.
We use stratified sampling in the randomization process. Stratified randomization, which
randomizes treatment within groups of businesses with similar characteristics, increases statistical
power (Athey and Imbens 2016; Imbens and Rubin 2015). The strata are as follows. The
first stratum includes Yelp Reservations restaurants. Yelp Reservations is an online reservation sys-
tem hosted by Yelp that allows consumers to directly make reservations using the simple booking
tool on the restaurant’s Yelp page. The second stratum includes restaurants that are clients of
OpenTable, an Internet booking platform external to Yelp. We use a separate stratum for these
restaurants since they are similar and are usually mid-range to high-end full-service restaurants. The
third stratum includes restaurants that partner with Yelp’s delivery service, EAT24. The fourth
stratum includes restaurants in Washington State (because we have done prior research focusing
on this state) that do not use OpenTable and are not Yelp partners. These restaurants have less
online prominence on average. As shown in Panel B of Figure 3, restaurants in stratum 4 have fewer
reviews than restaurants in other strata. The number of treated restaurants in each stratum is shown in Table
1. We also restrict the proportion of restaurants treated in the Yelp-defined marketing area to be
smaller than 10%. In principle, we can observe two additional metrics beyond the ones in this anal-
ysis: reservations for Yelp Reservations restaurants and deliveries for EAT24 restaurants. However,
a limited number of restaurants use each of these services, preventing us from obtaining conclusive
estimates on those two variables.
As shown in Table 1, the final experiment sample includes 18,295 restaurants, of which 7,210 are
given free advertising packages. The experiment started on August 1, 2015, and ended on October
31, 2015. To avoid Hawthorne effects, restaurants are not told about the advertising packages and
are removed from the advertising sales lists.
Randomization is done at the business level. We randomize at the business, rather than user,
level for two reasons. First, users who are randomly exposed to ads may generate purchases and
subsequently reviews (which we did observe in the results), and therefore might influence consumers
who are not exposed to ads, generating spillover. Second, we were interested in the overall impact
of the standard advertising package on outcomes for a business.
The key metric we use to assess advertising effectiveness is the number of page views of a
restaurant’s Yelp page, a standard measure in the advertising industry. In addition, we collect
conversion metrics to measure consumers’ purchase intent – map queries on Yelp, calls following
consumers’ clicks on a restaurant’s phone number on Yelp’s mobile page or app, and clicks on a
restaurant’s own website link on its Yelp page. As an indicator of consumer demand, we also consider
reviews left during the advertising period. Finally, for Yelp Reservations restaurants, we observe the
number of reservations made through Yelp. However, operational limitations within the company
prevented us from having a large enough sample to detect meaningful effects on reservations (which
we determined going into the experiment, but could not change).
One advantage of our experiment is that we are able to randomize among a large set of
representative restaurants in the market and hence obtain the average treatment effect for businesses
typically unable to run their own experiments. This also allows a contrast with some of the large
experiments run by individual advertisers. While we present unweighted results, we can reweight to
obtain treatment effects for different populations.
3 Conceptual Framework
where $i$ is a restaurant and $t$ is a month. We sample the periods between January 2015 and
December 2015. $AdsOn_{it}$ is a dummy variable that equals one for a treated restaurant during
the experimental period (8/1/2015-10/31/2015). $TreatedPostExpr_{it}$ is an indicator variable that
equals one for a treated restaurant during the post-experiment period (11/1/2015-12/31/2015). In
the baseline specification, we also control for business and month fixed effects. The parameters of
interest are $\beta_1$ and $\beta_2$, which represent the average treatment effects during and after
the experiment (advertising) period.
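Based on the definitions in this paragraph, equation 1 can be sketched as a two-way fixed-effects regression. This is a reconstruction under assumptions: the outcome symbol $y_{it}$, the month effect $\tau_t$, the error term $\epsilon_{it}$, and the coefficient names $\beta_1$ and $\beta_2$ (the coefficients on the two indicators) are assumed notation; only the two indicator variables and the business fixed effect $\mu_i$ are named in the text.

```latex
y_{it} = \beta_1 \, AdsOn_{it} + \beta_2 \, TreatedPostExpr_{it} + \mu_i + \tau_t + \epsilon_{it}
```

Under this form, $\beta_1$ compares treated and control restaurants during the advertising months and $\beta_2$ does the same for the two months after the experiment, net of restaurant and month effects.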
All results are reported in percent changes to protect level variables that are sensitive within
the industry. More specifically, we first calculate the percentage as the ratio of the advertising
effect estimated in equation 1 to the control (non-advertising) group average during the treatment
period within each stratum, and then we take the weighted average across strata.1 For statistical
inference on the average percentage gain, we need to account for the correlation between
treatment effects and control group outcomes in the ratio calculation as well as the correlation
of ratios across different strata. We therefore derive the confidence interval of the ratio by
bootstrapping. Significance levels are qualitatively unchanged for level estimates.
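The percent-change calculation and its bootstrap interval can be illustrated with a minimal Python sketch. It makes several assumptions that are not in the paper: a simple difference in means within each stratum stands in for the regression estimate from equation 1, restaurants are resampled with replacement within each stratum and arm, the interval is a percentile interval, and the function names and numbers are hypothetical.

```python
import random
from statistics import mean


def pct_effect(strata, weights):
    """Weighted average across strata of (treated mean - control mean) / control mean.

    Assumption: the difference in means stands in for the regression-based
    advertising effect; `strata` maps a stratum name to (treated, control) lists.
    """
    total = 0.0
    for name, (treated, control) in strata.items():
        ctrl_mean = mean(control)
        total += weights[name] * (mean(treated) - ctrl_mean) / ctrl_mean
    return total


def bootstrap_ci(strata, weights, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the weighted percent effect.

    Numerator and denominator are recomputed on every resample, so their
    correlation is carried into the interval.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_boot):
        resampled = {
            name: (rng.choices(treated, k=len(treated)),
                   rng.choices(control, k=len(control)))
            for name, (treated, control) in strata.items()
        }
        draws.append(pct_effect(resampled, weights))
    draws.sort()
    lower = draws[int((alpha / 2) * n_boot)]
    upper = draws[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Made-up page-view counts for two hypothetical strata (not data from the paper).
example_strata = {
    "A": ([12, 15, 11, 14], [10, 10, 10, 10]),
    "B": ([22, 26], [20, 20]),
}
example_weights = {"A": 0.5, "B": 0.5}
print(pct_effect(example_strata, example_weights))
print(bootstrap_ci(example_strata, example_weights))
```

Resampling the ratio as a whole, rather than the numerator and denominator separately, is one simple way to respect the correlation the paragraph describes.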
We check other specifications for robustness. Using the linear regression framework,
we drop the business-level fixed effect $\mu_i$ and replace it simply with the treatment indicator, and the
results are unaffected. We also estimate a Poisson generalized linear model and obtain
similar results. As an alternative method of testing for advertising effects, we run the Fisher
exact test to confirm that we can reject the null at each equation level (Athey and Imbens 2016;
Young 2016).
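The idea behind a Fisher-style randomization test can be sketched in a few lines of Python. This is an illustrative simplification, not the procedure used in the paper: it uses the absolute difference in means as the test statistic, approximates the randomization distribution with random reshuffles of the treatment labels, and ignores stratification (in a stratified design the labels would be reshuffled within each stratum). The function names and data are hypothetical.

```python
import random
from statistics import mean


def diff_in_means(outcomes, treated):
    """Mean outcome of treated units minus mean outcome of control units."""
    t = [y for y, d in zip(outcomes, treated) if d]
    c = [y for y, d in zip(outcomes, treated) if not d]
    return mean(t) - mean(c)


def randomization_p_value(outcomes, treated, n_perm=5000, seed=0):
    """Approximate p-value under the sharp null of no effect for any unit.

    Under that null the outcomes are fixed, so reshuffling the treatment
    labels traces out the distribution of the test statistic.
    """
    rng = random.Random(seed)
    observed = abs(diff_in_means(outcomes, treated))
    labels = list(treated)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if abs(diff_in_means(outcomes, labels)) >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (extreme + 1) / (n_perm + 1)


# Made-up outcomes: six treated units followed by six control units.
example_outcomes = [30, 32, 31, 33, 29, 34, 20, 21, 19, 22, 20, 18]
example_treated = [True] * 6 + [False] * 6
print(randomization_p_value(example_outcomes, example_treated))
```

Because the test relies only on the random assignment itself, it does not depend on distributional assumptions about the outcome.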
We can also view outcomes as resulting from different depths of search, as we expect the ad-
vertising effect on consumer activity to decline with each additional step in the search process. For
example, the consumer can only click on a restaurant’s own website link after visiting the restaurant’s
Yelp page. Hence, the probability of observing a visit to a restaurant’s website can be written as
$P(\mathit{URLClick}) = P(\mathit{URLClick} \mid \mathit{PageView}) \times P(\mathit{PageView} \mid \mathit{Appearance}) \times P(\mathit{Appearance})$, where
appearance is the appearance of the restaurant in the search result. We call the first conditional
probability “conversion rate” and the second one “click-through rate.” Paid search advertising can
increase the number of restaurant URL clicks (i.e., visits to the restaurant’s own website) by
increasing the chance that the restaurant is seen by consumers (search appearance), but the impact
on URL clicks depends on both click-through rate and conversion rate. If the conversion rate for
paid clicks is the same as for baseline organic clicks, we should see the same percentage increase in
URL clicks and page views. However, the number of URL clicks may increase less if marginal users
arriving through advertisements are less likely to click on the URL relative to average consumers
viewing the Yelp page.

1 In the baseline results in Table 2, the weight is simply the sampling weight across strata.
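The funnel argument can be made concrete with a small Python sketch. All numbers are hypothetical and chosen only to illustrate the decomposition; the `Channel` class and `percent_lifts` function are illustrative names, not anything from the paper or from Yelp.

```python
from dataclasses import dataclass


@dataclass
class Channel:
    appearances: float  # times the restaurant appears in search results
    ctr: float          # P(PageView | Appearance), the click-through rate
    conv: float         # P(URLClick | PageView), the conversion rate

    @property
    def page_views(self):
        return self.appearances * self.ctr

    @property
    def url_clicks(self):
        return self.page_views * self.conv


def percent_lifts(organic, paid):
    """Lift from adding paid traffic, relative to the organic baseline."""
    return (paid.page_views / organic.page_views,
            paid.url_clicks / organic.url_clicks)


# Hypothetical baseline: 1,000 organic page views and 100 URL clicks.
organic = Channel(appearances=10_000, ctr=0.10, conv=0.10)

# Paid visitors who convert like organic ones: both metrics rise by the same share.
same_conv = Channel(appearances=5_000, ctr=0.05, conv=0.10)
print(percent_lifts(organic, same_conv))

# Paid visitors who convert less often: URL clicks rise by less than page views.
lower_conv = Channel(appearances=5_000, ctr=0.05, conv=0.04)
print(percent_lifts(organic, lower_conv))
```

In the second case the lift in URL clicks is smaller than the lift in page views even though the advertisement generated the same number of additional page views.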
4 Results
The effects of paid search advertising across all outcomes are presented in Table 2, and the effects
on key outcomes are plotted in Figure 4. Overall, we find that advertising increases a restaurant’s
Yelp page views by 24.6%, and the increase is greater for mobile views (30.2%) than for web browser
views (21.5%). Across the three conversion metrics, advertising led to an increase in map inquiries
by 17.7%, in calls by 12.6%, and in clicks on restaurants’ own URLs by 8.8%. This suggests that
Yelp advertisements lead to increases in demand.
These results demonstrate that advertisements on Yelp are effective at increasing a variety of
metrics indicating consumer intentions to visit the restaurant. We also find that the increase in
conversion metrics is smaller than the increase in page views, suggesting that the marginal visitor
driven from advertising is less likely than an average visitor to convert to a purchase.
As a final indicator of demand, we examine the number of reviews. We find an average 5.4%
increase in the number of reviews due to advertising. However, while advertising generates more
reviews, there is no significant change to the average rating, the review length, or the percent of
reviews that are removed by Yelp’s filter.
The temporal effect of advertising is shown in the post-experiment estimates column in Table 2.
Overall, we find that the effects of advertising disappear immediately after the advertising period.
When estimating advertising effects for the first, second, and third advertising month separately,
we do not find any evidence that how long the business has been advertising matters. This can
also be seen in Figure 2, which shows the daily total page views for control and treated restaurants
in two strata (EAT24 and Washington State restaurants) during our sample period. In these figures,
we see that the gain in total page views stays roughly constant across the three months, and the gain
disappears immediately after the experiment. These findings suggest that advertising provides
information about a business that users would otherwise not discover.
5 Discussion
Overall, our results shed light on the effectiveness of paid search advertising for small businesses.
To our knowledge, this is the largest-scale advertising effectiveness study in terms of the number
of businesses involved. For the types of small businesses that are common on Yelp and similar
platforms, we show that advertising does indeed have a large impact. Comparing effects on page
views, three consumer intent measures, and the number of reviews, we find smaller effects on purchase
intent and on the purchase indicator than on page views. This highlights the fact that the marginal
consumers acquired from the paid search advertising are systematically different from the average
person viewing the page. While we lack the volume of sales information required to directly estimate
the impact of advertising on sales revenue, we obtain sales revenue for a sample of restaurants in
Washington state. Looking at the relationship between changes in revenue and changes in page
views, we estimate the impact of advertising on sales using this proxy. As we show below, the
return on paid search advertising is positive, on average, for our sample of restaurants, which
consists of restaurants that were not advertising on Yelp; one might expect a larger return for
businesses that are advertising.
To estimate the return on advertising, we need an estimate of the dollar value of marginal page views.
To do so, we matched a subset of Washington state restaurants (totaling 835 restaurants) back to
tax records, which we obtained for the first half of 2015 (the period just before our experiment) from
the Washington State Department of Revenue. While we do not have enough data on sales revenue
for restaurants in our sample to use this as a dependent variable, we can look at the relationship
between changes in sales and changes in page views in the sample, which we use as a proxy for the
change in revenue that would be associated with a change in page views generated by advertising.
Specifically, we conduct the following regression to obtain the effect of changes in page views on
changes in revenue.
$$\log(revenue_{it}) = \beta \log(pageview_{it}) + \alpha_i + \gamma_t + \epsilon_{it}$$
where $i$ indexes restaurants and $t$ indexes quarters. We add restaurant
fixed effects to examine within-restaurant variation, and we add quarterly dummies to control for
common time trends. The estimate for $\beta$ is 32.54%, significant at the 1% level, with standard errors
clustered at the business level. This means a 10% increase in the total number of quarterly page
views leads to a 3.3% increase in quarterly revenue. Advertising leads to an average increase of
24.6% in total page views according to our experiment, and hence an increase of roughly 8% in
revenue. For example, for a business with $96,000 in quarterly sales (the median in Washington
State), and given a profit margin on additional sales of roughly 70%,2 the return on
advertising would be 446%.3 An important limitation of this calculation is the fact that marginal
clicks deriving from advertising may have differential impacts relative to marginal clicks deriving
from sales. Nonetheless, this suggests that paid search advertising can be a profitable investment
for small businesses, even among ones that are not currently advertising.
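The back-of-the-envelope arithmetic can be reproduced with a short Python sketch. The inputs are the figures quoted in the text and footnotes (elasticity of 0.3254, page-view lift of 24.6%, $96,000 in quarterly sales, 7.97% revenue lift, 70% margin, $1,200 quarterly advertising cost); the function name is illustrative, and reading the 446% figure as gain divided by cost is an inference from those numbers.

```python
def advertising_return(quarterly_sales, revenue_lift, margin, ad_cost):
    """Return (quarterly gain in dollars, gain as a multiple of advertising cost)."""
    gain = quarterly_sales * revenue_lift * margin
    return gain, gain / ad_cost


# Elasticity times the page-view lift gives the approximate revenue lift (about 8%).
elasticity = 0.3254
page_view_lift = 0.246
approx_revenue_lift = elasticity * page_view_lift
print(f"approximate revenue lift: {approx_revenue_lift:.2%}")

# The footnoted calculation uses a 7.97% revenue lift.
gain, ratio = advertising_return(
    quarterly_sales=96_000, revenue_lift=0.0797, margin=0.70, ad_cost=1_200
)
print(f"gain per quarter: ${gain:,.0f}")
print(f"gain relative to cost: {ratio:.0%}")
```

On these inputs the gain is about 4.5 times the advertising cost, which is where the 446% figure comes from.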
References
[1] Daniel A Ackerberg. Empirically distinguishing informative and prestige effects of advertising.
RAND Journal of Economics, pages 316–333, 2001.
[2] Susan Athey and Guido Imbens. The econometrics of randomized experiments. arXiv preprint
arXiv:1607.00698, 2016.
[3] Kyle Bagwell. The economic analysis of advertising. Handbook of industrial organization,
3:1701–1844, 2007.
[4] Thomas Blake, Chris Nosko, and Steven Tadelis. Consumer heterogeneity and paid search
effectiveness: A large-scale field experiment. Econometrica, 83(1):155–174, 2015.
[5] Tat Y Chan, Chunhua Wu, and Ying Xie. Measuring the lifetime value of customers acquired
from Google search advertising. Marketing Science, 30(5):837–850, 2011.
[6] J. Cohen. Statistical Power Analysis for the Behavioral Sciences. Taylor & Francis, 2013.
[7] Benjamin Edelman. The design of online advertising markets. Handbook of Market Design,
2010.
[8] Avi Goldfarb. What is different about online advertising? Review of Industrial Organization,
44(2):115–129, 2014.
2 According to accounting firm Baker Tilly (http://www.bakertilly.com/uploads/restaurant-benchmarking.pdf),
the variable cost (food cost, or in other words, the cost of goods sold) is roughly 28-32% of total sales.
3 The gain from advertising is $96,000 × 7.97% × 70% = $5,356 per quarter, and the cost of advertising is $1,200 per
quarter.
[9] Avi Goldfarb and Catherine Tucker. Online display advertising: Targeting and obtrusiveness.
Marketing Science, 30(3):389–404, 2011.
[10] Brett Gordon, Florian Zettelmeyer, Neha Bhargava, and Dan Chapsky. A comparison of
approaches to advertising measurement: Evidence from big field experiments at Facebook.
White paper, 2016.
[11] Guido W Imbens and Donald B Rubin. Causal inference in statistics, social, and biomedical
sciences. Cambridge University Press, 2015.
[12] Garrett A Johnson, Randall A Lewis, and Elmar I Nubbemeyer. The online display ad effec-
tiveness funnel & carry-over: A meta-study of ghost ad experiments. Working Paper, 2015.
[13] Garrett A Johnson, Randall A Lewis, and David Reiley. Location, location, location: repetition
and proximity increase advertising effectiveness. Available at SSRN 2268215, 2014.
[14] Soohyung Lee and Azeem M. Shaikh. Multiple testing and heterogeneous treatment effects:
Re-evaluating the effect of Progresa on school enrollment. Journal of Applied Econometrics,
29(4):612–626, 2014.
[15] Randall A Lewis and Justin M Rao. The unfavorable economics of measuring the returns to
advertising. The Quarterly Journal of Economics, page qjv023, 2015.
[16] Randall A. Lewis, Justin M. Rao, and David H. Reiley. Here, there, and everywhere: Correlated
online behaviors can lead to overestimates of the effects of advertising. In Proceedings of the
20th International Conference on World Wide Web, WWW ’11, pages 157–166, New York, NY,
USA, 2011. ACM.
[17] Randall A Lewis and David H Reiley. Online ads and offline sales: measuring the effect of retail
advertising via a controlled experiment on Yahoo! Quantitative Marketing and Economics,
12(3):235–266, 2014.
[18] John A. List, Azeem M. Shaikh, and Yang Xu. Multiple hypothesis testing in experimental
economics. Working Paper 21875, National Bureau of Economic Research, January 2016.
[19] Puneet Manchanda, Jean-Pierre Dubé, Khim Yong Goh, and Pradeep K Chintagunta. The
effect of banner advertising on internet purchasing. Journal of Marketing Research, 43(1):98–
108, 2006.
[20] K.R. Murphy, B. Myors, and A. Wolach. Statistical Power Analysis: A Simple and General
Model for Traditional and Modern Hypothesis Tests, Fourth Edition. EBL-Schweitzer. Taylor
& Francis, 2014.
[21] Sridhar Narayanan and Kirthi Kalyanam. Position effects in search advertising: A regression
discontinuity approach. Technical report, Stanford University, 2014.
[22] David H Reiley, Sai-Ming Li, and Randall A Lewis. Northern exposure: A field experiment
measuring externalities between search advertisements. In Proceedings of the 11th ACM con-
ference on Electronic commerce, pages 297–304. ACM, 2010.
[23] Oliver J Rutz and Randolph E Bucklin. From generic to branded: A model of spillover in paid
search advertising. Journal of Marketing Research, 48(1):87–102, 2011.
[24] Navdeep S Sahni. Advertising spillovers: Evidence from online field-experiments and implica-
tions for returns on advertising. Journal of Marketing Research, 2016.
[25] Navdeep S Sahni and Harikesh Nair. Does advertising serve as a signal? evidence from field
experiments in mobile search. Working Paper (January 14, 2016), 2016.
[26] Sha Yang and Anindya Ghose. Analyzing the relationship between organic and sponsored search
advertising: Positive, negative, or zero interdependence? Marketing Science, 29(4):602–623,
2010.
[27] Song Yao and Carl F Mela. A dynamic model of sponsored search advertising. Marketing
Science, 30(3):447–468, 2011.
[28] Alwyn Young. Channeling Fisher: Randomization tests and the statistical insignificance of
seemingly significant experimental results. Working paper, 2015.
Figures and Tables
Notes to Figure 1: (1) The figures are snapshots following a search session for “Vegetarian” near Marina, San Francisco,
CA. The advertised restaurant is shown at the top of the page, marked by “Ad.” The advertised restaurant is
highlighted in yellow on the map, distinguished from other restaurants. (2) The search engine also allows
consumers to easily filter or identify restaurants that offer reservations through Yelp (Yelp Reservations restaurants)
and restaurants that take delivery orders through Yelp (EAT24 restaurants).
Figure 2: Effect of Paid Search Ads on Page Views
Notes: (1) The figures plot the daily average page views (clicks on the restaurant’s Yelp page) of restaurants treated with free paid search ads and those that were
not, between 01/01/2015 and 01/06/2016. (2) The two solid vertical lines indicate the start and end of the experiment period. (3) The two figures use the same
scale. The left figure is based on a sample of restaurants that are partners of Yelp’s delivery service, EAT24, and the right one is based on a sample of non-partner
restaurants in Washington State.
Figure 3: Histogram of Restaurant Ratings Before the Experiment
Figure 4: Average Effect of Paid Search Advertising
Notes: (1) The figure plots the average effects of paid search advertising and the confidence intervals of the estimates.
(2) The estimates and confidence intervals are reported in Table 2. (3) The outcomes are page views (number of clicks
on the restaurant’s Yelp page), map clicks (number of times the restaurant’s map is queried), calls made to the
restaurant through a mobile phone, clicks on the restaurant’s URL link, and number of trusted reviews.
Table 1: Experiment Sample Strata
Table 2: Average Effects of Paid Search Advertising
Estimate [95% confidence interval]

                          Experiment period         Post-experiment period
Page Views
  on Web                  21.5% [19.7%, 23.3%]      0.8% [-1.3%, 2.9%]
  on Mobile               30.2% [28.3%, 32.0%]      1.9% [-0.3%, 4.1%]
  Total                   24.6% [22.9%, 26.4%]      1.2% [-0.8%, 3.2%]
Map Inquiries
  on Web                  18.7% [16.3%, 21.0%]      2.2% [-0.8%, 5.0%]
  on Mobile               16.5% [12.3%, 20.1%]      3.6% [0.9%, 6.1%]
  Total                   17.7% [15.1%, 20.0%]      2.8% [0.2%, 5.2%]
Other Conversions
  Business URL Clicks     8.8% [5.9%, 11.9%]        -0.7% [-4.3%, 2.5%]
  Calls to Businesses     12.6% [8.8%, 16.2%]       0.6% [-3.1%, 4.1%]
Check-Ins & Reviews
  Check-Ins               1.0% [-1.0%, 3.1%]        1.7% [-1.7%, 5.0%]
  # of Trusted Reviews    5.4% [3.2%, 7.5%]         1.5% [-1.5%, 4.2%]
  % of Trusted Reviews    -0.3% [-1.2%, 0.4%]       0.4% [-0.6%, 1.3%]
  Ratings                 -0.3% [-1.3%, 0.7%]       0.2% [-1.2%, 1.5%]
  Review Length           -1.2% [-3.2%, 0.8%]       1.7% [-0.9%, 4.2%]
95% confidence intervals in brackets, obtained from bootstrap.
Notes: (1) Table 2 reports the effects of advertising calculated by the method described in Section 3. (2) The
observations span 1/1/2015 to 12/31/2015, and the experiment was conducted between 8/1/2015 and
10/31/2015.