FR Oestreicher-Singer, G. and L. Zalmanson Content or Community A Digital
All content following this page was uploaded by Lior Zalmanson on 21 July 2020.
The content industry has been undergoing a tremendous transformation in the last two decades. We focus in
this paper on recent changes in the form of social computing. Although the content industry has implemented
social computing to a large extent, it has done so from a techno-centric approach in which social features are
viewed as complementary rather than integral to content. This approach does not capitalize on users’ social
behavior in the website and does not answer the content industry’s need to elicit payment from consumers. We
suggest that both of these objectives can be achieved by acknowledging the fusion between content and
community, making the social experience central to the content website’s digital business strategy.
We use data from Last.fm, a site offering both music consumption and online community features. The basic
use of Last.fm is free, and premium services are provided for a fixed monthly subscription fee. Although the
premium services on Last.fm are aimed primarily at improving the content consumption experience, we find
that willingness to pay for premium services is strongly associated with the level of community participation
of the user.
Drawing from the literature on levels of participation in online communities, we show that consumers’
willingness to pay increases as they climb the so-called “ladder of participation” on the website. Moreover,
we find that willingness to pay is more strongly linked to community participation than to the volume of content
consumption. We control for self-selection bias by using propensity score matching. We extend our results
by estimating a hazard model to study the effect of community activity on the time between joining the website
and the subscription decision. Our results suggest that firms whose digital business models remain viable in
a world of “freemium” will be those that take a strategic rather than techno-centric view of social media, that
integrate social media into the consumption and purchase experience rather than use it merely as a substitute
for offline soft marketing. We provide new evidence of the importance of fusing social computing with content
delivery and, in the process, lay a foundation for a broader strategic path for the digital content industry in an
age of growing user participation.
Keywords: Premium services, social media, online communities, propensity score matching, UGC, digital business strategy, ladder of participation¹

¹ Anandhi Bharadwaj, Omar A. El Sawy, Paul A. Pavlou, and N. Venkatraman served as the senior editors for this special issue and were responsible for accepting this paper.
Introduction

Rapid technology changes over the past two decades have presented the content industry with a vast number of opportunities as well as new challenges. The relative ease of digitizing text, music, and video, coupled with the ubiquity of content consumption technologies such as personal computers and MP3 players, has encouraged content providers to rely increasingly on electronic offerings, thereby reducing their production and operational costs considerably. Correspondingly, distribution costs have been lowered by the process of net-enablement, a term that refers to the incorporation of digital networks into content delivery, management, and marketing (Straub and Watson 2001; Wheeler 2002). Indeed, consumers have largely switched to online consumption of content: Americans bought 1.27 billion digital tracks online in 2011, which accounted for more than half of all music sales (Nielsen 2012); in 2010, 59 percent of Americans consumed online news on a regular basis (Pew Research Center 2010). However, at the same time, these changes have also lowered switching costs for consumers, increased piracy, and created masses of new free content, raising a need for content providers to adopt new strategic thinking in order to sustain competitive advantage.

The recent widespread adoption of social computing marks another dramatic step in the evolution of the online content industry.² Social computing technologies are giving rise to new forms of user interaction and cocreation; their implications for business strategy have yet to be fully elucidated. In particular, researchers and practitioners must cope with issues such as “how business can generate value through social networks, how communities in these initiatives can gain value, and how to assess the costs and benefits of social computing initiatives” (Parameswaran and Whinston 2007, p. 344).

The growing literature on digital business strategy has stressed the transformational role that IT plays in contemporary business processes (Banker et al. 2006; El Sawy 2003; Lu and Ramamurthy 2011; Sambamurthy et al. 2003). In the same spirit, we assert that social computing can be a transformational force for the content industry. Although social offerings were initially perceived as a threat and as a potential source of piracy and disintermediation, many incumbents now regard them as complementary to content and as an integral part of content delivery strategy (Adams 2011). Nevertheless, the benefits of social computing features are still under debate, and the industry often questions their potential. It is our conjecture that firms fail to reap tangible benefits from social computing because they have largely implemented social features using a techno-centric approach rather than a strategic one: They view social computing features as add-ons to traditional content. These implementations are useful, but they miss the broader promise of social computing for content websites, one that can only be realized by taking a strategic approach.

In this paper we suggest a first step toward this strategic approach. We rely on the notion of a “fusion view of IT” (El Sawy 2003) and contend that social computing should no longer act as a merchandising complement to the firm’s value proposition; it is not a technological enhancement to the product, nor is it simply an innovative marketing tool. Rather, it is an inherent part of the firm’s product, the core of its digital business strategy. The adoption of this digital business strategy and its value proposition transforms the main role of the firm from providing content to establishing content-related and IT-enabled social experiences for its users (we refer to these experiences as social content). This paradigm is constructed to reflect users’ changing expectations in a “social” age characterized by the rise of social media platforms and corresponding changes in online behavior.³

We continue and add to the literature on digital business strategy by focusing on the “ladder of participation” paradigm, a user segmentation scheme based on the evolving nature of participatory behavior in online communities. We believe this to be the key difference between social computing and other net-enabled technologies. We discuss the value that users at different participatory levels derive from social content websites. We then define how value is captured: We conjecture that business models that cater to different participatory levels in the ladder are inherently well suited to social content websites, as they offer the firm a method to capitalize on users’ evolving commitment to the firm’s offerings. We focus on the “freemium” model, in which a website offers most of its services for free while restricting only some premium features to fee-paying consumers, and discuss how this model is suitable for capturing the value created in socially active users.

² Social computing is a collective name for IT technologies that facilitate collective action and social interaction online (Parameswaran and Whinston 2007).

³ Yoo (2010) discusses “digital natives,” users who spend most of their lives surrounded by computers, mobile devices, and video games. For them, technological acceptance is taken for granted and questions of digitization seem irrelevant.

Next, we empirically demonstrate the relations between social content consumption, users’ participation patterns, and their willingness to pay, using data from Last.fm. Last.fm is a proprietary content website that serves both as an online radio and as a social networking site and operates under the
freemium business model. Even though the premium services it offers mainly improve the proprietary-content consumption experience (for example, by increasing bandwidth), we find that acts of payment are strongly associated with social computing-based features. Specifically, consumers who participate in the community (i.e., use features that enable them to contribute to the community) show a higher propensity to pay compared with users who do not use social computing features. Users who act as leaders in this community show even higher propensity to pay.

Our results underline the importance of viewing social computing as being integral to the product strategy of content providers. We provide new and unique evidence of the importance of introducing social computing as a means of both value creation and capture and as a source of competitive advantage for content providers, leading to insights into the causal effects of social engagement on consumers’ buying decisions, and laying the foundation for a broader strategic path that content providers can follow in an age of growing user participation.

The Fusion of Social Computing and Content

In the past, most information technologies adopted by organizations were perceived as tools added to boost productivity or lower operational costs. IT was connected to people’s work as an artifact that could be pushed aside if necessary, while work might still continue (El Sawy 2003). Consequently, the broad strategic view was that IT strategy must be aligned with the firm’s business strategy (Henderson and Venkatraman 1993). However, during the last two decades, the digital infrastructure of business and society has shifted radically, and researchers and managers alike have acknowledged that the role of IT has undergone a transformation. IT has become immersed in the workspace and in homes, developing into an unavoidable part of both daily routines and business processes. A newer view has been suggested in which technology is not only immersed in the business environment but is fused with it, such that IT and business strategy are indistinguishable to our perception and form a unified fabric (El Sawy 2003). This change has called attention to the need for firms to develop digital business strategies, an approach that takes into account the embeddedness of technology in the business process and daily lives.

Social computing technologies are a recent example of technology that has become deeply embedded in our daily routines and personal interactions, to the point where it is nearly impossible to disentangle business and social processes from their underlying IT infrastructures. Moreover, social computing creates a shift in which computing power is transferred from organizations to individuals, empowering users with relatively low technological sophistication to use the web to “manifest their creativity, engage in social interaction, contribute their expertise, share content, collectively build new tools and disseminate information” (Parameswaran and Whinston 2007, p. 753).

The content industry in particular has been transformed by social computing technologies. The evolution of the content industry’s view on social computing can be understood in light of El Sawy’s (2003) three phases of IT strategy: connection, immersion, and fusion. Table 1 summarizes the three phases of IT in the context of social computing in the content industry. Until recently, many content providers were in the so-called connection phase of IT strategy, perceiving the web only as an additional channel for traditional newspaper, radio and television content offerings (O’Reilly 2005). As such, these providers viewed social computing as a source of competition and therefore as a threat. Their perception led them to adopt strategies that emphasized quality and the trustworthy attributes of traditional and proprietary content over new alternatives (Posner 2005; Neuberger and Nuernbergk 2010).

With the immense success of social computing, many content providers entered the immersion phase of IT strategy (El Sawy 2003; Table 1), embracing social computing features as a means of stimulating consumption of traditional content (for example, by allowing word of mouth) and prolonging users’ length of stay on their websites. Users were encouraged to actively engage with the content and with one another, by posting comments, conversing on user forums, and sharing user-generated content, either within the websites themselves or through existing popular social computing platforms (Clemons 2009; e.g., many websites feature buttons, such as Facebook’s “Connect” button and Google Plus’s “+1” button, which enable users to share content with friends on the corresponding social computing platforms). Nevertheless, this positive approach toward social computing still puts the emphasis on the content offered rather than on the social experience (i.e., firms still perceive social computing features as complementary rather than as integral to the firm’s offerings). For example, the New York Times website (NYTimes.com) now includes social computing features that allow users to comment on articles, rate them, and share them via social networking sites. However, in essence, the site still functions as a traditional newspaper: the user’s fundamental experience of the site revolves around the consumption of proprietary news content, presented in accordance with the
Table 1. Three Phases of IT in the Context of Social Computing in the Content Industry
Phase | Broader Industry View on IT (Based on El Sawy 2003) | Content Industry View on Social Computing
Connection | IT is used as a tool to help people in their work. It is a separable artifact that can be connected to people’s work actions and behaviors. | Social Computing is a mere tool and its use is optional. Many ignore social computing altogether, and some perceive it a threat and promote against it.
Immersion | IT is immersed as part of the business environment and cannot be separated from work and the systemic properties of interorganizational relationships. | Social Computing is a valuable complementary offer. Social Computing platforms are being widely used to attract users and differentiate websites from their competitors. Practically, social features are add-ons next to traditional content, which is still the main focus of the offer.
Fusion | IT is not only immersed but is fused with the business environment such that they are indistinguishable to our perception and form a unified fabric. | Social Computing cannot be differentiated from the content experience. Content is inherently a social experience. Content providers create social experiences in which the user creates a personal online identity and interacts with others. This social experience takes center stage on the website, replacing content.
vision of the editors or site administrators, and this experience remains more or less consistent regardless of the presence of social features.

A different approach for social computing in the content industry is based on the IT fusion view proposed by El Sawy (2003; Table 1). According to this view, content websites are inherently social, and as such they cannot separate content from social computing elements. Indeed, many content-related experiences are, at their core, social. People derive great pleasure from watching films together with friends, attending concerts in groups, and discussing news articles and texts in organized “knitting circles” or other informal gatherings. Thus, the offer made by the website should emphasize the social aspect of content consumption: meaning the creation and enhancement of relationships.

In recent years we have seen the rise of many sites that have been described as social media platforms, such as Facebook, Digg, and LiveJournal, among others. These platforms enable users to create an on-site identity (in many cases, by designing a personal page), make online friends, curate content for others to enjoy, attend virtual social events, participate in social games, create collaborative user-generated content and build ongoing reputations. Social media platforms have understood that consuming content, and forming relationships around it by discussing, sharing, and reacting to it, are parts of the same experience.

The content industry, in contrast, has not fully grasped the implications of this reality. The next step for the content industry is to build an arena in which social interaction can occur. To do so, content providers must supply users with social experiences based on shared content, not merely add a “social” layer to their traditional offerings. This implies that users should be able to interact with fellow users through the website, not merely interact with content itself. This approach is user-centric, positioning the users’ personal experience rather than the content itself at the core of the online product. It creates a shift in the role of the content industry, rendering it an enabler of experiences rather than a mere provider of content. The result is a hybrid between “content provider” and “virtual community” business models (Weill and Vitale 2001) that can be referred to as a social content website. This use of social computing as an inherent part of the value proposition is unique to the IT fusion phase discussed above. Table 2 provides a detailed comparison of the content website characteristics under each of the three phases of IT discussed (connection, immersion, and fusion).

As content consumption becomes a social experience, value creation becomes dependent on the social environment as well. For example, users browsing the website can be informed in real time who is consuming which content or how popular different content items are. Similarly, enabling ratings and comments allows users to influence the navigation and consumption decisions of other users. Clearly, providing a platform in which users can organize discussions around different topics will inform other users, and allowing users to moderate content can improve content quality. By constructing an array of value-creating features based on social computing, firms can encourage user participation and contribution. However, we propose that in order to produce a value-creating environment that also facilitates successful value capture and profits, firms must first understand users’ behavioral dynamics in a social context.

The Ladder of Participation: The Dynamics of Social Content

Past research has investigated participation patterns in a communal setting, both offline and online. In their seminal work on learning processes in communities of practice, Lave and Wenger (1991) proposed a characterization of community behavior over time. They noted that newcomers “become more competent as they become more involved in the main processes of the particular community. They move from legitimate peripheral participation to ‘full participation’” (p. 37). More recently, there have been various attempts at creating more thorough frameworks that model users’ behavior specifically in online community contexts. Kim (2000), for example, differentiates among several participation roles: (1) the visitor, who exhibits unstructured participation; (2) the novice, who invests time and effort in order to become a (3) regular, who displays full commitment; and (4) the leader, who sustains membership participation and guides interactions of others. Li and Bernoff (2008) develop a ladder-type graph known as social technographics profiling, which uses findings from large-scale surveys to create profiles of online behavior. Preece and Schneiderman (2009) propose a reader-to-leader framework with emphasis on different needs and values at different levels of participation. The different approaches are summarized in Table 3.

It seems that there is a high degree of consensus among academics and practitioners regarding the various stages of the user’s membership life cycle. As can be easily noted, all frameworks start from a reader type, who only consumes content, and they progress to users who invest some time and effort in making small contributions and carrying out minor acts of participation and content organization; they continue with users who invest significant time and effort in community participation, and they culminate (in successful cases) with a member who creates significant content, leads, and moderates discussions in the community. Clearly, users who “move up the ladder” invest more effort in the website and create more value than users who just consume content. It is also clear that each level is associated with different social
computing features. Content organization includes the option to tag content and recommend it. Community participation includes joining affinity groups, posting comments, and contributing content. Community leadership entails moderation of user groups and their respective content.

Why would one expect users to repeatedly participate in a community and climb the levels of participation within it?

In a recent study, Bateman et al. (2011) offered an overarching theory: the commitment-based approach. In their study, Bateman et al. showed that users’ behavior on content sites is directly linked to their commitment levels, as defined by organizational commitment theory (Meyer and Allen 1991). Content consumption was shown to be linked to continuance commitment, commitment based on the calculation of costs and benefits. The few studies that have investigated lurkers (users who strictly consume content) found that these users report mostly information benefits. If a user’s total level of benefit is lower than the cost of finding the right content, he or she is likely to discontinue use of the site (Cummings et al. 2002; Nonnecke and Preece 2000).

Community participation was found to be associated with affective commitment, which is a positive emotional attachment or “feeling of belonging” to the community. In the traditional (offline) organizational commitment context, affective commitment was shown to develop through social exchanges and relationships that promote trust (Cook and Wall 1980) and feelings of being treated fairly by the community (Eisenberger et al. 1990). The practical effects of attention from the community have been demonstrated in recent research. Joyce and Kraut (2006) showed how a user’s likelihood of posting is related to the properties of the replies he receives in response to his initial posting. Lampe and Johnston (2005) found that a newcomer’s probability of returning to a site is affected by the ratings given to her first post. Huberman et al. (2009) showed, in the context of YouTube clips, that users whose videos attract more attention subsequently contribute greater quantities of content. Burke et al. (2009) quantitatively examined photo contributions on Facebook and found that direct feedback on content is one of the factors related to the volume of content that a user subsequently uploads.
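
The correspondence described earlier in this section between participation levels and site features (tagging and recommending for content organization; joining groups, posting comments, and contributing content for community participation; moderation for community leadership) can be illustrated with a minimal sketch. The activity labels and the function name below are hypothetical and serve illustration only; they are not taken from the paper's data or from any website's API, and the paper does not prescribe this procedure.

```python
from enum import IntEnum


class Rung(IntEnum):
    """Rungs of the ladder of participation, ordered from lowest to highest."""
    CONTENT_CONSUMPTION = 0
    CONTENT_ORGANIZATION = 1
    COMMUNITY_PARTICIPATION = 2
    COMMUNITY_LEADERSHIP = 3


# Hypothetical activity labels (assumptions), grouped by the rung they signal.
FEATURES_BY_RUNG = {
    Rung.CONTENT_ORGANIZATION: {"tag_content", "recommend_content"},
    Rung.COMMUNITY_PARTICIPATION: {"join_group", "post_comment", "contribute_content"},
    Rung.COMMUNITY_LEADERSHIP: {"moderate_group"},
}


def highest_rung(activities):
    """Return the highest rung signalled by any of the user's activities.

    A user with no organizing, participating, or leading activity is
    treated as a pure content consumer.
    """
    observed = set(activities)
    rung = Rung.CONTENT_CONSUMPTION
    for candidate, features in FEATURES_BY_RUNG.items():
        if features & observed:
            rung = max(rung, candidate)
    return rung
```

Assigning each user to the highest rung reached, rather than to every rung touched, is one way to reflect the cumulative nature of the ladder; other codings (for example, one indicator per level) would be equally defensible.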
Community leadership, the top level of user participation in online communities, was shown to be associated with normative commitment (Bateman et al. 2011).⁴ The organizational commitment theory defines normative commitment as a sense of obligation to the community (i.e., the user participates in the community because he feels he “ought to”). Normative commitment can be influenced by repeated social exchanges in which a person learns about other community participants’ values such as loyalty (Wiener 1982), or it can develop when a person feels indebted to the community because the benefits he receives exceed his own contribution (Bateman et al. 2011). Leaders of online communities have been shown to contribute the largest number of comments and to be the most active (Cassell et al. 2006; Yoo and Alavi 2004). A study of leadership in Wikipedia’s community showed that leaders use multiple discourse channels, utilizing many features of the site, in order to broadcast their messages (Forte and Bruckman 2008). Granted, not all users will end up being community leaders, and not everyone will be involved in the community; it is actually not necessary for everyone to do so. The value proposition depends on having a critical mass of users carrying out different contributing acts.

Linking Participation to Value Capture and Willingness to Pay

Value capture has proved to be challenging for the traditional online content provider. Digital content companies find it difficult to charge their consumers for access to media services, including proprietary content such as music, movies, and newspaper articles (Dyson 1995; Picard 2000). Consumers’ increased tendency to seek out better prices (Shankar et al. 1999), widespread piracy (Jain 2008; Rob and Waldfogel 2006), and the introduction of digital sharing platforms (P2P) (Asvanund et al. 2004; Bhattacharjee et al. 2007) have introduced new challenges for online content retailers (see also Bhattacharjee et al. 2003; Gopal et al. 2004).

When content providers first adopted social computing features, they resorted to advertising as their base revenue model. However, advertising is essentially “flat”; it does not utilize the insights that come with better understanding of users’ behavioral dynamics in a social context. The different levels of participation call for a business model that better allows for user segmentation.

An emerging business model that allows for such segmentation is the freemium (or two-tiered) model, wherein basic services are provided for free, and premium services are offered for a fee (Doerr et al. 2010; Hung 2010; Riggins 2003; Teece 2010). The underlying assumption of the freemium model is that delivering a product for free can attract a large number of users and encourage participation, and a small fraction of participants will pay for the premium offer.

A careful strategy for user segmentation and a tailored attractive premium offer are the key to the success of the freemium model. One widespread approach is offering a portion of the content for free and the rest for a fee. However, researchers have stressed that this may result in lower perceived value of the free content, causing lower demand levels (Brynjolfsson et al. 2003; Fitzsimons and Lehmann 2004; for opposing results see Zeithaml 1988), as well as slower growth of the consumer base for the free service (Pauwels and Weiss 2008).

Linking this to the previous discussion on levels of participation, we suggest that a successful strategy for firms using the freemium model should incorporate a new segmentation scheme. That is, premium offers should be aimed at users with higher levels of participation. As discussed, those users exhibit higher levels of commitment to the website. Marketing scholars have noted that commitment can yield loyalty, which encourages payment (Beatty and Kahle 1988; Dick and Basu 1994). Loyalty is defined as a composite blend of both brand attitude and behavior and is associated with increased purchases (Pritchard and Howard 1997). It is also associated with the conscientious willingness to pay a premium price, or alternatively the exhibition of price indifference (Fornell 1992; Raju et al. 1990; Zeithaml et al. 1996).

In the context of organizational commitment theory, Fullerton (2003, 2005) found that only consumers who exhibited affective commitment expressed their loyalty in the form of “willingness to pay more,” while consumers who exhibited continuance commitment were not willing to pay for premium service.

Connecting this to the ladder of participation discussed above, we propose that willingness to pay for premium services is not associated with content consumption alone but rather is associated with content organization and community participation as well. Specifically, by testing the following hypotheses, we aim to show that users on higher rungs of the ladder of participation are more willing to pay compared with users on lower rungs of the ladder.

⁴ Not surprisingly, leadership behavior was also shown by Bateman et al. to be associated with a degree of affective commitment as well, stressing the cumulative nature of levels of participation.

First we hypothesize that any level of participation on the website—content consumption, content organization, as well
as community participation—will be associated with propensity to subscribe:

H1: User participation in the website is positively associated with the likelihood of subscribing to premium services.

The ladder of participation suggests that “higher” levels of participation will be associated with higher willingness to pay. For example, leadership roles in the community are associated with the strongest form of commitment, normative commitment, which reflects a sense of obligation to the website. Thus, our second hypothesis compares the effects of different levels of website engagement on willingness to pay.

H2(a): Content organization will have a stronger association with the decision to subscribe to premium services than will content consumption.

H2(b): Community participation will have a stronger association with the decision to subscribe to premium services than will content organization and content consumption.

H2(c): Community leadership will have a stronger association with the decision to subscribe to premium services than will community participation, content organization, and content consumption.

Nearly by definition, commitment is a long process and cannot happen overnight. Users who are climbing the ladder of participation are experimenting with new content and social activities in which they invest increasing time and effort. It is, therefore, reasonable to expect that participating users will become more committed faster. In the context of content websites, this means they are likely to make the decision to subscribe to premium services sooner. Hence, our third hypothesis is

H3: User participation is positively associated with a shorter period of time of free usage.

As with the subscription decision (H2), we can compare the effects of different levels of participation on time to subscription. We formulate this comparison in our fourth hypothesis:

H4(a): Content organization will have a stronger association with a shorter period of time of free usage than will content consumption.

H4(b): Community participation will have a stronger association with a shorter period of time of free usage than will content organization and content consumption.

H4(c): Leadership of groups will have a stronger association with a shorter period of time of free usage than will participation in groups.

Data Collection and Preparation

The data for this research were taken from Last.fm, an online music radio site that also functions as a social community. The website was purchased by CBS for $280 million in 2007 and is one of the leading proprietary music websites. Last.fm offers music streaming services⁵ and differentiates itself from other online radio services with the method it uses to recommend songs to its users (also called AudioScrobbler): After analyzing the user’s listening habits, the Last.fm engine searches for other site members with similar tastes and recommends their favorite songs back to the user.

While the site’s core business is centered around providing music-listening capabilities, Last.fm also enables the user to create a personal profile page (similar to profile pages on other social networking websites), link to friends’ pages, join groups (mostly based on musical taste), contribute to blogs by posting short articles, or take a lead role in groups and moderate content. Users can also add tags to artists, albums, and tracks by using chosen keywords and can create playlists (personalized radio stations) for others to enjoy (see Figure 1 for illustration).

Last.fm implements the freemium business model by offering its users two levels of membership. The first is regular registration (free service), which enables the user to create a personal profile page, listen to online radio, and use other site functions. The second is the paid subscription, in which subscribers pay a monthly fee of $3 for a package of premium services that include the following:⁶

• Improved infrastructure: removal of ads from the subscriber’s page and top-priority quality-of-service on web and radio servers.
• Improved content organization: capacity to listen to unlimited personal playlists on shuffle mode and to create a “Loved Tracks” radio channel.⁷

⁵ Last.fm uploads songs to the website, and a user can listen to them using the site’s downloadable radio software, or by using the music streams on the website directly.

⁶ In April 2009, Last.fm changed its business model in certain countries and currently allows only paying subscribers to stream label-owned music. However, in the United Kingdom, Germany, and the United States, the model has not changed.

scribers and used only data on nonpaying users. A second web crawler collected information about new paying subscribers at the time that they purchased their subscriptions. We were able to identify these users because Last.fm features
• The ability to see who visited one’s homepage on a list of recent subscribers, which is continually updated.8 By
Last.fm; in addition, the user’s subscription status limiting our analysis to new subscribers and omitting mem-
appears on his or her personal page. bers with previously established subscriptions, we control for
increased activity that might result from the membership
Note that none of these “premium offers” changes the music benefits of the premium subscription. Thus far we have
consumption option. That is, there is no limitation on the collected information on close to 5,000 new subscribers.
content available to nonsubscribers. Similarly, the premium
subscription does not change the functionality for community Data collection was done over a period spanning 3 months
participation and leadership. That is, there are no features that starting in January 2009. In order to omit inactive users from
are blocked to nonsubscribers, and nonsubscribers can parti- our analysis, we removed data on users who had not visited
cipate in and lead social groups and contribute to blogs in the site during the 3 months prior to data collection. We also
exactly the same manner as subscribers. omitted users and subscribers who had in the past used a
“reset” option that reset the logs of their personal site usage.
We collected data about a random sample of 150,000 Last.fm Our final data set consisted of 39,397 nonpaying users and
users (subscribers and nonpaying users). The data for each 3,612 new subscribers. Some descriptive statistics for our
user include music listening behavior, number of friends, data are presented in Table 5.
community activity levels, and demographics. Table 4 details
the data available for each user. Last.fm’s various social computing options can be sorted into
a ladder of participation. In the context of music listening on
We collected these data using two specially programmed web Last.fm, content consumption is measured either by the total
crawlers. One web crawler gathered information about a ran-
dom sample of 150,000 Last.fm users (subscribers and non-
paying users). For this data set, we omitted data on sub- 8
Our crawler collected the data from the “welcoming new subscribers” page
once an hour. We have tested and found that the page is updated practically
instantly after the subscription is paid for. Hence, the data collected reflect
7
This is a playlist created by the site based on a user’s tagging of songs as activity levels of the users before subscribing and up to an hour after
“loved.” subscribing.
number of plays or by the average daily number of song plays. One rung above is content organization, which can entail any one of the following activities: attaching tags to songs, tagging favorite songs as loved, and creating playlists (a list of songs to be listened to together). The next level of engagement, community participation, can entail any one of the following activities: joining groups, leading groups, publishing a post in a forum, and adding an entry to one’s blog. Finally, community leadership is measured by the number of groups led by a user.

The descriptive statistics clearly suggest that the usage pattern of subscribers is quite different from that of regular users. Table 6 summarizes the average volume of activity attributed to different rungs of the ladder of participation for paying subscribers and for nonpaying users. For each type of activity, the third column of Table 6 shows the ratio between subscriber activity level and user activity level. We used the t-test and the Mann-Whitney U-test to compare nonpaying users with subscribers, as the two populations are not normally distributed (Mann and Whitney 1947).

We observe that subscribers consume 23 percent more music than do their nonpaying peers. Interestingly, subscribers carry out a significantly larger number of content-organization activities. On average, subscribers create 67 percent more playlists, they choose to mark 218 percent more tracks as loved, and they create 140 percent more tags (P < 0.01).

Most intriguingly, subscribers are substantially more active in the site’s community: Compared with nonpaying users, paying subscribers write 199 percent more posts on the site’s forums, join 70 percent more groups, lead on average 142 percent more groups, and publish 111 percent more blog entries (P < 0.01).

Moreover, paying subscribers have more friends listed on their pages. Table 7 shows that whereas the average nonpaying user has slightly more than 14 friends, the average subscriber has 21 friends, that is, subscribers have on average 45 percent more friends (P < 0.01). Prior literature on social influence provides some additional explanations for purchase

There are also demographic differences between subscribers and nonpaying users. We did not observe a significant difference in activity levels or in propensity to subscribe based on gender. We did, however, find that subscribers are on average 6 years older than nonpaying users (see Table 5). Given the relatively small subscription fee of $3 per month, we think it is likely that this difference is caused by differences in income level or access to payment methods. Interestingly, we also find that subscribers make their subscription decisions after using the site for 652 days on average.10 This suggests that the typical subscription decision is not spontaneous. Rather, it requires deep familiarity with the website and its features. This indicates that converting users from free to fee is a long process that requires patience from website owners.

Moreover, we find that 99.1 percent of all users (paying and nonpaying) have listened to music, 77.6 percent have engaged in content organization behavior, 57.9 percent have participated in the community, and 5.2 percent have led a group, taking a leadership role in the community. Interestingly, only 8.7 percent of the users who have engaged in a community activity have not used the content organization features of the website. This supports the notion of a hierarchy of activities.

Methodology and Results

To better understand the interplay of content consumption, content organization, community activity, and willingness to pay for a subscription, we estimated a logistic (binary) choice equation, predicting the probability of paying for a subscription.11 Formally, we estimated the following block equation:

\[
U_i(\mathit{Subscribe}) = \alpha_0 + \alpha_1 \mathit{ContentConsumption}_i + \sum_{j=1}^{J} \beta_{ij}\, \mathit{ContentOrganization}_i + \alpha_2 \mathit{FriendsCount}_i + \alpha_3 \mathit{SubscriberFriendsCount}_i + \sum_{k=1}^{K} \gamma_{ik}\, \mathit{CommunityParticipation}_i + \cdots
\]

9 As we collect the data at the moment of subscription, we can know that the friends paid before the focal user did.

11 Since premium services are offered for a fixed monthly fee, we use a logistic regression model with a binary dependent variable.
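As a rough illustration of how a block equation of this kind is evaluated, the sketch below computes a linear utility index from a handful of covariates and maps it to a subscription probability with the logistic function. The coefficient values, the covariate names, and the helper names `utility_index` and `subscription_probability` are hypothetical and chosen only for illustration; they are not estimates or code from this study.

```python
import math

def utility_index(intercept, coefficients, covariates):
    """Linear index: intercept plus the sum of coefficient * covariate value."""
    return intercept + sum(coefficients[name] * covariates[name] for name in coefficients)

def subscription_probability(v):
    """Logistic transform exp(v) / (1 + exp(v)), written to avoid overflow."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)

# Hypothetical coefficients (NOT estimates from the paper), for illustration only.
intercept = -3.0
coefficients = {
    "song_plays_thousands": 0.01,  # content consumption
    "playlists_created": 0.15,     # content organization
    "groups_joined": 0.02,         # community participation
    "subscriber_friends": 0.30,    # social influence
}

# A made-up user profile.
user = {
    "song_plays_thousands": 25.0,
    "playlists_created": 2,
    "groups_joined": 5,
    "subscriber_friends": 1,
}

v = utility_index(intercept, coefficients, user)
p = subscription_probability(v)
print(f"utility index = {v:.2f}, subscription probability = {p:.3f}")
```

Each block of covariates enters the index additively, so adding a block to the model amounts to adding entries to the coefficient dictionary.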
Table 7. Correlation Matrix

Variable                   (1)      (2)      (3)      (4)      (5)      (6)      (7)      (8)      (9)      (10)     (11)     (12)     (13)     (14)
 1 Gender                  1.000
 2 Age                     -.186**  1.000
 3 Days                    -.063**  -.022*   1.000
 4 Number of Friends       .062**   -.063**  .172**   1.000
 5 Number of Sub. Friends  .021*    .149**   .097**   .717**   1.000
 6 Song Plays              -.080*** -.059**  .367**   .343**   .245**   1.000
 7 Playlists Created       .003*    .139**   -.034**  .146**   .238**   .079**   1.000
 8 Loved Tracks Tagged     -.008    .115**   .047**   .208**   .284**   .179**   .350**   1.000
 9 Forum Posts Published   -.009    .019*    .063**   .134**   .155**   .161**   .009**   .091**   1.000
10 Groups Joined           -.028**  -.043**  .126**   .373**   .312**   .242**   .065**   .165**   .148**   1.000
11 Groups Led              -.044**  -.014    .127**   .236**   .185**   .189**   .021**   .067**   .122**   .376**   1.000
12 Blog Entries Written    -.002**  .028**   .173**   .293**   .263**   .251**   .063**   .130**   .144**   .267**   .251**   1.000
13 Tags Created            -.035**  .066**   .078**   .172**   .178**   .159**   .110**   .216**   .101**   .221**   .161**   .204**   1.000
14 Subscriber?             -.051**  .363**   -.074**  .121**   .327**   .068**   .144**   .186**   .055**   .112**   .088**   .124**   .122**   1.000
Content consumption is estimated using the total number of song plays (in thousands) to which user i listened. We also repeated the analysis using the average daily number of song plays to which user i listened with similar results.12 The content organization activities include tagging of songs, creating playlists, and marking songs as loved. FriendsCount is the number of friends listed on a user’s personal page, and SubscriberFriendsCount is the number of friends listed on the user’s personal page who became subscribers prior to the focal user’s decision.13 The community participation activities include joining groups, leading groups, posting in a forum, and adding an entry to a personal blog. Demographics include age, gender, and the number of days since the user started using the website. The error terms $\varepsilon_i$ are assumed to follow an extreme value distribution (i.e., we use the logit model). Thus, the conditional probability, $\Pr_i$, that consumer i chooses to pay for a premium subscription is given by the usual expression

\[
\Pr\nolimits_i = \frac{\exp(V_i)}{1 + \exp(V_i)}
\]

12 To clarify, the number of songs is the number of plays. That is, if a user listened to a song twice, it will be counted as two “tracks listened to.”

13 The goal of separation between total number of friends and subscriber friends is to capture possible peer effects in the subscription decision. See Bapna and Umyarov (2012) for a discussion of peer effects in the context of Last.fm.

Estimating this model presented us with two econometric challenges: First, we needed a control for increased use of the site due to the actual subscription decision. It is possible that after subscribing to premium services, consumers tend to use the site more because of the benefits a subscription provides. For that reason, we limited our analysis to nonpaying users and to new subscribers whose data had been collected immediately following the time of subscription, that is, before their usage could be influenced by the subscription itself. We therefore merged two sets of data: one consisting of randomly chosen nonpaying users, and one consisting of users who had just purchased a subscription.

Second, when we looked at the random set of users on whom we collected information, we noticed that subscribers made up only 0.89 percent of the site population. If we used this correct ratio in composing our data set, the occurrence of ones in our dependent variable (Subscribe) would be a rare event. The biases that rare events create in estimating logit models have been discussed in the literature (Ben-Akiva and Lerman 1985). Briefly, this poses a problem when estimating a logit model, because the model would predict that everyone would be a regular, nonsubscribing user while still obtaining a 99 percent level of accuracy. To overcome the problem of misclassification, one should re-estimate the model while deliberately under-sampling the nonpaying users, so that a more balanced sample of ones and zeros in the dependent variable is obtained. This sampling technique is called choice-based sampling (Ben-Akiva and Lerman 1985). To this end, we used our collected set of 3,437 new subscribers and only 9,537 nonpaying users. However, using choice-based sampling leads to inconsistent intercept estimation when traditional maximum likelihood methods are used. Two alternative solutions have been suggested in the literature: Manski and Lerman (1977) developed a weighted endogenous sampling maximum likelihood (WESML) estimator, which accounts for the different weights in the zeros and ones from the population of interest. However, this estimator has the undesirable property of increasing the standard errors of the estimates (Greene 2000; Manski and Lerman 1977). A second approach, which we follow, is to adjust the estimated intercepts for each alternative by subtracting the constant $\ln(S_i/P_i)$ from the exogenous maximum likelihood estimates of the intercept, where $S_i$ is the percentage of observations for alternative i in the sample, and $P_i$ is the percentage of observations for alternative i in the population (Manski and Lerman 1977; for a similar implementation, see Villanueva et al. 2008).

The correlation matrix is presented in Table 7, and the estimation results using the choice-based sample are reported in Table 8, each column representing an additional block being added to the estimation.

Estimation Results

The number of different community activities, the number of content organization activities, and the level of content consumption are strongly and significantly associated with the likelihood of subscription, supporting H1.

Community Participation: Joining a group, leading a group, and posting a blog entry are each associated with a significant increase in the odds of subscribing to premium services (Odds Ratio = 1.007 for each group membership, Odds Ratio = 1.226 for each group leadership, and Odds Ratio = 1.051 for each blog entry). Note that posting a comment in a forum does not have a significant association with the subscription decision.

Community Leadership: Group leadership has a much stronger association with the subscription decision than group membership has. Specifically, our results suggest that being a leader of one more group has a stronger effect on the odds ratio than being a member of 10 additional groups (P = 0.02). Hence, H2(c) is clearly supported.

Content Organization: We also find that content-organization activities, including marking tracks as loved and creating playlists, are positively correlated with subscription behavior (Odds Ratio = 1.001 for each track marked as loved, and Odds Ratio = 1.184 for each playlist created). Creating tags for songs was not found to be statistically significant in the full model. While tagging songs as loved has a weaker association with the subscription decision compared with participation in community activities, creating a playlist has a very strong effect on the odds ratio. Hence, H2(b) is only partially supported.

Content Consumption: As expected, content consumption has a positive association with the subscription decision, supporting H1. Interestingly, content consumption is associated with a relatively low effect on the subscription decision and is not significant in all models. Looking at our full model, it seems that the effect of posting an additional entry to a blog is equal to that of playing over 10,000 more songs (P < 0.01). Similarly, being a member in one more group has an effect on the odds ratio equal to listening to 100,000 more songs (P < 0.01). These findings support H2(a) and suggest that willingness to pay is more strongly linked to community activity and to content organization activities than to content consumption. These results are especially interesting given that the core business of the website is providing content, and that most of the features provided to the paying subscribers are closely related to the content-consumption experience.

Social Influence: As expected, we also find that the number of subscriber friends (i.e., friends who have already purchased a paid subscription) listed on a user’s page is associated with a strong positive effect on the user’s propensity to pay for premium services (Bapna and Umyarov 2012). When we control for the number of subscriber friends, we find that the number of friends without a subscription has a small negative association with the subscription behavior. This could indicate that nonsubscribing friends create negative word of mouth regarding the subscription decision, either verbally or through observational learning.

Demographics: The age of the user is positively associated with the likelihood of subscription, but gender has no significant effect. More interestingly, the number of days since the user started using the website is found to be negatively associated with the subscription decision.
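To make the comparison of odds ratios concrete: in a logit model, the odds ratio for a change of n units in a covariate is the per-unit odds ratio raised to the power n. The short sketch below applies this to the reported values (1.007 per group membership, 1.226 per group led). The helper name `odds_ratio_for_units` is ours and purely illustrative.

```python
import math

def odds_ratio_for_units(per_unit_odds_ratio, units):
    """Odds ratio implied by a change of `units` in a logit covariate."""
    return per_unit_odds_ratio ** units

# Odds ratios reported in the text.
GROUP_MEMBERSHIP_OR = 1.007
GROUP_LEADERSHIP_OR = 1.226

ten_memberships = odds_ratio_for_units(GROUP_MEMBERSHIP_OR, 10)
one_leadership = odds_ratio_for_units(GROUP_LEADERSHIP_OR, 1)

# Number of additional memberships whose combined odds ratio equals one leadership.
break_even = math.log(GROUP_LEADERSHIP_OR) / math.log(GROUP_MEMBERSHIP_OR)

print(f"10 more memberships: {ten_memberships:.3f}")  # about 1.072
print(f"1 more group led:    {one_leadership:.3f}")
print(f"break-even number of memberships: {break_even:.1f}")
```

Ten additional memberships multiply the odds by roughly 1.072, well below the 1.226 associated with leading one more group, which is consistent with the comparison stated under Community Leadership.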
Observations: 13,004. **Significant at the 0.05 level. ***Significant at the 0.01 level.
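The intercept adjustment for choice-based sampling described in the methodology can be sketched as follows. The function subtracts ln(S/P) from an estimated intercept, where S is the share of an alternative in the estimation sample and P is its share in the population. The sample counts (3,437 new subscribers and 9,537 nonpaying users) and the 0.89 percent population share are taken from the text; the raw intercept of -1.0 is a made-up placeholder rather than an estimate from the paper, and the function name is hypothetical.

```python
import math

def corrected_intercept(estimated_intercept, sample_share, population_share):
    """Subtract ln(S/P) from an intercept estimated on a choice-based sample."""
    return estimated_intercept - math.log(sample_share / population_share)

# Counts and population share as stated in the text.
subscribers, nonpaying = 3437, 9537
sample_share = subscribers / (subscribers + nonpaying)  # S: subscribers' share of the sample
population_share = 0.0089                               # P: subscribers' share of the site

# Hypothetical raw intercept, for illustration only.
placeholder_intercept = -1.0
adjusted = corrected_intercept(placeholder_intercept, sample_share, population_share)

print(f"sample share S = {sample_share:.3f}")
print(f"adjusted intercept = {adjusted:.3f}")
```

Because subscribers are heavily over-represented in the sample relative to the population, the correction lowers the intercept for the subscription alternative, while the slope coefficients are left unchanged.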
The Effect of Community Participation on Time until Subscription

We find that subscribers make their subscription decisions after using the site for 652 days on average. This suggests that the typical subscription decision is made by a user who is deeply familiar with the website and its features. Figure 2 presents the consumption and participation patterns of different users as a function of time. Notably, in the first year of using the website, a user’s music consumption decreases until it reaches a relatively stable level. However, there seems to be a consistent and stable increase over time in the likelihood of participating in the different content organization and community activities.

Figure 2. Content Consumption Levels and Usage of Social Computing Features over Time

In what follows, we investigate the effect of content consumption, content organization, and community activity on the likelihood of consumers to purchase a paid subscription. We therefore estimate a hazard (survival) model, using the following equation:

\[
H_i(t) = \exp\Bigl( \alpha_0 + \alpha_1 \mathit{ContentConsumption}_i + \sum_{j=1}^{J} \beta_{ij}\, \mathit{ContentOrganization}_i + \alpha_2 \mathit{FriendsCount}_i + \alpha_3 \mathit{SubscriberFriendsCount}_i + \sum_{k=1}^{K} \gamma_{ik}\, \mathit{CommunityParticipation}_i + \alpha_4 \mathit{CommunityLeadership}_i + \sum_{l=1}^{L} \delta_{il}\, \mathit{Demographics}_i \Bigr)
\]

This model allows us to study how the different covariates are associated with the “hazard” (in this case, a positive hazard in the form of a subscription decision). We use the Cox regression to estimate these effects. The results of this estimation are presented in Table 9.

The results show that community activity and content organization activity variables are each positively associated with the hazard rate. That is, users who are more active in the community or who actively organize content will make the subscription decision sooner than users who are less active or not active at all (supporting H3). Moreover, the strong and significant positive association between group leadership and the subscription decision again stands out, supporting H4(c).

These results provide yet another dimension to our previously reported results: not only is community activity associated with a greater willingness to pay for a premium subscription, it is also associated with a shorter time window between joining the website and subscribing.

Given the long period of time in question and the potential for exogenous changes in consumer taste, for robustness we repeated our analysis with a few subsamples of users who had joined the site more recently before subscription (subscribers who had been on the website less than 800, 600, and 400 days prior to subscription). The results are very similar, both in sign and magnitude. Note that as freemium models become more prevalent in the content industry, consumers may become more receptive to paying, and the time window to subscription may become shorter.

Propensity Score Matching

Although the preceding econometric analysis provides support for a positive and statistically significant association between
online community activity and propensity to purchase a premium-service subscription, the nature of observational data raises concerns about the causal interpretation of our findings. As mentioned above, through our sampling technique, we control for possible post-subscription increases in site usage. However, we do not control for the bias caused by self-selection. That is, since we did not randomly assign users to treatment groups (increased community activity), we are unable to control for observed and unobserved variables that drive users to self-select themselves into a particular treatment group. It is easy to think of variables that might influence users’ community activity levels and simultaneously increase their propensity to pay for premium services, hence creating a self-selection bias.

A solution to the self-selection bias is to use a proportional outcome approach. Selection bias due to correlation between the observed characteristics of a user and the user’s level of social activity (his treatment level) can be addressed by using a matching technique based on propensity scores (Rosenbaum and Rubin 1983; for a recent use of propensity scores in the marketing context, see Aral et al. 2009; Mithas and Krishnan 2009). The fundamental problem in identifying treatment effects is one of incomplete information. Although we observe whether the treatment occurs and whether the outcome is conditional on the treatment assignment, the counterfactual is not observed. In a nutshell, propensity matching techniques enable us to investigate heterogeneous treatment effects in nonexperimental data, based on observed variables.14 The objective of propensity score matching is to assess the effect of a treatment by comparing observable outcomes (in our case, subscription behavior) among treated observations (in our context, users who participate in the website’s community) to a sample of untreated observations (in our context, users who did not participate in the website’s community) matched according to the propensity of being treated (that is, the propensity to participate).

14 In contrast, selection bias stemming from correlation between unobserved variables and the user’s social activity level is a more difficult problem. Previous literature has often used the strong ignorability assumption (Rosenbaum and Rubin 1983).

Mathematically, let $y_{i,1}$ denote the outcome of observation i, if the treatment occurs (given by $T_i = 1$), and $y_{i,0}$ denote the outcome if the treatment does not occur ($T_i = 0$). If both states of the world were observed, the average treatment effect, $\tau$, would equal $y_1 - y_0$, where $y_1$ and $y_0$ represent the mean outcomes for the treatment group and control group, respectively. However, given that only $y_1$ or $y_0$ is observed for each observation, unless assignment into the treatment group is random, generally, $\tau \neq y_1 - y_0$.

Propensity score matching attempts to overcome this problem by finding a vector of covariates, Z, such that $(y_1, y_0) \perp T \mid Z$, $\mathrm{pr}(T = 1 \mid Z) \in (0, 1)$, where $\perp$ denotes independence. That is,
the treatment assignment is independent of the outcome conditional on a set of attributes Z. Moreover, if one is interested in estimating the average treatment effect, only the weaker condition, $E[y_0 \mid T = 1, Z] = E[y_0 \mid T = 0, Z] = E[y_0 \mid Z]$, $\mathrm{pr}(T = 1 \mid Z) \in (0, 1)$, is required.

To implement the matching technique, we define the treatment group as the set of people who participated in community activity. Since most propensity score matching techniques use a binary treatment, we grouped user participation in community activities into four distinct binary treatments and repeated the following exercise for each treatment separately:

• GroupLead, which is equal to one if the user has ever led a group
• BlogEntry, which is equal to one if the user has ever posted an entry to a blog
• GroupMember, which is equal to one if the user has ever joined a group
• ForumPost, which is equal to one if the user has ever posted an entry to a forum page

Additionally, we group all of the user’s community activities into one binary variable, CommunityActivity, which is equal to one if the user has ever posted an entry to a blog, joined a group, or posted an entry to a forum page.

In our context, we are able to identify a number of observed variables that might influence a consumer’s propensity to engage in social activity and should, therefore, be included in the covariates in Z. We estimate the propensity to participate or contribute to the community based on demographic information (including gender and age), music consumption patterns (including the number of song plays, and the number of days on the Last.fm site), and the number of friends listed on the user’s page.15

15 For robustness, we repeated the estimations using the other activities as covariates as well. That is, when estimating a person’s propensity to perform a certain activity, we included the other activities of the person in the propensity estimations. For example, when estimating the propensity to write a blog entry, we included group membership and posts to forums into the score estimations.

Consequently, we should match observations that have identical values for all variables included in Z. For example, in the case of the GroupLead treatment, we should match a 22-year-old male consumer who listened to 1,000 songs, has been using Last.fm for a year, and is a group leader, with another 22-year-old male who listened to 1,000 songs and has been using Last.fm for a year, but who is not a group leader. However, if we do that, we might find very few exact matches. Since exact matching is often untenable, Rosenbaum and Rubin (1983) prove that conditioning on p(Z) is equivalent to conditioning on Z, where $p(Z) = \mathrm{pr}(T = 1 \mid Z)$ is the propensity score. That is, for each consumer we estimate p(Z), the propensity of being treated (in the previous example, the propensity of leading a group), using a probit model. We thereafter match consumers not according to their exact attributes but according to their propensity scores. One of the advantages of propensity score methods is that they easily accommodate a large number of control variables.

Upon estimation of the propensity score, a matching algorithm is defined in order to match the treated and untreated cases. We used the kernel matching technique (Heckman 1997).16 We were then able to compare the percentage of subscribers between the treated and the matched untreated groups. For the CommunityActivity variable, we repeated the estimations using the Mahalanobis matching technique, a method specifically designed for multiple treatments (Rubin 1980). Using this method, one estimates a different propensity score for each treatment included in the CommunityActivity variable (i.e., posting to a forum, group membership, and blog entry), and users are then matched on the basis of these multiple scores.

16 We chose the kernel matching technique because of its treatment of the “distance” between the matched and unmatched cases through weights. Kernel matching gives more weight to close neighbors while still assigning some weight to the more distant neighbors. The potential benefit is that these estimators are less sensitive to a mismatch along unmeasured dimensions, but the cost is that they introduce an added mismatch along measured dimensions. For robustness, we repeated the analysis using the nearest neighbor matching algorithm, with very similar results.

The results of our comparisons for each of the treatments are presented in Table 10. Column A in Table 10 corresponds to the case in which the treatment is defined as GroupMember. In this case each consumer who has a group membership is matched with a consumer who does not have a group membership, according to the above-mentioned covariates (including demographics, music listening, and friends). Out of the 29,941 consumers with group memberships, 8.5 percent were found to have a subscription. However, out of the 29,941 consumers who were matched to those consumers (but were not group members) only 6.9 percent had a subscription. Since this difference is statistically significant (P < 0.001), we are able to conclude that, controlling for the observed differences between the groups, consumers who are group members are more likely to pay for a premium subscription. Similar analysis for the other four treatments (group leadership, forum posting, blog entries, and any community activity) is presented in columns B to F of Table 10. Note that CommunityActivity was estimated twice, once using the kernel matching approach (column E) and once using the Mahalanobis matching approach (column F). All of these estimates provide similar conclusions: After controlling for self-selection bias based on demographics, music consumption, and number of friends, we observe a significant difference between the treated and untreated conditions in the mean percentage of users who subscribe to premium services.

These differences emphasize the effect of community participation on the propensity to subscribe to the website, and they strengthen the findings of the binary logistic model. (A comparison of covariate means both before and after the matching is presented in Appendix A.)

Rosenbaum Bounds Sensitivity Analysis

Propensity-score matching operates on a strong assumption that observable characteristics fully account for the selection of users into the treatment and control conditions. However, there could still be hidden bias due to unobservable characteristics. We next conduct a sensitivity analysis by estimating Rosenbaum bounds (Rosenbaum 2002), which measure how strongly an unobservable must influence the selection process in order to completely nullify the causal effects identified in the propensity-matching analysis (for recent applications of this method, see Sen et al. 2012; Sun and Zhu 2010). If $p_i$ denotes the subscription probability of a user who conducted community activity (i.e., a user in the treatment group), and $p_j$ denotes the subscription probability of a user with no community activity (i.e., a user in the control group), Rosenbaum shows the following bounds on the odds ratio for the two matched users

\[
\frac{1}{\Gamma} \le \frac{p_i / (1 - p_i)}{p_j / (1 - p_j)} \le \Gamma
\]

where $\Gamma \ge 1$. Γ measures the level of selection effects from unobservable factors. When Γ = 1, users with the same propensity scores have the same probability of subscribing, and there are no unobserved selection effects. When Γ > 1, an unobservable causes the odds ratio of treatment assignment to differ between treatment and control groups. This method is based on the intuition that Γ should be close to 1 if the unobservable does not play a significant role in selection. Test statistics are developed to show how far away Γ has to be from 1 in order for the unobservable to nullify the treatment effect. This, of course, depends on the context of the research: if the Rosenbaum bound for unobserved selection (Γ) appears too large to be true in reality in the specific context, a researcher may conclude that the qualitative results from propensity matching hold.

17 We only include this for the five binary treatments.

We present the results of the Rosenbaum bounds analysis in the bottom row of Table 10.17 We report the critical values of Γ at which the community activity effect becomes insignificant. Those values range between 1.4 and 1.8 and are similar to the findings of Sun and Zhu (2012) and DiPrete and Gangl (2004). In other words, an unobservable variable would have
to change the odds of selection into the treatment group by at least 50 percent to nullify the effect of community activity on the subscription decision.

Discussion

Our empirical analysis supports our conjecture that users’ levels of participation are linked to their willingness to pay for premium service. We find that users who are more active in the community are substantially more likely to pay for premium services, and this effect is observed even after accounting for content consumption, demographics, and social influence. We also find that, in the context of music content, community activity is more strongly associated with the likelihood of subscription than is the music consumption itself, and community leadership is more strongly associated with the likelihood of subscription than is mere community participation.

Among all the social attributes we examined, the number of subscriber friends, the number of playlists, the number of groups led, and the number of blog entries are the factors most strongly associated with the purchase decision. The first two observations are not surprising in our context. Past research has already shown how social interactions in online environments can influence purchasing decisions (Godes and Mayzlin 2004; Huang and Chen 2006). The effect of playlist creation, in turn, might be a fairly obvious outcome of the extended playlists option that a premium subscription provides in the website we study. However, none of the premium services directly improves the user’s ability to lead groups or to post to blogs. In fact, most of the benefits associated with a subscription (including higher bandwidth, access to new music features, and removal of ads from the user’s page) are not directly related to the community aspects of the website.

Our findings support the notion of a hierarchy, portrayed in the literature on levels of participation in online communities. According to this hierarchy, group leadership and blog postings are at the top end of user participation behavior, whereas acts of content organization and consumption reflect lower levels of participation. The group leader is in charge of moderating the group’s discussions and adding new members to its community. The active blogger creates his own space and frequently shares his written thoughts with the entire Last.fm community. An explanation for the correlation between these activities and the purchase decision can stem from the connection between these activities and levels of commitment. While consuming content reflects a continuance commitment based on cost–benefit analysis, engagement created by social computing might increase affective and normative commitments. In line with previous research that links commitment to willingness to pay, we find that among such users, the presence of the affective community may be associated with monetary payments to the website. Analysts have noted that people report that they are not willing to pay for online content (Nielsen 2010); our observations suggest that involvement in a community on a content website might serve as a key to overcoming that obstacle.

We extend our results in two directions. First, we use a hazard model to study the effect of community activity on the time between joining the website and the subscription decision. We find that users who are more active in the community will make the subscription decision sooner after joining compared with users who are less active (or not active at all). Moreover, we again see the strong association between group leadership and the subscription decision. These results suggest that a consumer’s community activity is associated not only with increased willingness to pay for a premium subscription but also with a shorter time window between joining the website and subscribing. This indicates that community participation can act as a catalyst for purchasing decisions in online content websites.

Second, we extend our results by using propensity score matching, a method of estimating treatment effects from nonexperimental data. Previous research on willingness to pay has used surveys or interviews in order to assess purchasing intent (Riggins 2003; Srinivasan et al. 2002; Ye et al. 2004). By using a data set of users who are currently active on the content website, we were able to study actual purchasing decisions without the biases commonly associated with surveys. The featured list of recent subscribers, updated in real time, allowed us to avoid the influences of post-subscription behavior and to properly compare a subscriber’s profile to that of a nonpaying consumer.

Although we did not control for unobserved heterogeneity in treatment assignment, propensity score matching allowed us to control for self-selection bias based on consumption patterns, demographics, and social influence levels. The additional Rosenbaum bounds sensitivity tests showed that an unobservable variable would have to change the odds of selection into the treatment group by at least 50 percent to nullify the effect of community activity on the subscription decision. We show that the contribution of content to the community increases contributors’ willingness to pay for premium services. This provides the first evidence as to the causal effect of community activity on consumers’ willingness to pay.
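To make the Rosenbaum bound mentioned above more concrete, the following is a minimal sketch of the inequality 1/Γ ≤ [p_i/(1 − p_i)] / [p_j/(1 − p_j)] ≤ Γ. The function names and the example probabilities are hypothetical and chosen only for illustration; Γ = 1.5 corresponds to the "at least 50 percent" change in odds discussed in the text. This is not the authors' estimation procedure, only an illustration of what a given Γ permits.

```python
def odds(p: float) -> float:
    """Convert a probability strictly between 0 and 1 into odds p / (1 - p)."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    return p / (1.0 - p)


def within_rosenbaum_bound(p_i: float, p_j: float, gamma: float) -> bool:
    """Return True if the odds ratio of p_i to p_j lies in [1/gamma, gamma]."""
    if gamma < 1.0:
        raise ValueError("gamma must be at least 1")
    ratio = odds(p_i) / odds(p_j)
    return 1.0 / gamma <= ratio <= gamma


def probability_range(p_j: float, gamma: float) -> tuple[float, float]:
    """Range of p_i values compatible with the bound, given p_j and gamma."""
    if gamma < 1.0:
        raise ValueError("gamma must be at least 1")
    low_odds = odds(p_j) / gamma
    high_odds = odds(p_j) * gamma
    # Convert the bounding odds back into probabilities.
    return low_odds / (1.0 + low_odds), high_odds / (1.0 + high_odds)


# Hypothetical illustration: a matched user with probability 0.5 and gamma = 1.5.
low, high = probability_range(0.5, 1.5)
```

Under these assumed inputs, an unobservable of strength Γ = 1.5 could move a matched user's probability from 0.5 to anywhere between roughly 0.4 and 0.6. With Γ = 1 the range collapses to a single point, which is the case of no hidden selection.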
Implications for Digital Business Strategy

This study proposes a new perspective on digital business strategy for the content industry in an age of social computing. Prior transitional processes of content digitization and net-enablement caused the content industry to move from offline to online platforms, where content providers now conduct most of their business. The social era we live in is bringing about new changes in business practices and models and is raising new questions that were not part of the discussion on net-enablement. For example, past research on net-enablement looked at methods of attracting customers by deploying net-based technologies and encouraging interaction between the firm and users (Straub and Watson 2001; Wheeler 2002). However, these studies did not take into account the role that technologies fulfill in facilitating and enhancing users’ onsite relationships with other users. As social computing becomes increasingly widespread, these formed relationships are likely to become fundamental components of digital business strategies.

Moreover, prior discussions on business models for the content industry by academics and practitioners have focused mainly on the choice of revenue sources, frequently mentioning examples such as advertising, fixed subscriptions, and paying for content items. This stresses a techno-centric approach that views social features as add-ons, enhancements to the core offerings. We suggest that social technologies should be fused to the business processes of content providers, whose role will be to provide interactive content consumption experiences, or social content. This fusion blurs the previously acknowledged dichotomy between the business models and strategies of content providers and those of virtual communities (for example, as presented in Weill and Vitale 2001).

[...] characteristics of the user base, of the attributes and type of content offered by the specific provider, and the provider’s ethics and values. For example, a music website that focuses on contemporary music might choose to implement a social experience that encourages the sharing and discovery of new music while helping users express their unique identities. This might mean relatively low restriction on the language and style of interaction. A traditional news website, on the other hand, might want to adopt a social experience that revolves around discussions of content and helps users to act as “news aggregators.” This website might restrict user content generation to maintain the reliability of the news source and a more upscale style of interaction. Clearly, the choice of features will affect the nature of the resulting social environment on the website, which in turn will affect the segment of consumers drawn to the website, their valuation of the website, and their retention. This is not to say that one will be more “social” than the other, but that the kind of social environment induced by one’s choice of features should align with the overall strategy of that website.

Hence, the digital business strategy of content providers should align incentives for users to move up the rungs of the participation ladder and, in parallel, use the same participation ladder as a segmentation mechanism in order to capitalize on the different levels of participation. Existing strategies that focus on limiting the amount of content consumed before paying (such as NYTimes.com) or segmenting the content itself into free and premium (such as WSJ.com, the Wall Street Journal online) do not capitalize on the fusion of content with the social environment. In such cases, even if the content provider succeeds in creating a vibrant community, it still lacks a strategic approach for turning users’ emergent patterns of participation into profits.
Taken together, our results highlight the importance of creating a community environment that facilitates different levels of participation to create an ongoing and varied experience. Two of the activities that were most strongly linked to the subscription decision, blog creation and moderation of content (by group leadership), are of a high-participation nature and are likely to occur in advanced stages of community membership. By offering a variety of social features, a website can create the full ladder of participation and encourage users to advance toward this high level of involvement, potentially increasing the chances that they will subscribe. A website owner should make features available and easy to use, while making sure users are aware of their existence. Our research suggests that content providers should not ask themselves “How will I make my users pay?” but rather “How will I make my users participate more?” The solution to this question may increase free-to-fee conversion rates.

While Last.fm is a good example of the inclusion of social features as an inherent part of the website experience, it seems that the website has not yet fully capitalized on it. First, to the best of our knowledge, Last.fm undertakes little initiative to encourage users to climb the ladder of participation. Researchers as well as practitioners have noted that many users of content websites ignore community features and stay at the first level of participation (i.e., lurking), whereas only a few make their way to the highest level of participation (Li and Bernoff 2008). Hence, merely offering a community might not be enough; websites may need to actively help users move on to the next level of participation. Previous research has indicated that consumers move up the ladder starting at activities that require low levels of participation, such as content organization activities. Therefore, it could be wise not to immediately invite consumers to participate in activities requiring high levels of participation (such as group leadership), but rather to offer incremental changes in the levels of participation. This can be done in different ways. One approach is to suggest a subsequent activity of a higher level upon completion of an activity. For example, a user who consumes content might be asked to tag it, a user who tags content might subsequently be asked to also review it in a forum, a user who is active in discussions might be asked to lead the forum, and so on. This might help increase the percentage of users who reach high levels of participation.

Second, while our results show a clear relation between social behavior and willingness to pay, Last.fm and other freemium websites currently choose not to base the premium offer on social “perks.” Changing the premium subscriber benefits to reflect subscribers’ social nature might improve conversion rates.

[...]bility that subscriptions might detract from ad revenue. As noted above, one of the benefits of a premium subscription on Last.fm is the removal of ads from one’s personal page. Last.fm, like most firms, does not disclose exactly how many paying subscribers it has or how much revenue it receives from advertisements. However, in our data set, which included 150,000 randomly chosen users of Last.fm, there were 1,335 paying subscribers. This implies a conversion rate of about 0.9 percent. This number is in line with numbers reported by other websites, whose conversion rates range between 0.5 and 15 percent but are often on the low side (Anderson 2009). Given that Last.fm has about 30 million registered users and the monthly subscription fee is $3, we estimate that the revenue from premium subscriptions is about $9.6 million a year. Since these are all digital services, with low marginal costs, the profit margins on this amount are estimated to be very high. Hence, even with a low conversion rate and a relatively low monthly subscription fee, subscriptions are a substantial source of income for the website. Moreover, given the vast number of registered users, even a small change in users’ propensity to subscribe will result in a substantial increase in profit. For example, a 10 percent increase in the conversion rate, from 0.9 to 1.0 percent, will result in an additional $1,188,000 per year.

While there are no official reports on the profitability of advertising business models, the conventional estimate is that advertising conversion rates on search engines such as Google are about 2 percent (as reported on the Google Help page for AdWords, http://www.google.com/support/forum/p/AdWords/thread?tid=7aeb3290fd8feccb&hl=en), whereas the conversion rates reported by social networks (such as Facebook) are about 0.051 to 0.063 percent. The reported average payoff of a click-through on a Google ad is 5 cents. Of course, this conversion rate is with regard to page views. A simple calculation, therefore, shows that for an average click-through rate of about 0.05 percent, a $3 monthly fee is equivalent to about 120,000 page views a month. While this is a very rough estimate, it is clear that a paying member generates much more profit than a nonpaying member who is exposed to ads. Therefore, given the challenges of the advertising business model, a careful discussion of new strategic means by which firms can increase, even by a small fraction of a percentage, users’ willingness to pay for premium subscriptions is of great importance to this industry.
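The back-of-the-envelope figures above can be reproduced with a short sketch. All inputs are the numbers quoted in the text; the function and variable names are hypothetical. One assumption worth flagging: the $1,188,000 figure appears to follow from using the unrounded sample rate (1,335 / 150,000, about 0.89 percent) as the baseline, whereas starting from a rounded 0.9 percent would give $1,080,000.

```python
SAMPLE_USERS = 150_000          # randomly chosen Last.fm users in the data set
SAMPLE_SUBSCRIBERS = 1_335      # paying subscribers among them
REGISTERED_USERS = 30_000_000   # approximate registered user base
MONTHLY_FEE = 3.0               # subscription fee in dollars


def annual_revenue(conversion_rate: float,
                   users: int = REGISTERED_USERS,
                   monthly_fee: float = MONTHLY_FEE) -> float:
    """Yearly subscription revenue implied by a given conversion rate."""
    return users * conversion_rate * monthly_fee * 12


def page_views_equivalent(monthly_fee: float,
                          payoff_per_click: float,
                          click_through_rate: float) -> float:
    """Monthly page views whose ad clicks would earn as much as one fee."""
    clicks_needed = monthly_fee / payoff_per_click
    return clicks_needed / click_through_rate


conversion_rate = SAMPLE_SUBSCRIBERS / SAMPLE_USERS        # about 0.89 percent
baseline_revenue = annual_revenue(conversion_rate)         # about $9.6 million
uplift = annual_revenue(0.010) - baseline_revenue          # unrounded baseline
rounded_uplift = annual_revenue(0.010) - annual_revenue(0.009)

# 5 cents per click and a 0.05 percent click-through rate, as in the text.
page_views = page_views_equivalent(MONTHLY_FEE, 0.05, 0.0005)
```

The comparison is deliberately rough, as the text itself notes: it ignores any ad revenue still earned from subscribers' other page views and treats the click payoff as constant.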
It is important to note that the strategy of promoting community participation is likely to work best in content sites that achieve high readership, such as successful mainstream news or music websites that cater to a variety of users. This is true for two reasons. The first is that such sites have substantial numbers of users who start at the first stage of participation. Even if just a small percentage of these users progress to become highly engaged and eventually contribute payments to the site, they might still constitute a large population that can benefit the site’s overall income. Second, websites that implement social computing features are also prone to network externalities, and thus a consumer’s value is greatly affected by fellow consumers’ behavior. A site with high readership in which some users progress to content organization and contribution can affect other people’s experience of the website, their satisfaction, and ultimately their retention. For similar reasons, websites that begin with a small number of content readers might have problems implementing such a model, as only a few users will eventually pay, and the cost of community building may be unsustainable. Such websites might prefer to use the services of existing social media companies, for example, by building a fan page on Facebook or on Twitter.

Limitations and Future Work

This research was carried out on the Last.fm website, which allowed exploration of different social computing features. Last.fm is a leading music-providing website and also has a relatively active community, in which a variety of social features are offered to the users, making it a fruitful source of data for research of this type. Nevertheless, future research should investigate websites that provide different types of content such as news or video. Furthermore, Last.fm is an intermediary and not a content creator. Content creators, such as The New York Times, deliver original content. As there are no perfect substitutes for original, unique content, some may argue that consumers’ willingness to pay for such content will be higher, and therefore that content creators may not need to add community features to their websites. However, Last.fm has a unique (patented) music recommendation system that creates a unique experience for the user. Furthermore, original content creators face similarly low willingness to pay, which in turn creates financial difficulties (Nielsen 2010). Investment in social computing features may, therefore, be beneficial for those websites as well.

In addition, we focused on a proprietary content website. While it is possible that our findings can be extended to websites that offer user-generated content as well, we have no data on such websites. We also focused on an on-site community. However, we do not have data on sites that implement their ladder of participation using external communities such as Facebook. It is possible that such websites could still capture value from users’ commitment. Both extensions would serve as interesting directions for future work.

Our study used real-world data, in which the subscription package offered to Last.fm users included one set of premium services. It is impossible to know which premium service, if any, appealed most to the new subscribers. Future research should consider a controlled experimental setting, where different bundling packages can be explored. Such research should aim to unbundle the service packages and link the willingness to pay for different services to different community activities.

As in other, similar empirical studies, it is impossible to account for the unobserved consumer characteristics that might influence the subscription decision. In this case, our rich data set has allowed us to control for different behaviors and attributes observed online. We have also implemented a propensity score matching technique to further control for observable variables. In addition, we substantiated our conclusions by computing the Rosenbaum bound of unobservable effects. Nevertheless, there are still correlated unobservables, such as aversion to advertising, that should be handled in future work, perhaps using an experimental setting. Furthermore, richer data about the local (person-to-person) social activity of consumers might provide interesting insights into the extent and nature of peer influence on the subscription decision. Finally, our research focuses on consumers’ usage levels in the period prior to the subscription decision. An extension of the research to post-purchasing behavior (e.g., through the use of panel data) could provide additional support to our findings. We encourage fellow researchers to further investigate how new social possibilities can be incorporated into digital business strategies.

Acknowledgments

We thank Ravi Bapna, Jacob Goldenberg, Vijay Gurbaxani, Arun Sundararajan, and Barak Libai, and seminar participants at the University of California, Irvine; New York University; Indiana University; the International Conference on Information Systems; the Marketing Science Conference; and the Workshop on Information in Networks for their feedback. We would also like to thank Oren Ziv for introducing us to Last.fm. Financial support from the Google Inter-University Center for Electronic Markets and Auctions, the Henry Crown Institute of Business Research, and the Rothschild-Caesarea fund is gratefully acknowledged.
Huberman, B. A., Romero, D. M., and Wu, F. 2009. “Crowdsourcing, Attention and Productivity,” Journal of Information Science (35:6), pp. 758-765.

Hung, J. 2010. “Economic Essentials of Online Publishing with Associated Trends and Patterns,” Publishing Research Quarterly (26:2), pp. 79-95.

Jain, S. 2008. “Digital Piracy: A Competitive Analysis,” Marketing Science (27:4), pp. 610-626.

Joyce, E., and Kraut, R. E. 2006. “Predicting Continued Participation in Newsgroups,” Journal of Computer-Mediated Communication (11:3), pp. 723-747.

Kim, A. J. 2000. Community Building on the Web, Berkeley, CA: Peachpit Press.

Lampe, C., and Johnston, E. 2005. “Follow the (Slash) Dot: Effects of Feedback on New Members in an Online Community,” in Proceedings of the 2005 International ACM Conference on Supporting Group Work, New York: ACM Press, pp. 11-20.

Lave, J., and Wenger, E. 1991. Situated Learning: Legitimate Peripheral Participation, Cambridge, UK: Cambridge University Press.

Li, C., and Bernoff, J. 2008. Groundswell: Winning in a World Transformed by Social Technologies, Boston, MA: Harvard Business Review.

Lu, Y., and Ramamurthy, K. 2011. “Understanding the Link Between Information Technology Capability and Organizational Agility: An Empirical Examination,” MIS Quarterly (35:4), pp. 931-954.

Mann, H. B., and Whitney, D. R. 1947. “On a Test of whether One of Two Random Variables Is Stochastically Larger than the Other,” Annals of Mathematical Statistics (18), pp. 50-60.

Manski, C., and Lerman, L. 1977. “The Estimation of Choice Probabilities from Choice-Based Samples,” Econometrica (45:8), pp. 1977-1988.

Meyer, J. P., and Allen, N. J. 1991. “A Three-Component Conceptualization of Organizational Commitment,” Human Resource Management Review (1:1), pp. 61-89.

Mithas, S., and Krishnan, M. S. 2009. “From Association to Causation via a Potential Outcomes Approach,” Information Systems Research (20:2), pp. 295-313.

Neuberger, C., and Nuernbergk, C. 2010. “Competition, Complementarity or Integration?,” Journalism Practice (4:3), pp. 319-332.

Nielsen. 2010. “Changing Models: A Global Perspective for Paying for Content Online” (http://in.nielsen.com/site/documents/PaymentforOnlineContent.pdf).

Nielsen. 2012. “Nielsen SoundScan Website” (http://nielsen.com/us/en/industries/media-entertainment.html).

Nonnecke, B., and Preece, J. 2000. “Lurker Demographics: Counting the Silent,” in Proceedings of Annual ACM Conference on Human Factors in Computing Systems, New York, NY: ACM Press, pp. 73-80.

O’Reilly, T. 2005. “What is Web 2.0: Design Patterns and Business Models for the Next Generation of Software” (http://www.reillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html).

Parameswaran, M., and Whinston, A. B. 2007. “Research Issues in Social Computing,” Journal of the Association for Information Systems (8:6), pp. 336-350.

Pauwels, K., and Weiss, A. 2008. “Moving from Free to Fee: How Online Firms Market to Change Their Business Model Successfully,” Journal of Marketing (72:3), pp. 14-31.

Pew Research Center. 2010. “Understanding the Participatory News Consumer” (http://pewresearch.org/pubs/1508/internet-cell-phone-users-news-social-experience).

Picard, R. G. 2000. “Changing Business Models of Online Content Services: Their Implications for Multimedia and Other Content Producers,” International Journal on Media Management (2:2), pp. 60-68.

Posner, R. A. 2005. “Bad News,” New York Times, July 31 (http://www.nytimes.com/2005/07/31/books/review/31POSNER.html?pagewanted=print).

Preece, J., and Shneiderman, B. 2009. “The Reader-to-Leader Framework: Motivating Technology-Mediated Social Participation,” AIS Transactions on Human–Computer Interaction (1:1), pp. 13-32.

Pritchard, M. P., and Howard, D. R. 1997. “The Loyal Traveler: Examining a Typology of Service Patronage,” Journal of Travel Research (35:4), pp. 2-10.

Raju, J., Srinivasan, S. V., and Lal, R. 1990. “The Effects of Brand Loyalty on Competitive Price Promotional Strategies,” Management Science (36:3), pp. 276-304.

Riggins, F. J. 2003. “Market Segmentation and Information Development Costs in a Two-Tiered Fee-Based and Sponsorship-Based Web Site,” Journal of Management Information Systems (19:3), pp. 69-81.

Rob, R., and Waldfogel, J. 2006. “Piracy on the High C’s: Music Downloading, Sales Displacement, and Social Welfare in a Sample of College Students,” Journal of Law and Economics (49:1), pp. 29-62.

Rosenbaum, P. R. 2002. Design of Observational Studies, New York: Springer.

Rosenbaum, P. R., and Rubin, D. B. 1983. “The Central Role of the Propensity Score in Observational Studies for Causal Effects,” Biometrika (70:1), pp. 41-55.

Rubin, D. B. 1980. “Bias Reduction Using Mahalanobis-Metric Matching,” Biometrics (36), pp. 293-298.

Sambamurthy, V., Bharadwaj, A., and Grover, V. 2003. “Shaping Agility through Digital Options: Reconceptualizing the Role of Information Technology in Contemporary Firms,” MIS Quarterly (27:2), pp. 237-263.

Sen, B., Shin, J., and Sudhir, K. 2011. “Demand Externalities from Co-Location: Evidence from a Natural Experiment,” Working Paper, Yale University, New Haven, CT.

Shankar, V., Rangaswamy, A., and Pusateri, M. 1999. “The Online Medium and Customer Price Sensitivity,” eBusiness Research Center Working Paper 04-1999, Pennsylvania State University, College Park, PA.

Srinivasan, S. S., Anderson, R., and Ponnavolu, K. 2002. “Customer Loyalty in E-commerce: An Exploration of its Antecedents and Consequences,” Journal of Retailing (78:1), pp. 41-50.

Straub, D., and Watson, R. 2001. “Transformational Issues in Researching IS and Net-Enabled Organizations,” Information Systems Research (12:4), pp. 337-345.

Sun, M., and Zhu, F. 2012. “Ad Revenue and Content Commercialization: Evidence from Blogs,” Working Paper (available at http://dx.doi.org/10.2139/ssrn.1735696).
Teece, D. J. 2010. “Business Models, Business Strategy and Innovation,” Long Range Planning (43:2-3), pp. 172-194.

Villanueva, J., Yoo, S., and Hanssens, D. M. 2008. “The Impact of Marketing-Induced vs. Word-of-mouth Customer Acquisition on Customer Equity,” Journal of Marketing Research (45:1), pp. 48-59.

Weill, P., and Vitale, M. R. 2001. Place to Space, Boston: Harvard Business School Press.

Wenger, E. 1998. Communities of Practice: Learning, Meaning and Identity, Cambridge, UK: Cambridge University Press.

Wheeler, B. C. 2002. “NEBIC: A Dynamic Capabilities Theory for Assessing Net-Enablement,” Information Systems Research (12:4), pp. 337-345.

Wiener, Y. 1982. “Commitment in Organizations: A Normative View,” Academy of Management Review (7), pp. 418-428.

Ye, L. R., Zhang, Y., Nguyen, D. D., and Chiu, J. 2004. “Fee-Based Online Services: Exploring Consumers’ Willingness to Pay,” Journal of Technology and Information Management (13:2), pp. 134-141.

Yoo, Y. 2010. “Computing in Everyday Life: A Call for Research on Experiential Computing,” MIS Quarterly (34:2), pp. 213-231.

Yoo, Y., and Alavi, M. 2004. “Emergent Leadership in Virtual Teams: What Do Emergent Leaders Do?,” Information and Organization (14), pp. 27-58.

Zeithaml, V. A. 1988. “Consumer Perceptions of Price, Quality, and Value: A Means-End Model and Synthesis of Evidence,” Journal of Marketing (52), pp. 2-22.

Zeithaml, V. A., Berry, L. L., and Parasuraman, A. 1996. “The Behavioral Consequences of Service Quality,” Journal of Marketing (60), pp. 31-46.

About the Authors

Gal Oestreicher-Singer is an assistant professor at Tel Aviv University’s Recanati Business School. Her research studies the effects of visible networks on electronic markets and the economics of digital rights management. Her prior research has been published in leading journals including Management Science, Information Systems Research, and Journal of Marketing Research. She received the 2008 ACM SIGMIS Best Dissertation Award, a European Union Marie Curie Early Career Award, an INFORMS CIST Best Paper Award, an ICIS Best Overall Paper award, an MSI-WIMI User Generated Content Research Competition Award, and the Google-WPP Marketing Award. She received her Ph.D. from New York University in 2008, and holds degrees in law and electrical engineering from the Hebrew University in Jerusalem and Tel Aviv University.

Lior Zalmanson is a doctoral student at Tel Aviv University’s Recanati Business School. He has a B.Sc. in computer science and an M.Sc. in information systems and technology management from Tel Aviv University. His experience includes the management of social computing and Web 2.0 initiatives in a large governmental organization in Israel. His research interests include online communities and economic behavior in virtual environments, in particular Internet business models and pricing of digital goods.

Appendix A
Comparison of Means Before and After Propensity Score Matching