Big Bet Review Fall 2009
Bernardo Huberman Anupriya Ankolekar Michael Brzozowski Leslie Fine Scott Golder Tad Hogg Gabor Szabo Dennis Wilkinson Fang Wu
Executive Summary
The past decade has witnessed a momentous transformation in the way people use computers and the Internet to interact and exchange information. Content is now coactively produced, shared, classified, and rated on the Web by millions of people. Mobile devices providing easy, continuous access to Web services have become common. Commerce, social networking, opinion formation, and large collaborative efforts are increasingly taking place online. This collective intelligence "cloud" represents a new phenomenon which is as yet poorly understood and poorly leveraged by existing applications and services. Since the cloud also represents great market potential, it is important to have a lively research program in this domain.
Our research goal is to improve the value that users get from the collective intelligence cloud in an increasingly mobile and connected world. We will do this by improving our understanding of how information is created, evaluated and consumed online, and by designing, constructing and validating innovative systems which will confer a market advantage to HP. Our proposed project is large and interdisciplinary, with a number of interdependent initiatives.
One of the most interesting and daunting consequences of the prevalence of the Web and digital media is that information, which used to be scarce and therefore valuable, is now so ubiquitous as to be almost devoid of monetary value. Search engines, billions of websites, targeted advertising and easy access to digital content provide us with an overabundant supply of information for our business and entertainment needs. The value has now shifted to users' attention and to tools for harvesting useful, contextual, trustworthy knowledge from the flood of information. These areas make up the core of our research program.
Specifically, we will investigate the problem of attention allocation in information-rich environments and its interplay with content novelty and popularity as well as user history and social standing. To address this problem, we will develop mathematical models for people's interactions with each other and with the available information. We will verify these models by analyzing existing cloud initiatives and by conducting laboratory and online experiments. From these insights we will create algorithms and methods for optimizing the presentation and rating of information in order to maximize its value to both users and providers.
In addition, we will pursue a research program in designing and developing services for personalized information access in mobile scenarios. We will develop a number of prototypes for capturing and monitoring personal context and then presenting users with customized, relevant results for their mobile Web information searches. These prototypes will be designed and implemented taking into account the limited attention of the user in mobile environments. We will validate these prototypes with formative and summative user evaluations via field studies in natural mobile environments, and through laboratory experiments. Besides the design of useful prototypes, this work will also yield insights into which aspects of
HP Confidential
Page 1 of 23
users' context are most relevant for situation-specific personalization and how these can be effectively exploited.
A third focus of our research will be services in the enterprise. We will build services to incentivize and facilitate knowledge sharing and harvesting through mechanisms of attention, status, and reputation. These services will incorporate what we have learned from our analysis, our empirical work, and our experiments. Conversely, experience from actual use of these services will improve our models and suggest further experiments.
The research impact of our project will be advancement of the state of the art, papers in high-profile conferences, and exposure for HP Labs in this exciting and relevant field. The business impact will be tools and applications for collaboration and knowledge sharing within the enterprise, for providing personalized information services in mobile environments, and for optimizing the presentation of information on a website. The knowledge sharing tool and the applications for personalized information in mobile environments will be available for use within HP by employees or as a service HP can provide to customers; algorithms for optimizing website information presentation will be applied to HP's websites and be available as a service for customers.
We expect that the project will require four years and ten to twelve team members. Specifics of budget and personnel requirements and timeline are presented in sections 4 and 5.
1 Research Contributions
Motivation
An information ecosystem exists when people who have information ("producers") connect with people who desire information ("consumers"). Ideally, such an ecosystem acts to motivate producers to share and to help consumers identify information that will bring them the most value. Examples include markets where customers buy products that best fulfill their needs, knowledge reuse and sharing within an organization, and commercial media like newspapers where information and advertising compete for visual real estate.
[Figure: the information ecosystem. Producers (people who have info) and Consumers (people who can use info), connected by Motivation, Reward, and Feedback. Dated 14 March 2008.]
When the information exposure of individuals is high, an information-rich economy ensues in which there is keen competition for people's attention (Falkinger 2007). It is clear that we find ourselves in such a regime today, in which information can be created and exchanged anytime and anywhere with great fluidity and ease. This situation presents a challenge for content providers, who need to decide what to prioritize in order to get people's attention, and for content consumers, who attempt to extract value from the flood of available information. A key intermediary, both on the information production side and on the consumption side, is a user's context. Context is based both on transient properties like a user's geographical location, schedule, tasks, and available device modalities, and on more persistent features like a user's interests, personal tastes, and social network. Context affects how -- and why -- content gets produced and consumed. The economics of attention, the design of information ecosystems, and the role of context therein thus represent three relevant and interrelated areas of research which we propose to study. A clear understanding of all three will allow us to implement powerful tools which bridge the barriers to perfect sharing of and access to information.
Context
A significant proportion of the information explosion on the Web comprises personal content and interaction, as people write about personal events, activities, and the places where they happened; upload photos; and review restaurants, books, cameras, etc. (Horrigan 2007). The world has never been flooded with so much implicit and explicit information about people's actions, preferences, opinions and hopes. In addition to personal content, web pages are being annotated with all kinds of information. For example, photos are increasingly annotated with the location where they were taken, web pages on Wikipedia include geographical coordinates of the places they talk about, and blogs carry markup about people and their friends and blogroll acquaintances.
As people do things in the real world and blog about it, or use online services to search for information about a particular place, they are creating traces of their location-related activities online. From an individual's point of view, there are several kinds of traces: personal traces (made by me), social network traces (made by my family, friends and acquaintances), Web community traces (made by people whom I don't know) and potentially even historical event traces. We aim to visualize and exploit these traces of community and individual activity within physical spaces to make them more meaningful and useful to people. For example, when visiting a new place, photos and reviews made by my friends (or, failing that, other people) can be pulled up from the Web. Social traces might be used to show traces of a friend nearby or show a visitor where the lonely and crowded places are in a new city. The overwhelming majority of this information, and the way it is presented, is not personalized to the user's context, impoverishing its potential value. An illustrative and persuasive example is the less-than-compelling mobile Web experience. Attention is at a premium in mobile environments and requires radically different design (Brandt 2007). Context determines what people direct attention to. Several systems have explored using location, a specific component of context, for customized information access, e.g. Geonotes for location-based notes as virtual noticeboards around physical spaces (Persson 2002), InfoRadar to promote group and public interactions in physical spaces (Rantanen 2004), Cybreminder for location-based reminders (Dey 2000) and comMotion for location-based personal notes created and retrieved by speech (Marmasse 2000). These systems were developed independently of the Web, which until now was not reliably available for mobile devices.
The recent advance of geotagging (http://en.wikipedia.org/wiki/Geotagging) of websites, photos, blogs, and RSS feeds on the Web, together with the introduction of geographical annotation languages like KML (introduced initially for Google Earth), has vastly increased the extent of location-based information currently available. What is required now are ways to tap the Web as an unparalleled medium for content creation and interaction by making such information accessible on mobile devices in a way that is adapted to and appropriate for the user's context. The mechanism and interface for information access must be designed with consideration for the limited attention people have in mobile environments. Even using contextual information, we expect that attention scarcity during mobility will require a careful choice of which information items to present to people (Brandt 2007). Although we have focused our discussion on the mobile Web, context is also highly salient for information retrieval on the desktop. The context of information creation, including location, intention, and actions, often serves as an important retrieval cue for email (Ducheneaut 2004) and documents (Blanc-Brude 2007). However, it is currently poorly used for retrieving and indexing documents and other information on personal computers.
lates with effort involved, making it time-consuming to contribute to traditional knowledge bases. But there are lots of other valuable resources that can be shared, many offering a window into people's tacit knowledge. In addition, experts tend to describe their domain knowledge in more abstract conceptual terms, while novices tend to express their questions in more concrete terms (Sternberg 1997). This makes it difficult for information stored in a knowledge base by experts to be located and retrieved by novices. Simple keyword searches are often ineffective in locating relevant resources and people, suggesting that higher-level topic detection and language processing may be needed. Enterprises are an interesting and relevant microcosm of these issues, because there are often additional organizational disincentives to sharing information (Hinds 2003, Argote 1999, Fisher 1997). Pay-for-performance rewards pit workers against each other, formalized channels for knowledge management are too rigid, and there's often little reward for time expended helping others (Hinds 2003). Such contributions undoubtedly benefit the enterprise but are hard to quantify, and so recognizing meaningful contributions is difficult. A large organization has a multitude of potential people and resources to explore. A new post is made to one of HP's collaborative forums once every five minutes, so effective ways to filter and recommend content and people are required. An important consequence of "Web 2.0" technologies is that they enable a shift from classical document-centric collaboration to community-centric collaboration. Holbrook et al. (2008) argue that email is only suitable for point-to-point communication, rather than team collaboration. The new generation of computer-supported collaboration tools will enable not only knowledge reuse but also the support of distributed communities. IBM Research has explored the use of internal bookmark sharing (Millen 2006) and people tagging (Muller 2006).
Communities, particularly those involving large groups of people with easy entry and exit, face a free rider problem where users are tempted to benefit from the production of others while contributing little or nothing in return (Hardin 1968). If many people choose to free ride, the quality of the content and user feedback decreases dramatically. An example is file sharing on Gnutella (Adar 2000). Status is one powerful approach to overcome this free rider problem (Loch 2000). Status hierarchies are pervasive in groups, and their basis has been attributed to at least four causes: functionalism (Bales 1953), exchange theory (Blau 1964), symbolic interactionism (Stryker 1985), and dominance-conflict (Ridgeway 1995). These theories differ in the extent to which status hierarchies are viewed as cooperative or competitive behaviors, and in whether they benefit the group's productivity. Nevertheless, people strive for status in a group even at some monetary cost (Huberman 2004). Status is particularly appropriate for online communities with readily available quantitative measures of contribution or consumer ratings, which may be used to highlight the top individual contributors. The success of one of the biggest online community projects, the online encyclopedia Wikipedia, is a major example among voluntary initiatives and has thus attracted considerable research. A Web-based survey found that the two most important motivations to contribute were fun and ideology ("I think information should be free."), trumping other reasons such as career and social motives ("People I'm close to want me to write/edit in Wikipedia.") (Nov 2007). Altruism as displayed by Wikipedians is the exception rather than the rule, and designed incentive structures for contribution, such as comparisons with peers' performance, do not always have the anticipated effect either (Harper 2007).
Thus it remains an important question to identify conditions where motivations such as altruism and status are sufficient to encourage content creation.
Attention Allocation: understanding attention allocation in information-rich environments and its interplay with content novelty and popularity, user history and social standing and limited time resources; utilizing this understanding to optimize the presentation of information via dynamically configured displays.
Context: understanding how context affects the value of information, particularly in mobile settings; designing and developing services for context-aware personalized information access and creation in mobile scenarios.
Facilitating and motivating knowledge sharing: understanding and developing tools to facilitate and incentivize knowledge sharing and harvesting through mechanisms of attention, status, and reputation.
Obtaining relevant (and often sparse) data sets: In spite of rapidly growing online data sets, obtaining just the kind of data required can be difficult. For example, the number of instances of specific information use by particular users, or of information around specific locations, might be small. This is particularly true for social network and location-based data. This sparseness limits the ability to infer niche user interests directly from available data. Supplementing this data with inferences of similarity among users can improve this situation, but requires both accurate models of user similarity and relationship information, such as social networks and context, from user data. This data often confounds multiple relationship types as a single link between users and is only a partial view of how users relate to and influence each other. Thus a major challenge is developing techniques to make best use of the available data.
Data cleaning and analysis: In working with real-world data, it can be challenging to isolate the relevant effect being studied and control for other variables. Cleaning the data to ensure accurate results is also challenging and time-consuming.
Location accuracy: For location-based services, limited position accuracy (e.g., from cell phone tower locations or GPS) can significantly affect the accuracy of the results provided. This is one area where we have limited control, but where we expect the state of the art to improve significantly.
Automatically determining (relevant aspects of) context: Although we will have access to lots of data from people's devices, determining personal context (e.g. activity) and social context (e.g. social network of friends) automatically from this data is quite challenging and will need sophisticated and creative heuristics.
Design for small displays in mobile contexts: Mobile devices have particularly limited visual real estate, which makes it hard to design good interfaces for them. We will attempt to utilize the physical and situated nature of mobile devices and abstract information representations to design natural and unobtrusive interfaces that make optimal use of the available input/output modalities.
Measuring quality of knowledge sharing: We will use various metrics to assess the usefulness of individual contributions, but judging the quality of the contributions and the extent of knowledge sharing from a combination of metrics will be a challenge.
Organizational adoption and implementation of incentives: Successful adoption of groupware tools relies on a subtle interplay of network effects and individual need. In addition, exploring and implementing organizational incentives can be rather tricky. We will focus primarily on social and psychological incentives and on organizational appreciation of contributions.
1.4 Approach
To address these issues we propose an interdisciplinary approach combining research in economics, information science, human-computer interaction, statistics and data modeling, and social networks. Our research methodology combines empirical observational studies, analytical modeling, experimental lab studies, social network analysis, and the design and construction of prototypes validated in field studies. Specifically, we will investigate the economics of attention and develop algorithms and methods for optimizing the presentation of information so as to maximize its value to both users and providers. In addition, we will design tools for increasing the value users get from information by augmenting it with spatial, temporal and social contexts. Furthermore, we will investigate and design suitable incentives for people to create and consume content. These tools and incentives also apply to enterprise information systems, with differences arising from restricting participation to members of the organization and, perhaps, their customers and business partners, while respecting the proprietary nature of the information.
A. Attention Allocation
A.1 Economics of Attention
We intend to develop general models of attention, not as a phenomenon that happens in people's heads, but in their interactions with each other and through media. Attention in this sense is measured by the intensity and density of signals that relate to a particular website, article, artifact, review, research program, etc. We also want to understand how attention to novel items propagates and eventually fades among large populations, a key issue for the success of products and ideas. The methodology to be followed will be to develop analytical models with predictive power and to justify them by making measurements on very large datasets consisting of millions of individuals attending to given news or other kinds of media. We will also consider the problem of resource allocation for advertising many products across several websites, taking into account exposure levels. We will then show how one can determine the optimal allocation of resources across several websites, both in the case of a single provider and in the case of many competing ones.
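As a minimal sketch of the single-provider allocation problem mentioned above, suppose (purely as an illustrative assumption; the proposal does not specify a response model) that the attention gained on website i from spending x_i is w_i * sqrt(x_i), i.e. exposure has diminishing returns. Maximizing the total under a fixed budget B with a Lagrange multiplier gives x_i = B * w_i^2 / sum_j w_j^2. The function names, weights and budget below are hypothetical.

```python
from math import sqrt

def allocate_budget(budget, weights):
    """Split a budget across sites to maximize sum(w_i * sqrt(x_i)).

    Assumes a square-root (diminishing returns) response per site,
    which is an illustrative choice, not a model taken from the text.
    """
    total = sum(w * w for w in weights)
    return [budget * w * w / total for w in weights]

def total_attention(spend, weights):
    """Total attention obtained under the assumed square-root response."""
    return sum(w * sqrt(x) for w, x in zip(weights, spend))

# Hypothetical example: three sites with different effectiveness weights.
weights = [3.0, 2.0, 1.0]
budget = 140.0

optimal = allocate_budget(budget, weights)          # spend proportional to w_i squared
uniform = [budget / len(weights)] * len(weights)    # naive equal split, for comparison

print("optimal split:", optimal)
print("attention (optimal):", round(total_attention(optimal, weights), 2))
print("attention (uniform):", round(total_attention(uniform, weights), 2))
```

The competitive case with many providers would need a game-theoretic treatment and is not covered by this sketch.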
ited time. Nor has there been an exploration of the explicit tradeoff they make between sufficient information and sufficient effort. These questions are ripe for exploration in our experimental economics lab. Answering these fundamental questions of how people satisfice task solutions given limited time will inform much of our work, in particular dynamic and context-aware configuration of information displays.
vices that utilize aspects of an individual's context (such as location, interests, tasks etc.) and social network (i.e. communities) in order to provide a unique tailored experience.
specialized to varying formats of expression allow users to integrate these practices into their daily work. Although we focus on enterprises, many of the insights gained from this work can also be generalized to other distributed collaborative systems, such as Web forums, open source networks, or consumer applications. To answer "who knows what" or "who does what", we will bring all shared resources together under one roof so people can go to a centralized service to monitor their peers' shared activity. It then becomes much easier for people to contribute to this system, since they have a variety of media to choose from. There are a variety of potential incentives for people to share information:
Economic. People receive some sort of financial reward or bonus.
Organizational. People receive higher status within an organization (Huberman 2004).
Social. People feel they are supporting a community or somehow reciprocating by participating, or that they are building up a reputation for themselves.
Psychological. People feel good about helping others, showing off their knowledge, and commanding the attention of their peers.
We intend to explore combinations of these incentives to find the right mix to encourage information sharing. Perhaps an additional motivation for people to contribute information is knowing that it will eventually be used for something. We propose to explore these issues by building and evaluating systems for enterprise audiences.
Explicit feedback from information consumers to authors. For instance, comments left on a blog are a form of public validation of a post's value.
Implicit feedback from observing diffusion and consumption of a contribution. For example, tracking how many people click on a post, forward it, or save it for later retrieval can identify popular or useful items. Exposing to authors the attention given to their content by consumers yields a psychological reward, as described in the previous section.
Moderation systems where users vote on the usefulness of an idea or comment. Both submitting content and voting are rewarded, to encourage people to help evaluate posts. To encourage people to evaluate new posts as well, larger rewards are offered for previously unrated items. Authors of posts or comments that are favorably voted on by peers also receive greater rewards.
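The moderation reward rules described above can be sketched as follows. This is only an illustration under stated assumptions: the point values, the class and method names, and the use of +1/-1 votes are all hypothetical and not taken from any existing HP system.

```python
from dataclasses import dataclass, field

# Hypothetical point values; the text gives no concrete amounts.
SUBMIT_REWARD = 5          # reward for submitting content
VOTE_REWARD = 1            # base reward for casting a vote
FIRST_VOTE_BONUS = 2       # extra reward for rating a previously unrated item
UPVOTE_AUTHOR_REWARD = 3   # reward to the author for each favorable vote

@dataclass
class Post:
    author: str
    votes: list = field(default_factory=list)  # each vote is +1 or -1

class RewardLedger:
    """Tracks reward points per user under the rules sketched above."""

    def __init__(self):
        self.points = {}

    def _credit(self, user, amount):
        self.points[user] = self.points.get(user, 0) + amount

    def submit(self, author):
        # Submitting content is rewarded.
        self._credit(author, SUBMIT_REWARD)
        return Post(author)

    def vote(self, voter, post, value):
        # Voting is rewarded, with a bonus if the post had no votes yet.
        bonus = FIRST_VOTE_BONUS if not post.votes else 0
        self._credit(voter, VOTE_REWARD + bonus)
        # Authors of favorably rated posts receive an additional reward.
        if value > 0:
            self._credit(post.author, UPVOTE_AUTHOR_REWARD)
        post.votes.append(value)

# Usage with hypothetical users.
ledger = RewardLedger()
post = ledger.submit("alice")
ledger.vote("bob", post, +1)     # first vote on the post: base reward plus bonus
ledger.vote("carol", post, +1)   # later vote: base reward only
print(ledger.points)
```

A real system would also need to prevent self-voting and repeated votes by the same user, which this sketch leaves out.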
tually exist that enable them to replicate the process of discovery that works so effectively in stores. Recommendation engine vendors such as Certona and Aggregate Knowledge and social shopping tools like Kaboodle are certainly steps toward resolving this problem. 34% of consumers who noticed recommendations purchased products based on those recommendations (North American Technographics Retail And Customer Service Online Survey Q2 2007). Most of the current engine vendors in this space are small and privately held, which makes revenues hard to estimate. However, a recent Forrester report (Forrester 2007) lists Aggregate Knowledge, Baynote, Certona, Criteo, and CleverSet all at revenues of $3-10m. Nearly every firm in this space is small, with only a short track record. A powerful tool released by HP is likely to give larger retailers a greater degree of comfort than one from these fledgling startups.
effort. As Google, Yahoo!, Nokia and other competitors consolidate and escalate their offerings, this window may not be open for too long. In the following calculations, we make a key assumption: that we only offer our services to devices with GPS. This need not hold, but it gives a conservative lower bound on the devices we can service. Our services are usable by people who own mobile phones that are Web-enabled and have GPS. According to Gartner, worldwide mobile phone sales for 2007 are on the order of 1.134 billion units. Today approximately 12% of handsets sold ship with a GPS chip, and this is expected to increase to nearly 40% by 2010, or roughly 500 million handsets (Gartner 2007). We expect our services to be most easily usable on smartphones. Market data from Gartner shows that smartphones accounted for roughly 8.8% of mobile handset unit sales for 2006, with 83.3 million smartphones sold worldwide. This market is anticipated to grow dramatically as consumers start purchasing smartphones, at a 5-year CAGR of 54.1%, with unit sales reaching 466.4 million by 2010 (Morgan Keegan & Co. 2007). Thus, a conservative estimate of the total available market by 2010 for our services is around 700 million units. We expect our services to be used primarily by young consumers who are seeking information while out and about. A mobile subscriber survey by M:Metrics in March 2007 found that browsing news and information is the most-used mobile Internet application in the US (9.6% of survey respondents), UK (13.3%) and France (7.5%) and a close second in Germany and Italy. Assuming very conservatively that this proportion stays constant, it implies that by 2010 around 67 million handsets will be used primarily for searching for information and news. Assuming HP captures a 20% market share as the primary application for information search, this is equivalent to about 13.5 million people.
Overall, the market for location-based services was about $149 million in 2006 and is expected to grow to $3.2 billion by 2010, a 116% growth rate, the highest for any mobile service besides mobile video (Morgan Keegan & Co. 2007). There are several possible business models for providing personalized mobile services. One possibility is to focus on our application interface on the mobile device and accordingly sell the application itself as a software package. Another possibility, in line with HP's strategic emphasis on cloud services, is to offer personalized contextual information as a subscription service. The danger here is that, despite the obvious benefits we provide in aggregating and personalizing information, people are used to the free Web experience and might not be willing to pay for it when they are mobile. A variation here would be to partner with mobile phone carriers, such as Verizon and T-Mobile. If we earn $1 per month from each of T-Mobile's 25 million subscribers by offering this service free to them, this translates into an annual revenue stream of $300 million. Note that besides our servers, there is hardly any production cost. Finally, we could rely on the ubiquitous Internet business model of providing the service for free and then selling the attention of our users to advertisers. Assuming people make 5 mobile information searches daily using our context-based services and our prime market is 13.5 million people, this is 67.5 million information searches daily. If we earn about $1 per 1000 searches, this translates to a very conservative estimate of $20.25 million per year from just mobile advertising. The market size for mobile advertising in 2006 was about $33.2 million, but it is projected to be $4 billion by 2011 (Kelsey Group 2007, IDC 2007). Mobile search advertising sales are expected to balloon from $33.2M this year to $1.4B by 2012 (IDC 2007), so this might be a very profitable strategy.
Given that the motivation of our work is that mobile users have scarce attention resources, we should be very careful in how saliently we present advertising to mobile users.
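The market-sizing arithmetic above can be reproduced in a few lines. All inputs are figures quoted in the text; the variable names are illustrative. One caveat: at $1 per 1000 searches, 67.5 million searches per day yield $67,500 per day, which is about $24.6 million over 365 days. The quoted $20.25 million corresponds to roughly 300 days of use per year; which was intended is not stated, so both are shown.

```python
# Market funnel, using the figures quoted in the text.
total_market_2010 = 700e6      # handsets: stated total available market by 2010
news_browsing_share = 0.096    # US share from the M:Metrics survey
hp_share = 0.20                # assumed HP market share

info_handsets = total_market_2010 * news_browsing_share   # about 67 million
hp_users = info_handsets * hp_share                       # about 13.4 million (text: 13.5 million)

# Carrier partnership: $1 per subscriber per month.
carrier_subscribers = 25e6
carrier_revenue_per_year = carrier_subscribers * 1.0 * 12  # $300 million

# Advertising: 5 searches per user per day at $1 per 1000 searches.
prime_market = 13.5e6
searches_per_day = prime_market * 5                        # 67.5 million
ad_revenue_per_day = searches_per_day / 1000 * 1.0         # $67,500

ad_revenue_365_days = ad_revenue_per_day * 365             # about $24.6 million
ad_revenue_300_days = ad_revenue_per_day * 300             # $20.25 million, as quoted

print(f"info handsets:     {info_handsets / 1e6:.1f} million")
print(f"HP users:          {hp_users / 1e6:.2f} million")
print(f"carrier revenue:   ${carrier_revenue_per_year / 1e6:.0f} million per year")
print(f"ad revenue (365d): ${ad_revenue_365_days / 1e6:.2f} million per year")
print(f"ad revenue (300d): ${ad_revenue_300_days / 1e6:.2f} million per year")
```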
recommend content, nor is it capable of dealing with custom applications outside the Lotus system. Microsoft's Knowledge Network mines expertise and collaboration patterns from users' email, but the privacy implications are dire, requiring end users to decide which emails can be indexed from a client-side tool. Other prominent vendors moving into this space include BEA, Oracle, and SAP (Koplowitz 2007). Transferring expertise across an organization is difficult; a 1998 survey found that only 13% of American and European organizations thought they were doing a good job at this (Ruggles 1998). Meanwhile, consumers have adopted distributed Internet services as a means of sharing and finding information. Web 2.0 services like del.icio.us, Digg, and Facebook enable people to discover resources from their social networks. A whole new generation is entering the workforce expecting to be able to collaborate the same ways at work: more efficiently, rapidly, and at lower cost (Koplowitz 2008). As a result, enterprise social software is projected to be a $3 billion market by 2011 (Radicati 2007) and to grow by 41% annually over the next four years (Eid 2007). Worldwide, collaborative applications revenue is projected to be an $8 billion market by 2011, of which $3 billion will be integrated groupware systems (Levitt 2007). Additionally, Gartner (2007) predicts that enterprise content management software will be a $5 billion market by 2011. Forrester reports that in 2008, one in three businesses in North America and Europe is planning to invest in "Web 2.0 tools--namely wikis, blogs, and RSS" (Young 2008). McKinsey reports that nearly half of executives familiar with Web 2.0 technologies are planning to invest in collective intelligence, peer-to-peer networking, or social networking tools (McKinsey 2007). Currently, most firms looking to implement social software are large enterprises of 1000 employees or more (Young 2008).
However, the market is still growing and there is still room for innovation, although the window of opportunity is closing. Potential business models for commercialization include offering an integrated social software system as a service in our portfolio for our IT outsourcing customers, or as a software package for companies that manage their own IT. Since much of an enterprise's knowledge is proprietary, IT customers may not want to expose it outside their firewall, implying a market for custom delivery engineering. Eventually it may be possible to consider gateways for companies to export limited sections of relevant content to business partners, providing an opportunity for HP to leverage its broad IT customer portfolio.
3 Team
3.1 Team Members
Bernardo Huberman is a Senior Fellow and Director of the Social Computing Lab. He has worked extensively on the nature and dynamics of the Web as well as on the design of mechanisms for harvesting knowledge from large distributed groups that are in use today. His interest in the ecology of information led him to focus on the economics of attention as one of the key drivers for the production and consumption of information on the Web. He holds a Ph.D. in Physics from the University of Pennsylvania.
Anupriya Ankolekar is a Visiting Scholar for Semantic Web at HP Labs for the coming year. She will focus on developing personalized services for mobile information access. Her background is in human-computer interaction, online communities, especially open source software development communities, and Web technologies, especially the Semantic Web and Web services. She has experience in designing and implementing Web systems for collaboration and evaluating them in the field. She received a Ph.D. in human-computer interaction from Carnegie Mellon University. Presently she is a visiting scientist and her appointment ends in April 2009. We intend to keep her as a full member of the lab.
HP Confidential
Page 15 of 23
Mike Brzozowski's research foci are social networks, persuasive technology, and computer-supported collaboration. He applies user-centered interaction design and machine learning techniques to building collaborative software systems and adaptive user interfaces. Mike holds an MS degree from Stanford in computer science, specializing in human-computer interaction.
Leslie Fine is an applied game theorist, mechanism designer, and experimental economist. Her primary areas of interest are in market design, incentive systems within corporations, information flows, and novel experimental methods. Leslie received her Ph.D. from Caltech, where she studied information market design and incentive compatible mechanisms.
Scott Golder has been part of the Information Dynamics Lab for nearly three years, during which time he performed the first quantitative scholarly analysis of social tagging systems. His background is in the study of electronic communities and social hierarchies. He is experienced in the design of collaborative systems, especially for group annotation and information organization. He plans to leave for school in the fall of 2008 and will thus have to be replaced with someone with a similar set of skills.
Tad Hogg will focus on modeling content creation, sharing and use in online communities, designing incentive mechanisms and testing them experimentally. He has worked on various projects investigating aggregate behavior of groups in economic contexts, including mechanisms for establishing reputation, adjusting risk behaviors, information aggregation and developing economic applications of emerging technologies such as quantum information processing. He holds a Ph.D. in physics from Stanford.
Gabor Szabo's recent research has centered on networks in various natural systems, whose connections appear random at first but share intrinsically similar statistical properties. He has put heavy emphasis on social systems (online communities and interpersonal communication networks), where he applied stochastic modeling and computational tools to predict future behavior. He holds a Ph.D. in physics from the Budapest University of Technology.
Dennis Wilkinson has focused on quantitative and empirical studies of collaborative and peer production systems. This includes model fitting, theoretical analysis, and algorithm design in a variety of settings including email networks, engineering design efforts, and online communities. His background is in physics (Ph.D. Stanford) and he has experience with and a good understanding of a variety of mathematical and statistical methods.
Fang Wu's past research focused on social networks, mechanism design and stochastic modeling. He is presently committed to the economics of attention, striving to understand the central role that attention plays in many kinds of information systems. He does both empirical and theoretical work. He holds an M.S. in statistics and a Ph.D. in physics from Stanford University.
4 Resources Needed
We have outlined a fairly large research program in this proposal. This research requires a dedicated interdisciplinary team of about 10-12 researchers over 4 years that will span the range from analytics to empirical studies.
Upfront costs
For our mobile personalization work, we will require about $5000 for purchasing equipment, such as smartphones and servers for providing context-aware mobile services.
Annual expenses
In order to design and validate new models in our experimental economics lab, we will require a budget for compensating participants. We will begin with small-scale experiments in the lab and then test the more promising ideas via field tests (which involve some software development, running a web site for a period of time, and rewards for participants). While our needs may vary, we can assume two to three experimental projects per year, each requiring approximately 16 experiments with an average population of 16 students each. Our expected needs are between $40,000 and $60,000 annually. To compensate subjects in other user studies, which evaluate interaction techniques and system designs, we also need about $4000 annually.
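As a rough consistency check on the experimental budget, the short sketch below works through the arithmetic implied by the figures above. It assumes, purely for illustration, that the low end of the budget corresponds to two projects and the high end to three, and that the whole amount is divided evenly across participant sessions; in practice part of the budget would also cover software development and running a web site for field tests, so the per-session figure is an upper bound on what a participant would actually be paid. The function names are hypothetical.

```python
# Illustrative budget arithmetic; function names and the even-split
# assumption are ours, not part of the proposal.

def participant_sessions(projects, experiments_per_project=16,
                         subjects_per_experiment=16):
    """Total participant sessions per year for a given number of projects."""
    return projects * experiments_per_project * subjects_per_experiment


def implied_cost_per_session(annual_budget, projects):
    """Annual budget divided evenly across all participant sessions."""
    return annual_budget / participant_sessions(projects)


# Low end: two projects on $40,000; high end: three projects on $60,000.
low = implied_cost_per_session(40_000, 2)
high = implied_cost_per_session(60_000, 3)

print(participant_sessions(2), participant_sessions(3))  # 512 768
print(low, high)                                         # 78.125 78.125
```

Under these assumptions both ends of the range work out to the same figure, roughly $78 per participant session, which suggests the budget range scales with the number of projects rather than with the cost of each session.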
Milestones by project, in chronological order within each project:
A.1: Develop a methodology for maximizing the attach rates of consumers of HP shopping; apply it to the website and run field tests.
A.3: Integrate the social component with the dynamic display configuration system; transfer algorithms to commercial websites.
B.1: Design and implementation of configurable context filter; summative evaluation of context filter in natural mobile contexts and investigation of which aspects of context are most salient for information searches.
B.2: Early prototypes of peripheral information system; development of a system to present contextually relevant peripheral information (draws from Year 2 of B.1).
B.3: Initial design of context manager; design and implementation of context manager; user studies of effectiveness of context manager.
B.4, Mobile Content Creation (3 PY): Initial design of location-based messaging system (draws from Year 1 work of B.3); design and implementation of messaging and initial design of location-based aggregation/voting mechanisms; user study of ease of mobile content creation.
C.1, Rewarding substantive contributions (2 PY): Test and implement mechanisms for rewarding contributions. Verify the mechanism with user studies and data analysis.
C.2, Attention as Incentive for Peer Production (2 PY): Analyze data from a number of popular websites to test whether attention is an incentive for contributions.
C.3, Expertise Location (4 PY): Build tools to make resources and people findable. Verify the approach with user studies and ethnography. Build tools to deliver customizable, personalized information feeds. Verify the approach with user studies and experiments.
C.4: Apply reward mechanisms to idea generation and evaluation. Verify the approach with user studies and a large trial.
Evaluation of system.
Year 2
A.1: Apply methodology for maximizing the attach rates of consumers of HP shopping.
A.3: Verified stochastic models to describe observed behavior (both for social linking and influence propagation). A high-level algorithm to optimize the attention generated from an information display with space constraint.
B.1: Summative evaluation of context filter in natural mobile contexts and investigation of which aspects of context are most salient for information searches.
B.2: An early prototype of context-based peripheral information system.
B.3: A proof-of-concept implementation of a context manager for personal information devices.
B.4: Initial design of location-based messaging system, draws from Year 1 work of B.3.
C.3: Tools to deliver customizable, personalized information feeds, which have been developed using formative user studies and experiments.
Year 3
A.3: Identification of characteristics of online communities that show signs of extraordinary (positive or negative) influence, and anticipatory algorithms to detect and utilize unexpected dynamic spreading patterns from early times.
B.2: A proof-of-concept system that presents contextually relevant peripheral information to the user in mobile contexts; draws from Year 2 of B.1.
B.3: User studies of effectiveness of context manager.
B.4: Design and implementation of messaging and initial design of location-based aggregation/voting mechanisms.
C.1: Mechanisms for rewarding contributions, developed and verified through user studies and data analysis on the tools developed in Years 1 and 2.
C.2: Investigation of attention as incentive for contributions via data analysis from popular websites.
Year 4
A.3: Dynamic display configuration system with integrated social component developed in Year 3.
B.2: Large-scale field study of developed context-aware personalization services in mobile environments.
B.4: User study of ease of mobile content creation.
C.4: Refined reward mechanisms for idea generation and evaluation, which have been verified via user studies and a large field trial.
6 Metrics
In the 4 years of our research program, we will primarily measure our progress in two ways. One is through publications at select conferences and in select journals in the following areas:
Human-computer interaction: ACM SIGCHI Conference on Computer-Human Interaction (CHI), International Conference on Mobile Human Computer Interaction (Mobile HCI), International Conference on Ubiquitous Computing (Ubicomp), International Conference on Intelligent User Interfaces (IUI), IEEE Pervasive Computing
Information systems: ACM Conference on Computer-Supported Collaborative Work (CSCW), IEEE International Conference on Computer and Information Technology (CIT), European Conference on Information Systems (ECIS), International Conference on Information Systems (ICIS)
Web science: ACM World Wide Web Conference (WWW), IEEE/WIC/ACM International Conference on Web Intelligence (WI), Journal of Web Semantics, ACM Conference on E-commerce (EC), International Conference on E-Commerce (ICEC)
Secondly, we expect to develop a number of prototypes embodying our mechanisms, algorithms and concepts. Our research in building systems is also likely to lead to the creation of intellectual property for HP. Accordingly, we expect to file a number of invention disclosures and patents. Naturally, we intend to transfer our algorithms and systems to HP business units as appropriate and as soon as feasible.
Bibliography
Adar, E. and Huberman, B. A., Free Riding on Gnutella, First Monday 5(10), Oct. 2000.
Ahern, Shane, Dean Eckles, Nathan Good, Simon King, Mor Naaman and Rahul Nair, Over-exposed? Privacy Patterns and Considerations in Online and Mobile Photo Sharing, Proceedings of CHI 2007, April-May 2007, San Jose, CA, USA.
Bales, R., 1953. The equilibrium problem in small groups. In: Parsons, T., Bales, R.F., Shils, E.A. (Eds.), Working Papers in the Theory of Action. Free Press, Glencoe, IL, pp. 111-161.
Bikhchandani et al., 1992; A Theory of Fads, Fashion, Custom, and Cultural Change as Informational Cascades, J. of Political Economy 100:992.
Blanc-Brude, Tristan and Dominique Scapin, What do People Recall about their Documents? Implications for Desktop Search Tools, In Proceedings of IUI 2007, January 2007, Hawaii, USA.
Blau, P., 1964. Exchange and Power in Social Life. Wiley, New York.
Brandt, Joel, Noah Weiss and Scott Klemmer, Designing for Limited Attention, Proceedings of CHI 2007, April-May 2007, San Jose, CA, USA.
Brin, D., The Transparent Society: Will Technology Force Us to Choose Between Privacy and Freedom?, 1999.
Brown, J. and P. Duguid, Organizing Knowledge, California Management Review, 40, 3 (1998), 90-111.
Brzozowski, M., T. Hogg and G. Szabo, Friends and Foes: Ideological Social Networking, in Proc. of Computer-Human Interaction 2008.
Chen K., Fine L. R. and Huberman B. A., Eliminating Public Knowledge Biases in Information Aggregation Mechanisms, Management Science, 50 (2004), 983-994.
Chen K., Fine L. R. and Huberman B. A., Predicting the Future, Information Systems Frontiers, 5, 1 (2003), 47-61.
Chen, K. and T. Hogg, Aggregating Diffuse Information with Subgroups, Proc. of IADIS International Conference on e-Commerce, pp. 45-54, 2004.
Chen, K. and T. Hogg, Experimental Evaluation of an eBay-Style Self-Reporting Reputation Mechanism, Proc. of the Workshop on Internet and Network Economics, pp. 434-443, 2005.
Cho, J. and S. Roy, Impact of search engines on page popularity, Proceedings of the World-Wide Web Conference, 33, 7698, 2004.
Davenport, T. H. and Prusak L. Working Knowledge: How Organizations Manage What They Know. Harvard Business School Press, Boston, 1998.
Dey, Anind and Gregory Abowd, CybreMinder: A Context-Aware System for Supporting Reminders, In HUC 2000 Proceedings, LNCS 1927, pp. 172-186, Springer Verlag, 2000.
Domshlak, C., R. I. Brafman, S. E. Shimony, Preference-Based Configuration of Web Page Content, IJCAI, 2001.
Driver E. and R. Koplowitz. IBM Or Microsoft For Collaboration -- Or Both?, Forrester. August 6, 2007.
Ducheneaut, Nicholas and Leon Watts, In Search of Coherence: A Review of Email Research, Human-Computer Interaction, Vol. 20, No. 1&2: pages 11-48, 2005.
Eid T. and Drakos N., The Emerging Enterprise Social Software Marketplace, Gartner Group, gartner.com, 2007.
Falkinger, J., Attention economies, Journal of Economic Theory, 127:266-294 (2007).
Forrester, Which Personalization Tools Work For eCommerce And Why, 2007.
Gartner, Forecast: Enterprise Content Management Software, Worldwide, May 14, 2007.
Groves, T. and J. O. Ledyard. Optimal allocation of public goods: A solution to the free rider problem. Econometrica, 45:783-809, 1977.
Hahn and Tetlock, Information Markets: A New Way of Making Decisions, AEI Press, 2006.
Halvey, M., M. T. Keane, and B. Smyth, Mobile web surfing is the same as web surfing, Communications of the ACM, 49, 3, 2006.
Hardin, G., The Tragedy of the Commons, Science 162:1243 (1968).
Harper, M., X. Li, Y. Chen, J. Konstan, Social Comparisons to Motivate Contributions to an Online Community, Persuasive Technology 2007.
Hinds, P. J. and Pfeffer J. Why Organizations Don't 'Know What They Know': Cognitive and Motivational Factors Affecting the Transfer of Expertise. Ackerman M. S., Pipek V. and Wulf V. eds. Sharing Expertise: Beyond Knowledge Management. MIT Press, Cambridge, MA, 2003, 3-26.
Ho and Chen, 2007; New Product Blockbusters: The Magic and Science of Prediction Markets, California Management Review 50:144.
Hogg, T. and L. Adamic, Enhancing Reputation Mechanisms via Online Social Networks, Proc. of the 5th ACM Conference on Electronic Commerce, pp. 236-237, 2004.
Hogg, T., D. M. Wilkinson, G. Szabo, and M. Brzozowski, Multiple Relationship Types in Online Communities and Social Networks, in Proc. of AAAI Symposium on Social Information Processing 2008.
Holbrook J., B. Kotlyar, and J. Edwards, Community-Centric Collaboration Heats Up, Yankee Group, February 2008.
Holt and Laury, 2002; Risk Aversion and Incentive Effects, American Economic Review 92:1644.
Horrigan, John, Seeding The Cloud: What Mobile Access Means for Usage Patterns and Online Content, Pew Internet & American Life Project, March 2008.
Huberman B. and T. Hogg, Protecting Privacy While Revealing Data, Nature Biotechnology 20, 332 (2002).
Huberman, B., M. Franklin and T. Hogg, Enhancing Privacy and Trust in Electronic Communities, in Proc. of the ACM Conf. on Electronic Commerce (EC99), pp. 78-86, 1999.
Huberman, B., C. H. Loch and A. Onculer, Status as a Valued Resource, Social Psychology Quarterly 67:103-114 (2004).
IDC, Worldwide Converged Mobile Device 2007-2011, Forecast Update: December 2007.
Jones, Matt, Buchanan, George, Harper, Richard, and Xech, Pierre-Louis, Questions Not Answers: A Novel Mobile Search Technique, Proceedings of CHI 2007, April-May 2007, San Jose, CA, USA.
Koplowitz, R., The Big Vendors Converge on Enterprise Web 2.0, Forrester, October 3, 2007.
Koplowitz, R. and E. Driver, Walking The Fine Line Between Chaos and Control In The World of Enterprise Web 2.0, Forrester, February 21, 2008.
Kuflik, Tsvi, Sheidin, Julia, Jbara, Sadek, Goren-Bar, Dina, Soffer, Pnina, Stock, Oliviero and Zancanaro, Massimo, Supporting Small Groups in the Museum by Context-Aware Communication Services, In Proceedings of IUI 2007, January 2007, Hawaii, USA.
Levitt M. Worldwide Collaborative Applications 2007-2011 Forecast, IDC report #206167, 2007.
Loch, C., B. A. Huberman and S. Stout, Status Competition and Performance in Work Groups, J. of Economic Behavior and Organization 43:35-55 (2000).
Look, Gary and Schrobe, Howard, Towards Intelligent Mapping Applications: A Study of Elements Found in Cognitive Maps, In Proceedings of IUI 2007, January 2007, Hawaii, USA.
Marmasse, Natalia and Schmandt, Chris, Location-aware information delivery with comMotion, In HUC 2000 Proceedings, LNCS 1927, pp. 157-171, Springer Verlag, 2000.
McKinsey, How businesses are using Web 2.0, McKinsey Quarterly, 2007.
Millen D. R., Feinberg J. and Kerr B. Dogear: Social bookmarking in the enterprise. In CHI '06: Proc. of the SIGCHI conf. on Human Factors in computing systems. (Montreal, Quebec). ACM, New York, NY, 2006, 111-120.
Morgan Keegan & Co., What Happens When Wireless Truly Becomes The Next Computing Platform?, September 2007.
Muller M. J., Ehrlich K. and Farrell S. Social Tagging and Self-Tagging for Impression Management. TR06-02. IBM Research, Cambridge, MA, 2006.
Nov, Oded, What motivates Wikipedians?, Communications of the ACM 50, 60-64 (2007).
Pandey, S., S. Roy, C. Olston, J. Cho and S. Chakrabarti, Shuffling a stacked deck: the case for partially randomized ranking of search engine results, VLDB 2005.
Persson, Per, Espinoza, Fredrik, Fagerberg, Petra, Sandin, Anna and Coester, Rickard, Geonotes: A Location-based Information System for Public Spaces, In Kristina Hoeoek, David Benyon and Alan Munro (eds.), Designing Information Spaces: The Social Navigation Approach, Springer, pp. 151-173, 2002.
Ritter, Mike, personal communication.
Salganik et al., 2006; Experimental Study of Inequality and Unpredictability in an Artificial Cultural Market, Science 311:854.
Stryker, S., Statham, A., 1985. Symbolic interaction and role theory. In: Lindzey, G., Aronson, E. (Eds.), The Handbook of Social Psychology, Vol. 1, 3rd Edition. Random House, New York, pp. 311-378.
James Surowiecki, 2004: The Wisdom of Crowds.
Radicati. Business Social Software Market, 2007-2011. Radicati Group, radicati.com, 2007.
Rantanen, Matti, Oulasvirta, Antti, Blom, Jan, Tiitta, Sauli and Maentylae, Martti, InfoRadar: Group and Public Messaging in the Mobile Context, Proceedings of NordiCHI 2004, October 2004, Tampere, Finland.
Ridgeway, C., Walker, H.A., 1995. Status structures. In: Cook, K., Fine, G., House, J. (Eds.), Sociological Perspectives on Social Psychology. Allyn & Bacon, Newton, MA.
Ruggles R. The State of the Notion: Knowledge Management in Practice. Calif. Manage. Rev., 40, 3 (1998), 80-89.
Sternberg R. J. Cognitive Conceptions of Expertise. Feltovich P. J., Ford K. M. and Hoffman R. R., eds. Expertise in Context: Human and Machine. MIT Press/AAAI Press, Cambridge, MA, 1997, 149-162.
Wilkinson, D. M. and Huberman B. A., Assessing the value of cooperation in Wikipedia, First Monday 12 (4), 2007.
Wolfers and Zitzewitz, 2006; Five Open Questions about Prediction Markets, Federal Reserve Bank of San Francisco working paper.
Wu F. and Huberman B. A. Novelty and Collective Attention. Proc. Nat'l Acad. Sci., 104, 45 (2007), 17599-17601.
Wu F. and Huberman B. A. How public opinion forms, January 25, 2008.
Young, G., The Web 2.0 Buyer Profile: 2008, Forrester, February 6, 2008.