Social Networks and Social Information Filtering on Digg∗

Kristina Lerman‡
University of Southern California
Information Sciences Institute
4676 Admiralty Way
Marina del Rey, California 90292
[email protected]

∗A full version of this paper is available at arxiv.org/abs/cs.HC/0612046
‡This research is based on work supported in part by the National Science Foundation under Award Nos. IIS-0535182 and IIS-0413321. We are grateful to Dipsy Kapoor for helping with data analysis, and to Fetch Technologies (http://www.fetch.com) for providing wrapper building and execution tools.
ICWSM'2007 Boulder, Colorado, USA

Abstract
The new social media sites (blogs, wikis, Flickr and Digg, among others) underscore the transformation of the Web to a participatory medium in which users are actively creating, evaluating and distributing information. Digg is a social news aggregator which allows users to submit links to, vote on and discuss news stories. Each day Digg selects a handful of stories to feature on its front page. Rather than rely on the opinion of a few editors, Digg aggregates opinions of thousands of its users to decide which stories to promote to the front page. Digg users can designate other users as "friends" and easily track friends' activities: what new stories they submitted, commented on or read. The friends interface acts as a social filtering system, recommending to a user the stories his or her friends liked or found interesting. By tracking the votes received by newly submitted stories over time, we showed that social filtering is an effective information filtering approach. Specifically, we showed that (a) users tend to like stories submitted by friends and (b) users tend to like stories their friends read and liked. Social filtering is a promising new technology that can be used to personalize and tailor information to individual users: for example, through personal front pages.

Keywords
Social Network analysis; collaborative filtering; social filtering

1. Introduction
Many Web sites that provide information (or sell products or services) use collaborative filtering technology to suggest relevant documents (or products and services) to their users. Collaborative filtering-based recommendation systems [1] try to find users with similar interests by comparing their opinions about products. They will then suggest new products that were liked by other users with similar opinions. Recommender systems based on social filtering, on the other hand, suggest new products or documents simply based on whether the user's designated friends found these products or documents interesting. Researchers in the past have recognized that social networks present in the user base of the recommender system can be induced from the explicit and implicit declarations of user interest, and that these social networks can in turn be used to make new recommendations [2].
The new social media sites, such as the social news aggregator Digg,¹ allow users to explicitly build social networks by designating others as friends. Tracking activities of friends is a common feature in many social media sites and is one of the major draws attracting users to these sites. It offers a new paradigm for interacting with information: social filtering. Rather than actively searching for new interesting content, or subscribing to a set of predefined topics, users can now put other people to the task of finding and filtering information for them. We show that social networks are being used on Digg for social filtering. Specifically, we show that Digg users tend to be interested in the news stories their friends find interesting. Although social filtering, as practiced by Digg, has recently come under fire for being susceptible to "gaming," we believe it to be a promising technology that will lead to a new generation of personalization and recommendation algorithms.
¹ http://digg.com/technology

2. Structure of Digg
Digg's functionality is very simple: users submit links to stories they find online, and other users vote on these stories. When a story gets enough positive votes, or diggs, it is promoted to the front page. The front page is what users see on the Digg home page, while the newly submitted stories are less visible, being "hidden" in the Upcoming stories pages. Digg also allows users to designate other users as friends and makes it easy to track friends' activities. A section of Digg's home page summarizes the number of stories the friends have submitted, commented on or liked recently.
Each day Digg selects a handful of stories to feature on its heavily trafficked front page. Although the exact formula for how a story is promoted to the front page is kept secret, so as to prevent users from "gaming the system," it appears to take into account the number of diggs a story gets and the rate at which it gets them. The mechanism by which the stories are promoted, therefore, does not depend on the decision of one or a few editors, but emerges from the activities of many users.
In order to study the role of social networks in filtering, we tracked both new and front page stories in the technology category. We collected data in May 2006 by scraping the Digg site with the help of Web wrappers, created with tools provided by Fetch Technologies. We extracted 195 front page stories. For each story, we extracted the submitter's name, story title, time submitted, number of diggs the story received and the list of the first 216 users who dugg the story (15,742 unique users total). We also collected information about the top 1020 ranked users. For each user, we extracted the list of friends and reverse friends or "people who have befriended this user."
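The two per-user lists described above are related by a simple inversion: a user's reverse friends are exactly the users whose friends lists contain that user. The paper reports scraping both lists directly, so the following is only a minimal sketch of that relationship; the function name, the dictionary layout and the user names are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def reverse_friends(friends):
    """Invert a mapping of user -> set of users that user designated as friends.

    The result maps each befriended user to the set of users who befriended
    them (their "reverse friends"). Users nobody befriended are absent.
    """
    reverse = defaultdict(set)
    for user, befriended in friends.items():
        for friend in befriended:
            reverse[friend].add(user)
    return dict(reverse)


# Hypothetical toy data: alice befriends bob and carol, bob befriends carol.
friends = {
    "alice": {"bob", "carol"},
    "bob": {"carol"},
    "carol": set(),
}
fans = reverse_friends(friends)
print(sorted(fans["carol"]))
```

In this toy data carol has two reverse friends and alice has none, which mirrors the asymmetry of the Digg friend relationship: designating someone as a friend does not make them your friend in return.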
3. Social filtering on Digg
To show that Digg users take advantage of the Friends interface to filter the tremendous number of new submissions, we analyze two sub-claims: (a) users digg stories their friends submit, and (b) users digg stories their friends digg.
Note that the "friend" relationship is not symmetric: if user A designates user B as a friend, user A can keep track of user B's activities, but not vice versa. This makes A the reverse friend of B. Figure 1 shows the scatter plot of the number of friends vs reverse friends of the top 1020 Digg users as of May 2006. Black symbols correspond to the top 33 users. For the most part, users appear to take advantage of Digg's social networking feature, with the top users having bigger social networks.

[Figure 1: scatter plot on logarithmic axes. Horizontal axis: number friends+1. Vertical axis: number reverse friends+1.]
Fig. 1: Scatter plot of the number of friends vs reverse friends for the top 1020 Digg users. Two of the biggest celebrities, kevinrose and diggnation, are marked a and b

3.1 Users digg stories their friends submit
We compare the list of users who dugg the story, or any portion of it, with the list of reverse friends of the submitter. Submitter's name is the first on the list. Figure 2 shows the number of diggers who are also among the reverse friends of the submitter. The dashed line shows the size of the social network (number of reverse friends) of the submitter. More than half of the stories (102) were submitted by users with one or more reverse friends, and the rest by unknown users. We use simple combinatorics to compute the probability that k of the submitter's friends could have dugg the story purely by chance. The probability that after picking n = 215 users randomly from a pool of N = 15,742 you end up with k that came from a group of size K is $P(k, n) = \binom{n}{k} p^k (1 - p)^{n-k}$, where $p = K/N$. Using this formula, the probability (averaged over stories dugg by at least one friend) that the observed numbers of friends dugg the story by chance is P = 0.005, making it highly unlikely.

[Figure 2: plot over stories (sorted, 1 to 195). Left axis: number friends who dugg story, with curves for "in all diggs" and "in first 25 diggs". Right axis: number reverse friends.]
Fig. 2: Number of diggers who are also among the reverse friends of the user who submitted the story

Moreover, users digg stories submitted by their friends very quickly. The heavy solid line in Figure 2 shows the number of reverse friends who were among the first 25 diggers. The probability that these numbers could have been observed by chance is even less: P = 0.003. We conclude that users digg stories their friends submit. A consequence of this conclusion is that users with active and large social networks are more successful in getting their stories promoted to the front page. We believe that this explains the success of top users.

3.2 Users digg stories their friends digg
Do social networks also help users discover interesting stories that were submitted by unknown users? In other words, do users digg stories their friends like?
We looked at the 25 diggs that came after the first m diggs to see how many came from friends of the first m diggers. Of the stories posted by "unknown" users, ten were dugg by submitter's reverse friends (p = 0.005). After five more diggs (m = 6), 75 became visible to others through the friends interface, and of these 23 (p = 0.028) were dugg by friends. After 15 users dugg the story, 94 were visible and 37 (p = 0.060) were dugg by friends. After 25 diggs, all 96 stories were visible, and almost half of these were dugg by friends (p = 0.077). The probabilities that this many friends could have dugg the story by chance are above the 0.05 significance level after 25 diggs, possibly reflecting the story's increased visibility on the front page. Although the effect is not quite as dramatic as the one in the previous section, we believe that the data shows that users do use the friends interface to find new interesting stories.

References
[1] J. A. Konstan, B. N. Miller, D. Maltz, J. L. Herlocker, L. R. Gordon, and J. Riedl. GroupLens: Applying collaborative filtering to Usenet news. Communications of the ACM, 40(3):77–87, 1997.
[2] S. Perugini, M. André Gonçalves, and E. A. Fox. Recommender systems research: A connection-centric survey. Journal of Intelligent Information Systems, 23(2):107–143, September 2004.
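
As an illustration of the chance calculation in Section 3.1, the following minimal sketch counts how many diggers of a story are among the submitter's reverse friends and evaluates the binomial expression $P(k, n) = \binom{n}{k} p^k (1 - p)^{n-k}$ with $p = K/N$. Only n = 215 and N = 15,742 come from the text; the function names and the values K = 50 and k = 5 are hypothetical, chosen purely for illustration, and this is not the analysis code used for the reported results.

```python
from math import comb


def friend_diggs(diggers, submitter_reverse_friends):
    """Count diggers who are also reverse friends of the submitter."""
    return sum(1 for user in diggers if user in submitter_reverse_friends)


def chance_probability(k, n, K, N):
    """Binomial probability that exactly k of n randomly picked users
    come from a group of size K within a pool of N users (p = K / N)."""
    p = K / N
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# n and N are the values given in Section 3.1; K and k are assumptions.
n, N = 215, 15742
K, k = 50, 5
print(f"P(k={k}, n={n}) = {chance_probability(k, n, K, N):.3g}")

# Hypothetical digger list and reverse-friend set for one story.
print(friend_diggs(["u1", "u2", "u3"], {"u2", "u3", "u9"}))
```

Because K is small relative to N, even a handful of reverse friends among the first 215 diggers is improbable under random picking, which is the intuition behind the low probabilities reported above.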
