9 Large Network



Study of Internet

Study of Large Networks

6CCS3INS Internet Systems

2014-15 Toktam Mahmoodi, Department of Informatics, KCL
Outline

 Study of Networks
 Measurement on a real network
 Random Graphs
 Small World phenomena
 Power Law Distribution and Preferential attachment
 Information flow and epidemics
Large networks
Study of Networks

 Empirical: Study network data to find organisational principles

 How do we measure and quantify networks?

 Mathematical models: Graph theory, statistical models

 Models allow us to understand behaviors and distinguish surprising from
expected phenomena

 Algorithms for analysing graphs

 Hard computational challenges

 Historical study of networks with mathematical graph theory

 One of the pillars of discrete mathematics
 Started with Euler’s celebrated 1735 solution of the Königsberg bridge
problem.
Properties of Network

 While a small network can be visualised directly by its

graph (N, g ), larger networks can be more difficult to
envision and describe.

 Therefore, we define a set of summary statistics or

quantitative performance measures to describe and
compare networks (focus on undirected graphs):

 Degree distributions
 Distance
 Diameter and average path length
 Clustering Coefficient
Properties of Network: Degree distribution

 Degree distribution, P(k)

 Probability that a randomly chosen node has degree k
 N(k) = no. of nodes with degree k
 P(k) = N(k) /n

 Distance
 between a pair of nodes is defined as the number of edges along
the shortest path connecting the nodes.
 In directed graphs
paths need to follow the direction of the arrows
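The two definitions above can be sketched in a few lines of Python. This is a minimal illustration, assuming the graph is stored as a dictionary that maps each node to the set of its neighbours; the names `degree_distribution`, `distance` and the toy graph are hypothetical and not part of the lecture.

```python
from collections import Counter, deque


def degree_distribution(adj):
    """Return {k: P(k)} with P(k) = N(k) / n for an undirected graph."""
    n = len(adj)
    counts = Counter(len(neighbours) for neighbours in adj.values())  # N(k)
    return {k: count / n for k, count in sorted(counts.items())}


def distance(adj, source, target):
    """Number of edges on a shortest path (breadth-first search).

    Returns None if target cannot be reached from source.
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return dist[u]
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return None


# Hypothetical toy graph: path 0-1-2-3 plus the extra edge 1-3.
toy = {0: {1}, 1: {0, 2, 3}, 2: {1, 3}, 3: {1, 2}}
```

For a directed graph the same search applies if `adj[u]` lists only the nodes that u points to, so that paths follow the direction of the arrows.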
Properties of Network: Diameter

 Let h(i, j) denote the length of the shortest path

between node i and j (or the distance between i and j).
 The diameter of a network is the largest distance between any
two nodes in the network:
diameter = $\max_{i,j} h(i, j)$
 The average path length is the average distance
between any two nodes in the network
average path length = $\frac{\sum_{i>j} h(i,j)}{n(n-1)/2}$

 Average path length is bounded from above by the diameter; in

some cases, it can be much shorter than the diameter.
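As a rough sketch, both quantities can be computed by running a breadth-first search from every node. The code assumes a connected, undirected graph stored as a dictionary of neighbour sets; the function names and the 5-node path are illustrative assumptions.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Shortest-path distance h(source, v) for every reachable v."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter_and_average_path_length(adj):
    """Return (max h(i, j), mean h(i, j)) over all unordered pairs i != j.

    Assumes the graph is connected, so every distance is defined.
    """
    nodes = list(adj)
    n = len(nodes)
    all_dist = {u: bfs_distances(adj, u) for u in nodes}
    pair_dists = [all_dist[u][v] for u, v in combinations(nodes, 2)]
    return max(pair_dists), sum(pair_dists) / (n * (n - 1) / 2)


# Hypothetical example: a path on 5 nodes, 0-1-2-3-4.
path5 = {i: {j for j in (i - 1, i + 1) if 0 <= j < 5} for i in range(5)}
```

On the 5-node path the diameter is 4 while the average path length is 2, which illustrates that the average can be well below the diameter.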
Properties of Network: Clustering Coefficient

 What portion of i’s neighbors are connected to each other?

 Node i with degree $k_i$: $C_i = \frac{2 e_i}{k_i (k_i - 1)}$

 where $e_i$ is the number of edges between the neighbors of node i

 Average clustering: $C = \frac{1}{n} \sum_{i=1}^{n} C_i$
Measurement on a real network

 MSN Messenger activity in June 2006:

 150 GB/day (compressed)
 4.5 TB/month
 245 million users logged in
 180 million users engaged in conversations
 More than 30 billion conversations
 More than 255 billion exchanged messages
MSN Network: Degree Distribution (log-log plot)
MSN Network: Clustering

Avg. clustering of the MSN: C = 0.1140
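An average clustering value such as the one reported for MSN can be computed for any graph with the standard definition $C_i = 2 e_i / (k_i (k_i - 1))$, where $e_i$ counts the edges among the neighbours of node i. The sketch below assumes a dictionary of neighbour sets and, as a convention chosen here, sets $C_i = 0$ for nodes with fewer than two neighbours; the toy graph is hypothetical.

```python
from itertools import combinations


def local_clustering(adj, i):
    """C_i = 2 * e_i / (k_i * (k_i - 1)); 0 by convention when k_i < 2."""
    k = len(adj[i])
    if k < 2:
        return 0.0
    # e_i: number of edges that connect two neighbours of i.
    e = sum(1 for u, v in combinations(adj[i], 2) if v in adj[u])
    return 2 * e / (k * (k - 1))


def average_clustering(adj):
    """Mean of C_i over all nodes."""
    return sum(local_clustering(adj, i) for i in adj) / len(adj)


# Hypothetical toy graph: triangle 0-1-2 with a pendant node 3 attached to 2.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
```

In the toy graph, nodes 0 and 1 have clustering 1, node 2 has 1/3 (one edge among three neighbours) and node 3 has 0, giving an average of 7/12.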
MSN Network: Diameter

 Avg. path length: 6.6

 90% of the people can be reached in < 8 hops
Random Graphs

 We use the notation Gnp to denote the undirected Erdős-Rényi
graph, a simple random graph model.
 Undirected graph with n nodes
 Each edge (u, v) is formed with probability p ∈ (0, 1)
independently of every other edge (i.i.d.).
 n and p don’t uniquely determine the graph.
 We can have many different realisations for the same n and p.
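How likely is a graph with E edges?

 The probability that a given Gnp produces a graph with
exactly E edges is P(E).

 P(E) follows the Binomial distribution:
$P(E) = \binom{E_{max}}{E} p^{E} (1-p)^{E_{max}-E}$, where $E_{max} = n(n-1)/2$

 Number of successes in a sequence of $E_{max}$ independent
binary (yes/no) experiments, one per possible edge.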
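A minimal sketch of the model: one function draws a realisation of Gnp, another evaluates the binomial probability of seeing exactly E edges among the n(n-1)/2 possible ones. All names are illustrative assumptions, and the sample sizes are arbitrary.

```python
import random
from itertools import combinations
from math import comb


def gnp(n, p, rng):
    """One realisation of Gnp: each possible edge appears independently with probability p."""
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def edge_count(adj):
    """Each undirected edge is stored twice, once per endpoint."""
    return sum(len(neighbours) for neighbours in adj.values()) // 2


def prob_exactly_e_edges(n, p, e):
    """Binomial probability of exactly e edges out of n(n-1)/2 possible ones."""
    e_max = n * (n - 1) // 2
    return comb(e_max, e) * p**e * (1 - p) ** (e_max - e)


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [edge_count(gnp(30, 0.1, rng)) for _ in range(200)]
mean_edges = sum(samples) / len(samples)  # expected value is p * n(n-1)/2 = 43.5
```

The same n and p give a different graph on each call, which is why the edge counts in `samples` vary around the expected value.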
MSN Vs Random Graphs

 Degree distribution

 Clustering Coefficient

 Connected component 99% almost there

Real Network Vs Random Graphs

 Are real networks random graphs?

 The answer is simply NO!

 If Gnp is wrong, why did we spend time on it?

 It is the reference model for our analysis.
 It will help us calculate many quantities, that can then be
compared to the real data.
 It will help us understand to what degree a particular property is
the result of some random process.

 While Gnp is WRONG, it turns out to be extremely useful.
Small World phenomena

 Origins of the small-world idea: the Bacon number

 Create a network of Hollywood actors
 Connect two actors if they co-appeared in a movie.

 Bacon number: number of steps to Kevin Bacon

 As of Dec 2007, the highest (finite) Bacon number
reported is 8
 Only approx. 12% of all actors cannot be linked to
Bacon

 A recent study has shown that Christopher Lee is in fact
the actual center of the movie universe.
Small World Properties

 Small diameter
 High clustering
Small World phenomena

 Taking a connected graph and adding a very small

number of edges randomly, the diameter tends to drop
drastically.
 This is known as the small world phenomenon.

 Short-term memory uses small world networks between

neurons to remember this sentence.

 In modern mathematics, the center of the network of co-

authorship is considered to be P. Erdős,
 resulting in the so-called Erdős number.
 Erdős numbers are small!
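The effect of a few random edges on the diameter can be tried out with a small experiment. The sketch below is an assumption-laden illustration, not the lecture's own example: it starts from a ring lattice (each node linked to its two nearest neighbours on each side), measures the diameter, adds ten random shortcut edges and measures again. The sizes and the seed are arbitrary.

```python
import random
from collections import deque


def ring_lattice(n, k):
    """Ring of n nodes, each linked to its k nearest neighbours on each side."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k + 1):
            u = (v + step) % n
            adj[v].add(u)
            adj[u].add(v)
    return adj


def diameter(adj):
    """Largest shortest-path distance, via BFS from every node (connected graph assumed)."""
    best = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        best = max(best, max(dist.values()))
    return best


def add_random_edges(adj, count, rng):
    """Add `count` new edges between uniformly chosen, not yet connected node pairs."""
    nodes = list(adj)
    added = 0
    while added < count:
        u, v = rng.sample(nodes, 2)
        if v not in adj[u]:
            adj[u].add(v)
            adj[v].add(u)
            added += 1


rng = random.Random(1)
graph = ring_lattice(200, 2)
before = diameter(graph)  # 50: the opposite side of the ring is 100 positions away
add_random_edges(graph, 10, rng)
after = diameter(graph)   # can only stay equal or shrink; typically much smaller
```

Adding edges can never increase a distance, so `after <= before` always holds; how large the drop is depends on where the random shortcuts land.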
Small World Experiment

 What is the typical shortest path length between any two

people?
 Experiment on the global friendship network
 Can’t measure, need to probe explicitly

 Small-world experiment [Milgram ’67]

 Picked 300 people in Omaha, Nebraska and Wichita, Kansas
 Asked them to get a letter to a stockbroker in Boston by passing
it to somebody they know who they think could be closer
to the broker.
 How many steps did it take?
Small World Experiment: 6 degrees of separation

 64 letters reached the target

 It took 6.2 steps on average

 Further observations:
 People who owned stock had shorter paths to the stockbroker
than random people
 People from the Boston area had even shorter paths: 4.4 steps
Criticism of the Milgram Experiment

 31 of 64 chains passed through 1 of 3 people as their

final step
 Not all links/nodes are equal
 Starting points and the target were non-random
 People in the experiment follow some strategy (e.g.,
geographic routing) instead of forwarding the letter to
everyone.
 They are not finding the shortest path!
 There are not many samples (only 64)
 People might have used extra information resources
Another Small World Experiment

 In 2003 Dodds, Muhamad and Watts performed the

experiment using e-mail:
 18 targets of various backgrounds
 24,000 first steps (~1,500 per target)
 65% dropout per step
 384 chains completed (1.5% of emails reached the target)
 Average path length 4.01

After correcting for dropout (longer chains are less likely to
complete), the average path length is ~ 7
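Why dropout biases the observed path length can be seen with a back-of-the-envelope calculation. The sketch assumes, as a simplification that is not the study's actual correction method, that every hop is dropped independently with the reported 65% probability; the function name is hypothetical.

```python
def completion_probability(path_length, dropout_per_step):
    """Chance that a chain survives `path_length` hops with independent per-hop dropout."""
    return (1 - dropout_per_step) ** path_length


observed_completion = 384 / 24000                   # 0.016, from the figures above
four_hop_chain = completion_probability(4, 0.65)    # 0.35 ** 4, about 0.015
seven_hop_chain = completion_probability(7, 0.65)   # 0.35 ** 7, about 0.0006
```

Under this assumption a 7-hop chain is more than twenty times less likely to finish than a 4-hop chain, so completed chains over-represent short paths and the raw average of 4.01 underestimates the typical distance.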
Degree Distribution

 Degree distribution in a random graph

 P(k) is binomial, so its tail decays exponentially in k

 Observation in real networks:
 Power Law
Degree Distribution
Node Degrees: Internet Autonomous system

[Faloutsos3,1999]
Node Degrees: Web

[Broder et al., 2000]

Node Degrees: other networks

[Barabasi, Albert, 1999]

Power-law degree exponent

 Power-law degree exponent is typically 2 < α < 3

 Web graph: αin = 2.1, αout = 2.4
[Broder et al. 00]
 Autonomous systems: α = 2.4
[Faloutsos3, 99]
 Actor-collaborations: α = 2.3
[Barabasi-Albert 00]
 Citations to papers: α ≈ 3
[Redner 98]
 Online social networks: α ≈ 2
[Leskovec et al. 07]
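Exponents like these are estimated from data. One common approach, shown here as a sketch rather than as the method used in the cited papers, is the maximum-likelihood estimate $\hat{\alpha} = 1 + n / \sum_i \ln(x_i / x_{min})$ for a continuous power law. The code treats degrees as continuous values, which is an approximation, and tests the estimator on synthetic samples; all names are illustrative.

```python
import math
import random


def sample_power_law(alpha, x_min, size, rng):
    """Inverse-transform samples from the density p(x) proportional to x**(-alpha), x >= x_min."""
    # 1 - rng.random() lies in (0, 1], which avoids raising 0 to a negative power.
    return [x_min * (1 - rng.random()) ** (-1 / (alpha - 1)) for _ in range(size)]


def estimate_alpha(samples, x_min):
    """Maximum-likelihood exponent using only the samples at or above x_min."""
    tail = [x for x in samples if x >= x_min]
    return 1 + len(tail) / sum(math.log(x / x_min) for x in tail)


rng = random.Random(42)
synthetic = sample_power_law(alpha=2.4, x_min=1.0, size=20000, rng=rng)
alpha_hat = estimate_alpha(synthetic, x_min=1.0)  # should land close to 2.4
```

Fitting a straight line to a log-log histogram is the more visual alternative, but it is sensitive to how the bins are chosen.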
Scale-Free network

 Networks with a power law tail in their degree
distribution are called “scale-free networks”.

 The name comes from the scale-invariance property.

 Scale-free function: $f(ax) = a^{\lambda} f(x)$

 Power law function: $f(x) = x^{\lambda}$, so $f(ax) = a^{\lambda} x^{\lambda} = a^{\lambda} f(x)$

Power Laws are Everywhere
Mathematics of Power Law

 Above a certain x value, Power law is always higher

than the exponential.
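The claim follows from comparing growth rates. As a short worked step, assuming a power law $c\,x^{-\alpha}$ and an exponential $e^{-\lambda x}$ with positive constants (the symbols $c$ and $\lambda$ are introduced here for illustration):

```latex
\[
\lim_{x \to \infty} \frac{c\,x^{-\alpha}}{e^{-\lambda x}}
  = \lim_{x \to \infty} c \exp\bigl(\lambda x - \alpha \ln x\bigr)
  = \infty
  \qquad (c, \alpha, \lambda > 0)
\]
```

Since $\lambda x$ grows faster than $\alpha \ln x$, the ratio eventually exceeds 1 and stays there, so beyond some point the power law lies above the exponential. This is the heavy tail that makes very high-degree nodes far more likely than in a random graph.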
Random vs. Scale-Free Network
Preferential attachment

 Nodes arrive in order 1,2,…,n

 At step j, let di be the degree of node i < j
 A new node j arrives and creates m out-links
 Prob. of j linking to a previous node i is proportional to
the degree of node i, that is di.

[Price ‘65, Albert-Barabasi ’99, Mitzenmacher ‘03]
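The growth rule can be simulated directly. The sketch below makes several assumptions that the slide does not state: links are treated as undirected when counting degree, the process starts from a small fully connected group of m + 1 nodes, and each new node picks m distinct targets (redrawing duplicates, which slightly distorts exact proportionality). The function name is hypothetical.

```python
import random


def preferential_attachment(n, m, rng):
    """Grow a graph to n nodes; each new node links to m earlier nodes,
    chosen with probability proportional to their current degree."""
    # Seed graph (assumption): a clique on m + 1 nodes, so every degree is positive.
    edges = [(u, v) for u in range(m + 1) for v in range(u)]
    degree = {v: m for v in range(m + 1)}
    # Each node appears in `stubs` once per unit of degree, so a uniform
    # draw from the list is a degree-proportional draw over nodes.
    stubs = [v for edge in edges for v in edge]
    for j in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        degree[j] = 0
        for i in targets:
            edges.append((j, i))
            degree[i] += 1
            degree[j] += 1
            stubs.extend((i, j))
    return degree, edges


degree, edges = preferential_attachment(n=500, m=2, rng=random.Random(0))
```

Early nodes tend to accumulate the highest degrees, which is the rich-get-richer effect behind the power-law tail.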

Rich get Richer

 New nodes are more likely to link to nodes that already

have high degree

 Examples:
 Citation: New citations to a paper are proportional to the number
it already has.
Spreading through networks

 Behaviors that cascade from node to node like an

epidemic

 Examples:
 Biological:
 Diseases via contagion
 Technological:
 Cascading failures
 Spread of information
 Social:
 Rumors, news
Diffusion Model

 Probabilistic models
 Models of influence or disease spreading
 Example: You “catch” a disease with some probability from each
active neighbour in the network

 Decision based models

 Models of product adoption, decision making
 A node observes decisions of its neighbours and makes its own
decision
 Example: You watch a movie if k of your friends told you about it.
Decision Based Model of Diffusion

 Example Scenario:
 Assume a network where everyone initially chooses action B
 A small set S of nodes has chosen A
 If more than 50% of one’s friends have chosen A, one will also
change their action to A.
 The threshold level for adopting A is set as q = 1/2
Example Scenario
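The scenario can be replayed in code. This is a minimal sketch under stated assumptions: updates happen in synchronous rounds, a node that has adopted A never switches back, and a node adopts A when the fraction of its neighbours holding A is strictly greater than q. The function name and the 10-node path are hypothetical.

```python
def run_cascade(adj, seeds, q):
    """Return the final set of A-adopters when everyone else starts with B."""
    adopters = set(seeds)
    while True:
        # Decide all switches for this round from the current state.
        switching = {
            v
            for v in adj
            if v not in adopters
            and adj[v]
            and len(adj[v] & adopters) / len(adj[v]) > q
        }
        if not switching:
            return adopters
        adopters |= switching


# Hypothetical example: a finite path 0-1-...-9 with the two middle nodes seeded.
path10 = {i: {j for j in (i - 1, i + 1) if 0 <= j < 10} for i in range(10)}
spread = run_cascade(path10, seeds={4, 5}, q=0.4)   # every node ends up with A
stalled = run_cascade(path10, seeds={4, 5}, q=0.5)  # nobody beyond the seeds switches
```

On the path each undecided node next to the adopters sees exactly half of its neighbours holding A, so the cascade runs when q is below 1/2 and stalls at q = 1/2.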
Network Cascade

 Consider infinite graph G

 each node has finite number of neighbours

 We say that a finite set S causes a cascade in G with

threshold q if, when S adopts A, eventually every node in G
adopts A.

 The “cascade capacity” of a graph G is the largest q for which

some finite set S can cause a cascade.

 Fact: There is no (infinite) G where cascade capacity > ½ .

 Proof idea: Suppose such G exists: q>½, finite S causes cascade.
 Show contradiction: Argue that nodes stop switching after a finite #
of steps.
Examples of infinite graphs

 Infinite Path: If q<1/2 then cascade occurs

 Infinite Tree: If q<1/3 then cascade occurs

 Infinite Grid: If q<1/4 then cascade occurs
Food for thought

 Stopping Cascade
 Let S be an initial set of adopters of A
 All nodes apply threshold q to decide whether to switch to A
 What prevents cascades from spreading?
Diffusion Model

 Probabilistic models
 Models of influence or disease spreading
 Example: You “catch” a disease with some probability from each
active neighbour in the network

 Decision based models

 Epidemic Model based on Random Trees.

 A patient meets d other people

 The patient infects each of them with probability q > 0
 Question is: for which values of d and q does the epidemic run
forever?
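Epidemic

 Let $p_h$ = probability that there is an infected node at
depth h of the tree.
 Epidemic will die out if $\lim_{h \to \infty} p_h = 0$

 Recurrence for $p_h$ on the tree: $p_h = 1 - (1 - q \cdot p_{h-1})^d$

 $\lim_{h \to \infty} p_h$ is the result of iterating $f(x) = 1 - (1 - q \cdot x)^d$
Epidemic

 Start the iteration at x = 1, since $p_1 = 1$.

 For the epidemic to die out we need f(x) to be below
y=x, which holds when $f'(0) = q \cdot d < 1$.

 $q \cdot d$ = expected number of people that we infect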
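The die-out condition can be checked numerically. The sketch assumes the standard recurrence for this tree model, $p_h = 1 - (1 - q \cdot p_{h-1})^d$ with $p_1 = 1$, and simply iterates it; the function name and the parameter values are illustrative choices.

```python
def infection_probability_at_depth(q, d, depth):
    """Iterate p_h = 1 - (1 - q * p_{h-1}) ** d, starting from p_1 = 1."""
    p = 1.0
    for _ in range(depth - 1):
        p = 1 - (1 - q * p) ** d
    return p


dies_out = infection_probability_at_depth(q=0.2, d=3, depth=200)  # q*d = 0.6 < 1
persists = infection_probability_at_depth(q=0.5, d=3, depth=200)  # q*d = 1.5 > 1
```

With q·d = 0.6 the probability shrinks towards 0, while with q·d = 1.5 it settles at a positive fixed point of f (here 3 − √5 ≈ 0.764), so the epidemic survives with positive probability.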

Spreading Models of viruses
General Epidemic Model
SIR Model
SIS Model
Epidemic Threshold
Epidemic Threshold in SIS model
Experiment
Independent Cascade model
Exposure and Adaptation
Exposure Curve
Example Application
Diffusion in Viral Marketing
Small Experiment

 Gephi
 Exploratory data analysis and visualisation tool for graphs and
networks

 Available data sets to work with

 Movie Ratings: IMDb data set
 http://www.imdb.com/interfaces
 Your own Facebook data
 Log in to your Facebook account and search for Netvizz
Reading

 Networks, Crowds, and Markets: Reasoning about a Highly

Connected World
 Chapter 18 on Power Laws
 Chapter 20 on Small World Phenomena
 Chapters 19 & 21 on epidemics
