Page Rank Algorithm
CS-411
11/12/24
The PageRank algorithm, also called the Google algorithm, was introduced by Larry Page, one of the
founders of Google.
It was first used to rank web pages in the Google search engine.
Nowadays, it is increasingly used in many other fields, for example to rank users in social
media.
What is fascinating with the PageRank algorithm is how to start from a complex problem and end up
The web can be represented as a directed
graph in which nodes represent the web
pages and edges represent the links between them.
Typically, if a node (web page) i is linked to
a node j, it means that i refers to j.
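A minimal sketch of this representation, assuming a hypothetical web of four pages (the page numbers and links are invented for illustration and are not the graph shown on the slide). Each page stores the list of pages it refers to, and a small helper inverts that list to find which pages refer to a given page:

```python
# Hypothetical web of 4 pages; out_links[i] lists the pages that page i refers to.
out_links = {
    0: [1, 2],
    1: [2],
    2: [0],
    3: [0, 2],
}

def in_links(graph):
    """Invert the adjacency list: for each page j, collect the pages i that link to j."""
    incoming = {node: [] for node in graph}
    for i, targets in graph.items():
        for j in targets:
            incoming[j].append(i)
    return incoming

incoming = in_links(out_links)
print(incoming[2])  # pages that refer to page 2
```

Storing only the existing links keeps the structure small, which matters because most pages link to very few of the other pages.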
We have to define what the importance of a web page is.
As a first approach, it is the total number of web pages that refer to it.
If we stop at this criterion, the importance of the web pages that refer to it is not taken into account.
In other words, a link from an important web page and a link from a less important one have the same weight.
Another approach is to let a web page spread its importance equally among all the web pages it links to.
By doing that, we can then define the score of a node j as follows:
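In the usual notation (assumed here: r_j is the score of node j, d_i is the number of outgoing links of node i, and the sum runs over the pages i that link to j), this rule reads:

```latex
r_j = \sum_{i \to j} \frac{r_i}{d_i}
```

Each page i that links to j passes on an equal share r_i / d_i of its own score.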
From the graph, we can write this linear system:
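As an illustration only, take a hypothetical three-page graph (0 → 1, 0 → 2, 1 → 2, 2 → 0; not the graph from the slide). The score rule gives r0 = r2, r1 = r0/2 and r2 = r0/2 + r1, and the scores are normalized so that r0 + r1 + r2 = 1. A minimal sketch that solves this small system by Gauss elimination with exact fractions (the helper `gauss_solve` is written for this example):

```python
from fractions import Fraction as F

def gauss_solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination (a is square and invertible)."""
    n = len(a)
    # Augmented matrix [a | b], copied so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick the first row with a non-zero pivot in this column and move it up.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row, then eliminate the column from every other row.
        m[col] = [x / m[col][col] for x in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] for i in range(n)]

# Two of the score equations written as "... = 0", plus the normalization,
# which replaces the redundant third score equation.
a = [
    [F(1), F(0), F(-1)],      # r0 - r2 = 0
    [F(-1, 2), F(1), F(0)],   # r1 - r0/2 = 0
    [F(1), F(1), F(1)],       # r0 + r1 + r2 = 1
]
b = [F(0), F(0), F(1)]
scores = gauss_solve(a, b)
print(scores)
```

The elimination touches every row for every column, so its cost grows roughly with the cube of the number of pages.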
But this solution is limited to small graphs.
Indeed, as graphs of this kind are sparse and Gauss elimination
Markov Chain and PageRank
The graph can be seen as a Markov chain
with the following transition matrix:
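A minimal sketch of how such a matrix could be built, assuming the column convention (entry P[j][i] is the probability of moving from node i to node j, that is 1/d_i when i links to j) and a hypothetical three-page graph rather than the one on the slide:

```python
def transition_matrix(out_links):
    """Column-stochastic matrix: P[j][i] = 1/d_i if i links to j, else 0.

    Assumes nodes are numbered 0..n-1 and every node has at least one out-link.
    """
    n = len(out_links)
    p = [[0.0] * n for _ in range(n)]
    for i, targets in out_links.items():
        for j in targets:
            p[j][i] = 1.0 / len(targets)
    return p

# Hypothetical graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
out_links = {0: [1, 2], 1: [2], 2: [0]}
P = transition_matrix(out_links)
for row in P:
    print(row)

# Each column sums to 1, so the transpose of P is row stochastic.
column_sums = [sum(P[j][i] for j in range(len(P))) for i in range(len(P))]
print(column_sums)
```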
The transpose of P is row stochastic, which is a condition for applying Markov chain theorems.
For the initial distribution, let’s take the uniform distribution π₀ = (1/n, …, 1/n), where n is the total number of
nodes.
At every step, the random walker jumps to another node according to the transition matrix. The
probability distribution is then recomputed at every step.
This distribution tells us where the random walker is likely to be after a certain number of steps.
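The update can be sketched as a matrix-vector product. The matrix below is a hypothetical column-stochastic example (graph 0 → 1, 0 → 2, 1 → 2, 2 → 0), and the uniform starting distribution is an assumption of this sketch:

```python
def step(p, dist):
    """One step of the walk: new_dist[j] = sum over i of P[j][i] * dist[i]."""
    n = len(dist)
    return [sum(p[j][i] * dist[i] for i in range(n)) for j in range(n)]

# Hypothetical column-stochastic matrix for the graph 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
P = [
    [0.0, 0.0, 1.0],
    [0.5, 0.0, 0.0],
    [0.5, 1.0, 0.0],
]
dist = [1 / 3, 1 / 3, 1 / 3]   # assumed uniform starting distribution
history = [dist]
for _ in range(3):
    dist = step(P, dist)
    history.append(dist)

for t, d in enumerate(history):
    print(t, [round(x, 4) for x in d])
```

Because every column of P sums to 1, each distribution in `history` still sums to 1.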
All we have to do is solve this equation:
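In the notation assumed here (π is the stationary distribution written as a column vector, and P is the transition matrix whose transpose is row stochastic), the stationary condition is commonly written as:

```latex
P\,\pi = \pi, \qquad \sum_{i} \pi_i = 1
```

That is, π is an eigenvector of P with eigenvalue 1, normalized so that its entries sum to 1.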
Frobenius-Perron theorem:
If A is a square, positive matrix (all its entries are positive),
then it has a positive eigenvalue r such that |λ| < r for every other
eigenvalue λ of A. The eigenvector v of A with eigenvalue r is positive and is,
up to scaling, the unique positive eigenvector.
To compute π, we use the power method, an iterative method for computing the
dominant eigenvector of a given matrix A.
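A minimal sketch of the power method for a column-stochastic matrix, assuming a uniform starting vector and an L1 stopping test (the function name, tolerance and example matrix are choices made for this illustration):

```python
def power_iteration(p, tol=1e-10, max_iter=1000):
    """Repeat dist <- P dist until two successive iterates differ by less than tol."""
    n = len(p)
    dist = [1.0 / n] * n   # assumed uniform starting vector
    for _ in range(max_iter):
        new = [sum(p[j][i] * dist[i] for i in range(n)) for j in range(n)]
        if sum(abs(a - b) for a, b in zip(new, dist)) < tol:
            return new
        dist = new
    return dist   # best estimate if the tolerance was not reached

# Hypothetical column-stochastic matrix for the graph 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
P = [
    [0.0, 0.0, 1.0],
    [0.5, 0.0, 0.0],
    [0.5, 1.0, 0.0],
]
pi = power_iteration(P)
print([round(x, 4) for x in pi])
```

The general power method rescales the vector after every multiplication. Here that step is omitted because a column-stochastic matrix keeps the sum of the entries equal to 1.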
Teleportation and Damping Factor
In the web graph, we can find a web page i which refers
only to web page j, while j refers only to i. This is what we
call the spider trap problem.
We can also find a web page which has no outlink. It is
Spider Trap: when the random walker reaches node 1 in the above example, it can
only jump to node 2, and from node 2 it can only reach node 1, and so on. The
importance of all other nodes will be absorbed by nodes 1 and 2. In the above example, the
probability distribution will converge to π = (0, 0.5, 0.5, 0). This is not the desired result.
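The effect can be reproduced on a small hypothetical graph. The four-node graph below is an assumption of this sketch, not the slide's example; it is built so that nodes 1 and 2 only point at each other:

```python
# Hypothetical 4-node graph with a spider trap between nodes 1 and 2:
# 0 -> 1, 0 -> 3, 1 -> 2, 2 -> 1, 3 -> 0, 3 -> 2.
# Column convention: P[j][i] is the probability of moving from node i to node j.
P = [
    [0.0, 0.0, 0.0, 0.5],
    [0.5, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 0.5],
    [0.5, 0.0, 0.0, 0.0],
]
dist = [0.25] * 4   # assumed uniform starting distribution
for _ in range(60):
    dist = [sum(P[j][i] * dist[i] for i in range(4)) for j in range(4)]

print([round(x, 6) for x in dist])
```

At every step half of the probability held by nodes 0 and 3 leaks into the trap and never comes back, so the distribution approaches (0, 0.5, 0.5, 0).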
Dead Ends: when the walker arrives at node 2, it can’t reach any other node because it
Teleportation
Teleportation consists of connecting each node of the graph to all other nodes.
The graph then becomes complete.
The idea is that, with probability β, the random walker jumps to another node according
to the transition matrix P and, with probability 1-β, it jumps to a node of the graph chosen
uniformly at random, so each node is reached this way with probability (1-β)/n. We then get
the new transition matrix R:
R = βP + (1-β) e vᵀ
where v is a vector of ones and e is a vector whose entries are all 1/n. β is commonly called the damping factor.
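A minimal sketch of this construction, assuming the column convention for P and β = 0.85 (a commonly used value, chosen here as an assumption). Every entry of e vᵀ equals 1/n, so the code adds (1-β)/n to each scaled entry of P; the example matrix is hypothetical:

```python
def with_teleportation(p, beta=0.85):
    """Return R with R[j][i] = beta * P[j][i] + (1 - beta) / n."""
    n = len(p)
    return [[beta * p[j][i] + (1.0 - beta) / n for i in range(n)] for j in range(n)]

# Hypothetical column-stochastic matrix with a spider trap between nodes 1 and 2.
P = [
    [0.0, 0.0, 0.0, 0.5],
    [0.5, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 0.5],
    [0.5, 0.0, 0.0, 0.0],
]
R = with_teleportation(P)
for row in R:
    print([round(x, 4) for x in row])

# Columns of R still sum to 1, and no entry is zero any more.
column_sums = [sum(R[j][i] for j in range(4)) for i in range(4)]
```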
By applying teleportation to our example, we get the following new transition matrix:
The matrix R has the same properties as P, which means that it admits a stationary distribution,
so we can use all the theorems we saw previously. In addition, for β < 1 every entry of R is at
least (1-β)/n > 0, so R is a positive matrix.
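Putting the pieces together, a minimal sketch of PageRank with teleportation, under the same assumptions as before (column convention, β = 0.85, uniform start, hypothetical graph with a spider trap). Since the distribution always sums to 1, multiplying by R is the same as computing βP·dist and adding (1-β)/n to every entry, so R never has to be stored:

```python
def pagerank(p, beta=0.85, tol=1e-12, max_iter=1000):
    """Power iteration on R = beta * P + (1 - beta) * e v^T without building R."""
    n = len(p)
    dist = [1.0 / n] * n   # assumed uniform starting distribution
    for _ in range(max_iter):
        new = [beta * sum(p[j][i] * dist[i] for i in range(n)) + (1.0 - beta) / n
               for j in range(n)]
        if sum(abs(a - b) for a, b in zip(new, dist)) < tol:
            return new
        dist = new
    return dist   # best estimate if the tolerance was not reached

# Hypothetical graph: 0 -> 1, 0 -> 3, 1 -> 2, 2 -> 1, 3 -> 0, 3 -> 2.
P = [
    [0.0, 0.0, 0.0, 0.5],
    [0.5, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 0.5],
    [0.5, 0.0, 0.0, 0.0],
]
scores = pagerank(P)
print([round(x, 4) for x in scores])
```

Nodes 1 and 2 still receive the highest scores, but nodes 0 and 3 now keep a non-zero share instead of being drained completely.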
Thank You!!
Dr. Bharat Singh
Email id—
[email protected]
Mobile No–
8707223885