Eigenvector Centrality and HITS Algorithm
Eigenvector Centrality: Revisited
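❑ The eigenvector centrality $x_v$ of a node $v$ in a network $G(V, E)$ is given by
$$x_v = \frac{1}{\lambda} \sum_{t \in N(v)} x_t = \frac{1}{\lambda} \sum_{t \in V} a_{vt}\, x_t$$
where $\lambda$ is the largest eigenvalue of the matrix $A = (a_{ij})$, the adjacency matrix of the network $G$, and $N(v)$ is the set of neighbors of $v$.
❑ In matrix form, $AX = \lambda X$, where $X$ is a column vector whose $v$-th entry is $x_v$, the eigenvector centrality of the node $v$.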
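The eigenvector relation above can be approximated numerically by power iteration. The following is a minimal sketch, not a definitive implementation: the function name `eigenvector_centrality`, the max-normalization, and the example graph are illustrative assumptions. It assumes an undirected, connected, non-bipartite graph, because plain power iteration can oscillate on bipartite graphs.

```python
def eigenvector_centrality(adj, iterations=200, tol=1e-12):
    """Approximate the dominant eigenpair of a 0/1 adjacency matrix.

    Returns (x, lam), where x is scaled so that its largest entry is 1
    and lam is the estimate of the largest eigenvalue.
    """
    n = len(adj)
    x = [1.0] * n
    lam = 0.0
    for _ in range(iterations):
        # y = A x: every node sums the current scores of its neighbors.
        y = [sum(adj[v][t] * x[t] for t in range(n)) for v in range(n)]
        lam = max(y)  # since max(x) == 1, max(A x) estimates lambda
        if lam == 0:
            return x, 0.0  # graph without edges: nothing to rank
        y = [value / lam for value in y]
        converged = max(abs(p - q) for p, q in zip(x, y)) < tol
        x = y
        if converged:
            break
    return x, lam


# Hypothetical example: a triangle 0-1-2 with an extra node 3 attached to node 0.
adj = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
scores, lam = eigenvector_centrality(adj)
print(scores, lam)
```

In this example node 0 has the most neighbors and ends up with the highest score, while node 3, whose only neighbor is node 0, ends up with the lowest.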
Hyperlink-Induced Topic Search (HITS)
❑Based on the concept of Hub nodes and Authority nodes.
❑ In response to a query, instead of returning an ordered list of pages that each match the query, find two sets of inter-related pages: hub pages and authority pages.
❑Thus, a good hub page for a topic points to many authoritative pages for that
topic.
❑A good authority page for a topic is pointed to by many good hubs for that topic.
[Figure: hubs and authorities example; node labels: Alice, Bob, AT&T, ITIM, O2]
❑ Find all pages that are linked to or linked from pages in the root set
❑ Finally, compute hubs and authorities for the base set (which we’ll view as a
small web graph)
How to compute hub and authority
scores…
▪ Given a broad search query, q, HITS collects a set of pages as
follows:
▪ It sends the query q to a search engine.
▪ It then collects t (t = 200 is used in the HITS paper) highest ranked
pages. This set is called the root set W.
▪ It then grows W by including any page pointed to by a page in W and
any page that points to a page in W. This gives a larger set S, called the
base set.
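The expansion from the root set W to the base set S can be sketched as follows. This is an illustration under stated assumptions: the function name `build_base_set` and the representation of hyperlinks as `(source, target)` pairs are hypothetical, and the search-engine step that produces the root set is not modeled.

```python
def build_base_set(root_set, edges):
    """Grow the root set W into the base set S.

    root_set: iterable of page identifiers (the top-ranked search results).
    edges:    iterable of (source, target) hyperlink pairs (assumed input).
    """
    root = set(root_set)
    base = set(root)
    for source, target in edges:
        if source in root:
            base.add(target)  # page pointed to by a page in W
        if target in root:
            base.add(source)  # page that points to a page in W
    return base


# Hypothetical link data: "e" is two hops away from the root set, so it stays out.
links = [("a", "c"), ("d", "a"), ("c", "e")]
print(sorted(build_base_set({"a", "b"}, links)))
```

Only pages one link away from the root set are added; the expansion is not repeated on the newly added pages.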
▪ HITS works on the pages in S, and assigns every page in S an
authority score and a hub score.
▪ Let the number of pages in S be n.
▪ We use G = (V, E) to denote the hyperlink graph of S.
▪ We use L to denote the adjacency matrix of the graph: $L_{ij} = 1$ if $(i, j) \in E$, and $L_{ij} = 0$ otherwise.
Let the authority score of the page i be a(i), and the hub score of page i be
h(i).
The mutual reinforcing relationship of the two scores is represented as
follows:
$$a(i) = \sum_{(j,i) \in E} h(j)$$
$$h(i) = \sum_{(i,j) \in E} a(j)$$
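One round of this mutual reinforcement can be written directly over the edge list. The sketch below is illustrative: the function name `update_scores` is hypothetical, pages are assumed to be numbered 0 to n-1, and the choice to compute hub scores from the freshly updated authority scores is one common ordering rather than the only possible one.

```python
def update_scores(edges, hub, n):
    """Apply the two sum rules once.

    edges: list of (source, target) pairs, pages numbered 0..n-1 (assumed).
    hub:   current hub scores, one per page.
    Returns (authority, new_hub).
    """
    # a(i) = sum of h(j) over all edges (j, i)
    authority = [0.0] * n
    for j, i in edges:
        authority[i] += hub[j]

    # h(i) = sum of a(j) over all edges (i, j), using the updated authorities
    new_hub = [0.0] * n
    for i, j in edges:
        new_hub[i] += authority[j]

    return authority, new_hub


# Hypothetical graph: pages 0 and 1 link to page 2, and page 0 also links to page 3.
edges = [(0, 2), (1, 2), (0, 3)]
authority, hub = update_scores(edges, [1.0, 1.0, 1.0, 1.0], 4)
print(authority)  # [0.0, 0.0, 2.0, 1.0]
print(hub)        # [3.0, 2.0, 0.0, 0.0]
```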
We use a to denote the column vector of all the authority scores,
$a = [a(1), a(2), \ldots, a(n)]^T$,
and h to denote the column vector of all the hub scores,
$h = [h(1), h(2), \ldots, h(n)]^T$.
Then,
$$a = L^T h$$
$$h = L a$$
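▪ The authority and hub scores are computed by power iteration, the same method used to compute PageRank scores.
▪ If we use $a_k$ and $h_k$ to denote the authority and hub vectors at the $k$-th iteration, substituting $h = La$ into $a = L^T h$ (and vice versa) gives the iterations for generating the final solutions:
$$a_k = L^T L\, a_{k-1}, \qquad h_k = L L^T h_{k-1}$$
The vectors are normalized after each iteration so that the scores stay bounded.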
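The power iteration for HITS can be sketched in plain Python as follows. This is a minimal illustration, not a reference implementation: the function names, the all-ones starting vectors, the fixed iteration count, and the normalization to unit sum are assumptions made for the example.

```python
def normalize(vector):
    """Scale a non-negative vector so its entries sum to 1 (unchanged if all zero)."""
    total = sum(vector)
    return [value / total for value in vector] if total > 0 else list(vector)


def hits(L, iterations=50):
    """Return (authority, hub) scores for adjacency matrix L, with L[i][j] = 1 if i links to j."""
    n = len(L)
    a = [1.0] * n
    h = [1.0] * n
    for _ in range(iterations):
        # a = L^T h: a page's authority is the sum of the hub scores pointing to it.
        a = [sum(L[j][i] * h[j] for j in range(n)) for i in range(n)]
        # h = L a: a page's hub score is the sum of the authority scores it points to.
        h = [sum(L[i][j] * a[j] for j in range(n)) for i in range(n)]
        a = normalize(a)
        h = normalize(h)
    return a, h


# Hypothetical graph: pages 0 and 1 link to page 2, and page 0 also links to page 3.
L = [
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
authority, hub = hits(L)
print(authority)
print(hub)
```

In this example page 2 receives the highest authority score because both linking pages point to it, and page 0 receives the highest hub score because it points to both authorities.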
Exercise: Compute the hub and authority scores for the graph below.
[Figure: exercise graph]
Co-citation and Bibliographic Coupling
Another area of research concerned with links is citation analysis of scholarly
publications.
◦ A scholarly publication cites related prior work to acknowledge the origins of some ideas
and to compare the new proposal with existing work.
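Co-citation links papers that are cited together:
◦ if papers i and j are both cited by paper k, they may be related.
The more papers that cite both of them, the stronger their relationship is.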
Bibliographic coupling
Bibliographic coupling operates on a similar principle.
Bibliographic coupling links papers that cite the same articles
◦ if papers i and j both cite paper k, they may be related.
The more papers they both cite, the stronger their similarity is.
Relationships with co-citation and bibliographic coupling
Co-citation of pages i and j, denoted by $C_{ij}$, is
$$C_{ij} = \sum_{k=1}^{n} L_{ki} L_{kj} = (L^T L)_{ij}$$
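Both similarity measures can be computed from the adjacency matrix L. The sketch below is illustrative and the function names are hypothetical. The co-citation function follows the formula above. The bibliographic coupling function uses the analogous standard definition, the number of pages that both i and j cite, which equals $(L L^T)_{ij}$; that formula is added here as an assumption about what the slides intend.

```python
def cocitation(L):
    """C[i][j] = number of pages k that cite both i and j, i.e. (L^T L)[i][j]."""
    n = len(L)
    return [[sum(L[k][i] * L[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def bibliographic_coupling(L):
    """B[i][j] = number of pages k cited by both i and j, i.e. (L L^T)[i][j]."""
    n = len(L)
    return [[sum(L[i][k] * L[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Hypothetical citation graph: papers 0 and 1 cite paper 2, and paper 0 also cites paper 3.
L = [
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
C = cocitation(L)
B = bibliographic_coupling(L)
print(C[2][3])  # 1: only paper 0 cites both 2 and 3
print(B[0][1])  # 1: papers 0 and 1 both cite paper 2
```

The diagonal entries count how often a paper is cited (in C) or how many papers it cites (in B).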
❑Can be used to compute centrality in directed networks such as citation networks and the World
Wide Web
❑Computes the relative influence of a node in a network by considering all immediate neighbors
and all further nodes connected to the node