
UNIVERSITY OF LJUBLJANA

INSTITUTE OF MATHEMATICS, PHYSICS AND MECHANICS

DEPARTMENT OF THEORETICAL COMPUTER SCIENCE
JADRANSKA 19, 1000 LJUBLJANA, SLOVENIA

Preprint series, Vol. 41 (2003), 871

PAJEK
ANALYSIS AND VISUALIZATION
OF LARGE NETWORKS
Vladimir Batagelj, Andrej Mrvar

ISSN 1318-4865

Version: March 4, 2003

Math.Subj.Class.(2000): 05 C 90, 68 R 10, 76 M 27, 68 U 05,

05 C 50, 05 C 85, 90 C 27, 92 H 30, 92 G 30, 93 A 15.

Supported by the Ministry of Education, Science and Sport of Slovenia,

Projects J1-8532 and Z5-3350.
To be published in Graph Drawing Software book, edited by M. Jünger
and P. Mutzel, in the Springer series Mathematics and Visualization.
Address: Vladimir Batagelj, University of Ljubljana, FMF, Department
of Mathematics, and IMFM Ljubljana, Department of TCS, Jadranska
ulica 19, 1 000 Ljubljana, Slovenia
e-mail: [email protected]

Ljubljana, March 14, 2003

Pajek*
Analysis and Visualization of Large Networks

Vladimir Batagelj¹ and Andrej Mrvar²

¹ Department of Mathematics, Faculty of Mathematics and Physics, University of Ljubljana, Slovenia
² Faculty of Social Sciences, University of Ljubljana, Slovenia

1 Introduction

Pajek is a program for Windows for the analysis and visualization of large networks having some tens or hundreds of thousands of vertices. In Slovenian, pajek means spider.
The design of Pajek is based on experiences gained in development of
graph data structure and algorithms libraries Graph [2] and X-graph [15],
collection of network analysis and visualization programs STRAN, RelCalc,
Draw, Energ [9], and SGML-based graph description markup language NetML
[8]. We started the development of Pajek in November 1996.
The main goals in the design of Pajek are [10,13]:
• to support abstraction by (recursive) decomposition of a large network
into several smaller networks that can be treated further using more
sophisticated methods;
• to provide the user with some powerful visualization tools;
• to implement a selection of efficient (subquadratic) algorithms for analysis
of large networks.
With Pajek we can (see Figure 1): find clusters (components, neighbour-
hoods of ‘important’ vertices, cores, etc.) in a network, extract vertices that
belong to the same clusters and show them separately, possibly with the parts
of the context (detailed local view), shrink vertices in clusters and show re-
lations among clusters (global view).
Besides ordinary (directed, undirected, mixed) networks Pajek also supports:
• 2-mode networks, bipartite (valued) graphs – networks between two disjoint sets of vertices. Examples of such networks are: (authors, papers, cites the paper), (authors, papers, is the (co)author of the paper), (people, events, was present at), (people, institutions, is member of), (articles, shopping lists, is on the list).
• temporal networks, dynamic graphs – networks changing over time.

Fig. 1. Approaches to deal with large networks

* This work was partially supported by the Ministry of Education, Science and Sport of Slovenia, Projects J1-8532 and Z5-3350.

In this chapter we present the main characteristics of Pajek. Since large networks cannot be visualized in detail in a single view, we first have to identify interesting substructures in such a network and then visualize them in separate views. The central, algorithmic section of this chapter deals mainly with different efficient approaches to this problem.

2 Applications

There exist several sources of large networks that are already in machine-readable form. Pajek provides tools for analysis and visualization of such networks and is applied by researchers in different areas: social network analysis [11], chemistry (organic molecules), biomedical/genomics research (protein-receptor interaction networks) [59], genealogies [57,28], Internet networks [22], citation networks [42], diffusion networks (AIDS, news), analysis of texts [17], data-mining (2-mode networks) [14], etc. Although it was developed primarily for the analysis of large networks, it is often used also for small networks, especially for their visualization.
At the end of 2002 Pajek was downloaded more than 500 times per month.
Pajek is also used at several universities (Ljubljana, Rotterdam, Stanford, Irvine, The Ohio State University, Penn State, Wisconsin/Madison, Vienna, Freiburg, Madrid, and some others) as a support in courses on network analysis. Together with Wouter de Nooy from the University of Rotterdam we wrote a course book Exploratory Social Network Analysis With Pajek [25].

3 Algorithms
To support the design goals we implemented several algorithms known from
the literature (see section 4.2), but for some tasks new, efficient algorithms,
suitable to deal with large networks, had to be developed. They mainly pro-
vide different ways to identify interesting substructures in a given network.

3.1 Citation weights

In a given set of units/vertices U (articles, books, works, etc.) we introduce
a citing relation/set of arcs R ⊆ U × U

uRv ≡ v cites u

which determines a citation network N = (U, R).

Citation network analysis started in 1964 with the paper of Garfield et al. [29]. In 1989 Hummon and Doreian [36] proposed three indices – weights of arcs that provide us with an automatic way to identify the (most) important part of the citation network. For two of these indices we developed algorithms to compute them efficiently [4].
A citing relation is usually irreflexive (no loops) and (almost) acyclic. In the following we shall assume that it has these two properties. Since in real-life citation networks the strong components are small (usually 2 or 3 vertices), we can transform such a network into an acyclic network by shrinking strong components and deleting loops. For other approaches see [4]. It is also useful to transform a citation network to its standardized form by adding a common source vertex s ∉ U and a common sink vertex t ∉ U. The source s is linked by an arc to all minimal elements of R; and all maximal elements of R are linked to the sink t. Thus we get an st-digraph [TF 2.2]. Finally, to make the theory smoother, we also add the ‘feedback’ arc (t, s).
The search path count (SPC) method is based on counters n(u, v) that count the number of different paths from s to t through the arc (u, v). To compute n(u, v) we introduce two auxiliary quantities: $n^-(v)$ counts the number of different paths from s to v, and $n^+(v)$ counts the number of different paths from v to t.
It follows by basic principles of combinatorics that

\[ n(u, v) = n^-(u) \cdot n^+(v), \qquad (u, v) \in R \]

where

\[ n^-(u) = \begin{cases} 1 & u = s \\ \sum_{v : vRu} n^-(v) & \text{otherwise} \end{cases} \]

and

\[ n^+(u) = \begin{cases} 1 & u = t \\ \sum_{v : uRv} n^+(v) & \text{otherwise} \end{cases} \]

Fig. 2. Part of SOM main subnetwork at level 0.001

This is the basis of an efficient algorithm for computing n(u, v) – after the
topological sort [TF 2.2] of the st-digraph we can compute, using the above
relations in topological order, the weights in time of order O(m), m = |R|.
The topological order ensures that all the quantities in the right sides of the
above equalities are already computed when needed.
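The computation just described can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not Pajek's implementation: the function name `spc_weights`, the arc-list input format and the sentinel objects standing for s and t are hypothetical, and the network is assumed to be acyclic already. Python integers have arbitrary precision, so the large path counts cause no overflow here.

```python
from collections import deque


def spc_weights(arcs):
    """Normalized SPC weights of an acyclic network given as (u, v) arcs."""
    arcs = list(arcs)
    if not arcs:
        return {}
    vertices = {x for arc in arcs for x in arc}
    succ = {x: [] for x in vertices}
    pred = {x: [] for x in vertices}
    for u, v in arcs:
        succ[u].append(v)
        pred[v].append(u)

    # Standardize the network: s and t are hypothetical sentinel vertices.
    s, t = object(), object()
    minimal = [x for x in vertices if not pred[x]]
    maximal = [x for x in vertices if not succ[x]]
    succ[s], pred[s] = list(minimal), []
    succ[t], pred[t] = [], list(maximal)
    for x in minimal:
        pred[x].append(s)
    for x in maximal:
        succ[x].append(t)

    # Topological sort (Kahn's method).
    indeg = {x: len(p) for x, p in pred.items()}
    order, queue = [], deque([s])
    while queue:
        x = queue.popleft()
        order.append(x)
        for y in succ[x]:
            indeg[y] -= 1
            if indeg[y] == 0:
                queue.append(y)
    if len(order) != len(pred):
        raise ValueError("the network is not acyclic")

    n_minus, n_plus = {}, {}
    for x in order:               # number of paths from s to x
        n_minus[x] = 1 if x is s else sum(n_minus[y] for y in pred[x])
    for x in reversed(order):     # number of paths from x to t
        n_plus[x] = 1 if x is t else sum(n_plus[y] for y in succ[x])

    total = n_minus[t]            # number of all s-t paths (the total flow)
    return {(u, v): n_minus[u] * n_plus[v] / total for u, v in arcs}
```

For the arcs (a, b), (a, c), (b, d), (c, d) there are two paths from s to t, and each arc lies on exactly one of them, so every arc gets the weight 0.5.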
The Hummon and Doreian indices are defined as follows:
• search path link count (SPLC) method: $w_l(u, v)$ equals the number of “all possible search paths through the network emanating from an origin node” through the arc (u, v) ∈ R.
• search path node pair (SPNP) method: $w_p(u, v)$ “accounts for all connected vertex pairs along the paths through the arc (u, v) ∈ R”.
We get the SPLC weights by applying the SPC method on the network obtained from a given standardized network by linking the source s by an arc to each nonminimal vertex from U; and the SPNP weights by applying the SPC method on the network obtained from the SPLC network by additionally linking by an arc each nonmaximal vertex from U to the sink t.

Fig. 3. 0, 1, 2 and 3 core

The values of counters n(u, v) form a flow in the citation network – Kirchhoff's vertex law holds: for every vertex u in a standardized citation network, incoming flow = outgoing flow:

\[ \sum_{v : vRu} n(v, u) = \sum_{v : uRv} n(u, v) = n^-(u) \cdot n^+(u) \]

The weight n(t, s) equals the total flow through the network and provides a natural normalization of weights

\[ w(u, v) = \frac{n(u, v)}{n(t, s)} \quad \Rightarrow \quad 0 \le w(u, v) \le 1 \]

and if C is a minimal arc-cut-set

\[ \sum_{(u, v) \in C} w(u, v) = 1 \]

In large networks the values of weights can grow very large. This should
be considered in the implementation of the algorithms.
In Figure 2 the main subnetwork, obtained as an edge-cut at level 0.001 of the citation network (n = 4470, m = 12731) on the SOM (self-organizing maps) literature, is presented. The picture is exported in SVG with additional JavaScript support that lets the user inspect the subnetwork at different predetermined levels.

3.2 Cores and generalized cores

The notion of core was introduced by Seidman in 1983 [51]. Let G = (V, E) be a graph. A subgraph H = (W, E|W) induced by the set W is a k-core or a core of order k iff $\forall v \in W : \deg_H(v) \ge k$, and H is a maximal subgraph with this property. The core of maximum order is also called the main core. The core number of vertex v is the highest order of a core that contains this vertex. The degree deg(v) can be: in-degree, out-degree, in-degree + out-degree, etc., determining different types of cores.

Fig. 4. $p_S$-core at level 46 of Geomlib network

In Figure 3 an example of the cores decomposition of a given graph is presented. From this figure we can see the following properties of cores:
• The cores are nested: $i < j \Rightarrow H_j \subseteq H_i$
• Cores are not necessarily connected subgraphs.
Our algorithm for determining the cores hierarchy is based on the following property [16]:
If from a given graph G = (V, E) we recursively delete all vertices of degree less than k, and the edges incident with them, the remaining graph is the k-core.
Its outline is given in Algorithm 1. In the refinements of the algorithm we have to provide efficient implementations of sorting the degrees and their reordering. Since the values of degrees are in the range 0..n − 1 we can order them in O(n) using a variant of bin sort; and the update of the ordering can be done in constant time. For details see [18].
Algorithm 1: Core Numbers Algorithm

Input : Graph G = (V, E) represented by lists of neighbors
Output : Table core[V] with core number for each vertex
Compute the degrees of vertices
Order the set of vertices V in increasing order of their degrees
for each v ∈ V in the order do
    Set core[v] = degree[v]
    for each u ∈ adj(v) do
        if degree[u] > degree[v] then
            Set degree[u] = degree[u] − 1
            Reorder V accordingly
        end
    end
end

The cores, because they can be determined very efficiently, are one of the few concepts that provide us with meaningful decompositions of large networks. We expect that different approaches to the analysis of large networks can be built on this basis. For example, we get the following bound on the chromatic number of a given graph G

χ(G) ≤ 1 + core(G)

Cores can also be used to localize the search for interesting subnetworks in large networks since: if it exists, a k-component is contained in a k-core; and a k-clique is contained in a k-core.
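Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, assuming an undirected graph without loops stored as a dictionary that maps each vertex to the set of its neighbours. The function name `core_numbers` is hypothetical, and the bins of vertices with equal degree are kept as Python sets, a simplification of the bin sort with position tables described in [18].

```python
def core_numbers(adj):
    """Core number of every vertex; adj maps a vertex to its set of neighbours."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    max_degree = max(degree.values(), default=0)

    # bins[d] holds the not yet processed vertices whose current degree is d.
    bins = [set() for _ in range(max_degree + 1)]
    for v, d in degree.items():
        bins[d].add(v)

    core = {}
    for d in range(max_degree + 1):
        while bins[d]:
            v = bins[d].pop()
            core[v] = d
            for u in adj[v]:
                # Only neighbours of larger current degree lose a degree,
                # so no degree ever drops below d.
                if degree[u] > d:
                    bins[degree[u]].remove(u)
                    degree[u] -= 1
                    bins[degree[u]].add(u)
    return core
```

For a triangle {1, 2, 3} with a pendant vertex 4 attached to vertex 3, the sketch assigns core number 2 to the triangle vertices and core number 1 to vertex 4.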
The notion of core can be generalized to networks. Let N = (V, E, w) be a network, where G = (V, E) is a graph and $w : E \to \mathbb{R}$ is a function assigning values to edges. A vertex property function on N, or a p-function for short, is a function p(v, U), v ∈ V, U ⊆ V with real values. Let $\mathrm{adj}_U(v) = \mathrm{adj}(v) \cap U$. Besides degrees, here are some examples of p-functions:

\[ p_S(v, U) = \sum_{u \in \mathrm{adj}_U(v)} w(v, u), \quad \text{where } w : E \to \mathbb{R}_0^+ \]

\[ p_M(v, U) = \max_{u \in \mathrm{adj}_U(v)} w(v, u), \quad \text{where } w : E \to \mathbb{R} \]

\[ p_k(v, U) = \text{number of cycles of length } k \text{ through vertex } v \text{ in } (U, E|U) \]

The subgraph H = (C, E|C) induced by the set C ⊆ V is a p-core at level $t \in \mathbb{R}$ iff $\forall v \in C : t \le p(v, C)$ and C is a maximal such set.
The function p is monotone iff it has the property

\[ C_1 \subset C_2 \Rightarrow \forall v \in V : p(v, C_1) \le p(v, C_2) \]

The degrees and the functions $p_S$, $p_M$ and $p_k$ are monotone. For a monotone function the p-core at level t can be determined, as in the ordinary case, by successively deleting vertices with value of p lower than t; and the cores on different levels are nested

\[ t_1 < t_2 \Rightarrow H_{t_2} \subseteq H_{t_1} \]
Fig. 5. Marriages among relatives in Ragusa

The p-function is local iff

\[ p(v, U) = p(v, \mathrm{adj}_U(v)) \]

The degrees, $p_S$ and $p_M$ are local; but $p_k$ is not local for k ≥ 4. For a local p-function an O(m max(∆, log n)) algorithm for determining the p-core levels exists, assuming that $p(v, \mathrm{adj}_C(v))$ can be computed in $O(\deg_C(v))$ [19].
In Figure 4 a $p_S$-core at level 46 of the collaboration network in the field of computational geometry [37] is presented.
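The successive deletion that determines a p-core at a given level t can be sketched in Python as follows. This is a simple worklist version for a local, monotone p-function, not the O(m max(∆, log n)) algorithm of [19]. The names `p_core` and `p_sum` and the dictionary-of-dictionaries weight table are assumptions made for the illustration.

```python
def p_sum(v, members, w):
    """p_S(v, C): sum of the weights of edges from v to its neighbours in C."""
    return sum(x for u, x in w[v].items() if u in members)


def p_core(w, t, p=p_sum):
    """Vertex set of the p-core at level t.

    w maps each vertex v to a dict {u: w(v, u)} of its weighted neighbours;
    the table is assumed to be symmetric.
    """
    members = set(w)
    stack = [v for v in members if p(v, members, w) < t]
    while stack:
        v = stack.pop()
        if v not in members:
            continue                      # already deleted earlier
        members.remove(v)
        for u in w[v]:
            # Only neighbours of a deleted vertex can change their value,
            # because p is assumed to be local.
            if u in members and p(u, members, w) < t:
                stack.append(u)
    return members
```

For a triangle a, b, c with all weights 3 and a vertex d joined to c by an edge of weight 1, the call `p_core(w, 6)` deletes only d, while `p_core(w, 7)` deletes every vertex.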

3.3 Pattern searching

If a selected pattern determined by a given graph does not occur frequently in a sparse network, the straightforward backtracking algorithm applied for pattern searching finds all appearances of the pattern very fast, even in the case of very large networks.
To speed up the search or to consider some additional properties of the pattern, a user can set some additional options:
• vertices in the network should match vertices in the pattern in some nominal, ordinal or numerical property (for example, type of atom in a molecule);
• values of edges must match (for example, edges representing male/female links in the case of p-graphs [57]);
• the first vertex in the pattern can be selected only from a given subset of vertices in the network.
Pattern searching was successfully applied to searching for patterns of atoms in molecules (carbon rings) and searching for relinking marriages in genealogies. Figure 5 presents three connected relinking marriages which are non-blood marriages found in the genealogy of Ragusan noble families [28]. The genealogy is represented as a p-graph. A solid arc indicates the ‘is a son of’ relation, and a dotted arc indicates the ‘is a daughter of’ relation. In all three patterns a brother and a sister from one family found their partners in the same other family.

Fig. 6. Triads (numbered types: 1 - 003, 2 - 012, 3 - 102, 4 - 021D, 5 - 021U, 6 - 021C, 7 - 111D, 8 - 111U, 9 - 030T, 10 - 030C, 11 - 201, 12 - 120D, 13 - 120U, 14 - 120C, 15 - 210, 16 - 300)

3.4 Triads
Let G = (V, R) be a simple directed graph without loops. A triad is a sub-
graph induced by a given set of three vertices. There are 16 nonisomorphic
(types of) triads [55, page 244]. They can be partitioned into three basic
types (see Figure 6):
• the null triad 003;
• dyadic triads 012 and 102; and
• connected triads: 111D, 201, 210, 300, 021D, 111U, 120D, 021U, 030T,
120U, 021C, 030C and 120C.
Several properties of a graph can be expressed in terms of its triadic spectrum – the distribution of all its triads. It also provides ingredients for p* network models [56]. A direct approach to determine the triadic spectrum is of order $O(n^3)$; but in most large graphs it can be determined much faster [12]. The algorithm is based on the following observation: in a large and sparse graph most triads are null triads. Let $T_1$, $T_2$, $T_3$ be the number of null, dyadic and connected triads. Since the total number of triads is $T = \binom{n}{3}$ and the above types partition the set of all triads, the idea of the algorithm is as follows:
• count all dyadic $T_2$ and all connected $T_3$ triads with their subtypes;
• compute the number of null triads $T_1 = T - T_2 - T_3$.
In the algorithm we have to ensure that every non-null triad is counted exactly once while scanning the set of arcs. A set of three vertices {v, u, w} can in general be selected in 6 different ways: (v, u, w), (v, w, u), (u, v, w), (u, w, v), (w, v, u), (w, u, v). We solve the isomorphism problem by introducing the canonical selection that contributes to the triadic count; the other, noncanonical selections need not be considered in the counting process.
Every connected dyad forms a dyadic triad with every vertex that is adjacent to neither member of the dyad. Let $\hat{R} = R \cup R^{-1}$. Each pair of vertices (v, u), v < u, connected by an arc contributes

\[ n - |\hat{R}(u) \cup \hat{R}(v) \setminus \{u, v\}| - 2 \]

triads of type 3 – 102, if u and v are connected in both directions; and of type 2 – 012 otherwise. The condition v < u determines the canonical selection for dyadic triads. A selection (v, u, w) of a connected triad is canonical iff v < u < w.
The triads isomorphism problem can be efficiently solved by assigning to each triad a code – an integer between 0 and 63 obtained by treating the off-diagonal entries of the triad adjacency matrix as a binary number. Each triad code corresponds to a unique triad type that can be determined from a precomputed table.
For a connected triad we can always assume that v is the smallest of its vertices. So we have to determine the canonical selection from the remaining two selections (v, u, w) and (v, w, u). If v < w < u and $v \hat{R} w$ then the selection (v, w, u) was already counted before. Therefore we have to consider it as canonical only if $v \hat{R} w$ does not hold.
In an implementation of the algorithm we must also take care of range overflow in the case of $T$ and $T_1$.
The total complexity of the algorithm is $O(\hat{\Delta} m)$ and thus, for graphs with small maximum degree $\hat{\Delta} \ll n$, since $2m \le n\hat{\Delta}$, of order O(n).
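The counting scheme can be sketched in Python as follows. The sketch only distinguishes the null, 012, 102 and connected classes; it omits the 64-entry code table that separates the connected subtypes. The function name `triad_class_counts` and the assumption that vertices are labelled 0..n − 1 are ours.

```python
from math import comb


def triad_class_counts(n, arcs):
    """Return (null, t012, t102, connected) triad counts of a digraph.

    Vertices are assumed to be 0..n-1 and arcs a collection of (u, v) pairs
    without loops.
    """
    out = [set() for _ in range(n)]
    nbr = [set() for _ in range(n)]       # the symmetrized relation R-hat
    for u, v in arcs:
        out[u].add(v)
        nbr[u].add(v)
        nbr[v].add(u)

    t012 = t102 = connected = 0
    for v in range(n):
        for u in nbr[v]:
            if v < u:                     # canonical selection of the dyad
                others = (nbr[u] | nbr[v]) - {u, v}
                dyadic = n - len(others) - 2
                if u in out[v] and v in out[u]:
                    t102 += dyadic
                else:
                    t012 += dyadic
                for w in others:
                    # Canonical selection of a connected triad.
                    if u < w or (v < w < u and w not in nbr[v]):
                        connected += 1

    null = comb(n, 3) - t012 - t102 - connected
    return null, t012, t102, connected
```

For n = 4 and the arcs (0, 1), (1, 0), (1, 2), each of the four triples falls in a different class: {0, 2, 3} is null, {1, 2, 3} is of type 012, {0, 1, 3} is of type 102 and {0, 1, 2} is connected.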

3.5 Triangular connectivities

In this subsection we present an extension of notion of connectivity to connectivity by chains of triangles.

Fig. 7. Edge-cut at level 16 of triangular network of Erdős collaboration graph

Undirected graphs
We call a triangle a subgraph isomorphic to $K_3$. A subgraph H = (V′, E′) of G = (V, E) is triangular if each of its vertices and each of its edges belongs to at least one triangle in H.
A sequence $(T_1, T_2, \ldots, T_s)$ of triangles of G (vertex) triangularly connects vertices u, v ∈ V iff $u \in T_1$ and $v \in T_s$ or $u \in T_s$ and $v \in T_1$ and $V(T_{i-1}) \cap V(T_i) \ne \emptyset$, i = 2, ..., s. Such a sequence is called a triangular chain. It edge triangularly connects vertices u, v ∈ V iff a stronger version of the second condition holds: $E(T_{i-1}) \cap E(T_i) \ne \emptyset$, i = 2, ..., s.
A pair of vertices u, v ∈ V is (vertex) triangularly connected iff u = v, or there exists a chain that triangularly connects u and v. Triangular connectivity is an equivalence relation on the set of vertices V; and nontrivial triangular connectivity components are exactly maximal connected triangular subgraphs.
A pair of vertices u, v ∈ V is edge triangularly connected iff u = v, or there exists a chain that edge triangularly connects u and v. Edge triangular connectivity components determine an equivalence relation on the set of edges E. Each nontriangular edge is in its own component.
Let G be a simple undirected graph. A triangular network $N_T(G) = (V, E_T, w)$ determined by G is a subgraph $G_T = (V, E_T)$ of G whose set of edges $E_T$ consists of all triangular edges of E(G). For $e \in E_T$ the weight w(e) equals the number of different triangles in G to which e belongs.
A procedure for determining $E_T$ and w(e), $e \in E_T$, simply collects all edges with w(e) = |adj(u) ∩ adj(v)| > 0, e = {u, v} ∈ E. If the sets of neighbors adj(v) are ordered we can use merging to compute w(e) faster. Nontrivial triangular connectivity components are exactly the components of $G_T$.
Triangular networks can be used to efficiently identify dense clique-like parts of a graph. If an edge e belongs to a k-clique in G then w(e) ≥ k − 2.
In Figure 7 the edge-cut at level 16 of the triangular network of the Erdős collaboration graph [34,11] (without Erdős, n = 6926, m = 11343) is presented.
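The procedure for determining the triangular network can be sketched in Python as follows. This is a minimal illustration assuming that the graph is given as a dictionary of neighbour sets with mutually comparable vertex labels. The names `triangular_network` and `edge_cut` are hypothetical, and set intersection is used in place of merging ordered neighbour lists.

```python
def triangular_network(adj):
    """Weights w(e) = |adj(u) & adj(v)| of all triangular edges e = {u, v}.

    adj maps each vertex to the set of its neighbours (simple undirected
    graph); an edge is reported once, as the pair (u, v) with u < v.
    """
    weights = {}
    for u, nbrs in adj.items():
        for v in nbrs:
            if u < v:
                common = len(nbrs & adj[v])
                if common > 0:
                    weights[(u, v)] = common
    return weights


def edge_cut(weights, level):
    """Keep only the edges whose weight is at least the given level."""
    return {e: x for e, x in weights.items() if x >= level}
```

In the complete graph on four vertices every edge belongs to two triangles, in agreement with the bound w(e) ≥ k − 2 for k = 4.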
Directed graphs
If the graph G is mixed we replace edges with pairs of opposite arcs. In the following let G = (V, A) be a simple directed graph without loops. For a selected arc (u, v) ∈ A there are four different types of directed triangles: cyclic, transitive, input and output.

[Figure: the four types of directed triangles: cyc, tra, in, out]

For each type we get the corresponding triangular network $N_{cyc}$, $N_{tra}$, $N_{in}$ and $N_{out}$. The procedures for determining the networks are also similar to the undirected case. For example, for the cyclic network $N_{cyc} = (V, A_{cyc}, w_{cyc})$ we have for $(u, v) \in A_{cyc}$

\[ w_{cyc}(u, v) = |\mathrm{outadj}(v) \cap \mathrm{inadj}(u)| \]

In directed graphs we distinguish weak and strong connectivity. The weak connectivity can be reduced to the undirected concepts in the skeleton $S = (V, E_S)$ of the given graph G

\[ E_S = \{ \{u, v\} : u \ne v \wedge (u, v) \in A \} \]

A subgraph H = (V′, A′) of G is cyclic triangular if each of its vertices and each of its arcs belongs to at least one cyclic triangle in H. A connected cyclic triangular subgraph is also strongly connected.
A sequence $(T_1, T_2, \ldots, T_s)$ of cyclic triangles of G (vertex) cyclic triangularly connects vertex u ∈ V to vertex v ∈ V iff $u \in T_1$ and $v \in T_s$ or $u \in T_s$ and $v \in T_1$ and $V(T_{i-1}) \cap V(T_i) \ne \emptyset$, i = 2, ..., s; such a sequence is called a cyclic triangular chain. It arc cyclic triangularly connects vertex u to vertex v iff $A(T_{i-1}) \cap A(T_i) \ne \emptyset$, i = 2, ..., s holds; such a sequence is called an arc cyclic triangular chain.

Fig. 8. Edge-cut at level 11 of transitive network of ODLIS dictionary graph

Again, we can introduce two types of cyclic triangular connectivity:
A pair of vertices u, v ∈ V is (vertex) cyclic triangularly connected iff
u = v, or there exists a cyclic triangular chain that connects u to v.
A pair of vertices u, v ∈ V is arc cyclic triangularly connected iff u = v,
or there exists an arc cyclic triangular chain that connects u to v.
Cyclic triangular connectivity is an equivalence relation on the set of
vertices V ; and the arc cyclic triangular connectivity components determine
an equivalence relation on the set of arcs A.
There exists also a parallel to unilateral connectivity. The vertex v ∈ V
is transitively triangularly reachable from the vertex u ∈ V iff u = v, or there
exists a walk from u to v in which each arc is transitive – is a base of some
transitive triangle.
Transitive arcs are essentially reinforced arcs. If we remove from a graph
G = (V, A) a transitive arc the reachability relation in V does not change.
In Figure 8 the edge-cut at level 11 of transitive network of ODLIS dic-
tionary graph [45] is presented.
These notions can be generalized to short cycle connectivity [20].

3.6 Generating large random networks

Let p ∈ [0, 1] be a given probability. An Erdős-Rényi random graph G ∈
G(n, p) is obtained by selecting every edge {u, v} with a probability p:

Pr({u, v} ∈ G) = p

It is easy to write a program to do this:

E = ∅;
for u = 1 to n − 1 do for v = u + 1 to n do
if random < p then E = E ∪ {{u, v}};

But for large and very sparse networks this is too slow. A faster procedure can be built on the following idea: move by random steps over the $M = \binom{n}{2}$ cells and mark the touched cells.
How to select the length of the random step? For our Bernoulli model, with q = 1 − p, we have

\[ \Pr(\mathit{step} = s) = q^{s-1} p, \quad s = 1, 2, 3, \ldots \]

and

\[ F(s) = \Pr(\mathit{step} < s) = \sum_{t=1}^{s-1} q^{t-1} p = 1 - q^{s-1} \]

Therefore we get the random step s from the equation F(s) = random:

\[ s = F^{-1}(\mathit{random}) = 1 + \left\lfloor \frac{\log(1 - \mathit{random})}{\log q} \right\rfloor \]

This is the basis of the fast random graph generation procedure presented in Algorithm 2. The expected number of steps of this procedure is Mp.

Algorithm 2: Sparse Erdős-Rényi random graph generator

Input : Probability p, Number of vertices n
Output : Random graph G = (1..n, E)
Set q = 1 − p; f = 1; u = 2; k = 0; E = ∅; M = n(n − 1)/2; again = true
while again do
    Set k = k + 1 + ⌊ln(1 − random) / ln q⌋
    if k > M then Set again = false else
        while f < k do Set f = f + u; u = u + 1
        Set v = k + u − f − 1; E = E ∪ {{u, v}}
    end
od
The same approach is easy to adapt to generate different types of random graphs: undirected, directed, acyclic, undirected bipartite, directed bipartite, acyclic bipartite, 2-mode, and others [5].
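Algorithm 2 for the undirected case can be sketched in Python as follows. The function name `sparse_gnp` and the optional `rng` argument are assumptions made for the illustration, and the probability p is assumed to lie strictly between 0 and 1 so that ln q is defined and nonzero.

```python
import math
import random


def sparse_gnp(n, p, rng=random):
    """Edge list of a G(n, p) random graph on vertices 1..n, for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    log_q = math.log(1.0 - p)
    cells = n * (n - 1) // 2          # M, the number of vertex pairs
    edges = []
    k, f, u = 0, 1, 2                 # cell index and row bookkeeping
    while True:
        # Geometric jump to the next selected cell.
        k += 1 + int(math.log(1.0 - rng.random()) / log_q)
        if k > cells:
            break
        while f < k:                  # advance to the row that contains cell k
            f += u
            u += 1
        v = k + u - f - 1
        edges.append((u, v))          # always 1 <= v < u <= n
    return edges
```

A call such as `sparse_gnp(100000, 0.00001)` visits only the selected cells, so its running time depends on the number of generated edges and on n, not on the roughly 5 · 10⁹ vertex pairs.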
Pajek also contains a refinement of the model for generating scale-free networks proposed in [47]. At each step of the growth a new vertex and k edges are added to the network N. The endpoints of the edges are randomly selected among all vertices according to the probability

\[ \Pr(v) = \alpha \frac{\mathrm{indeg}(v)}{|E|} + \beta \frac{\mathrm{outdeg}(v)}{|E|} + \gamma \frac{1}{|V|} \]

where α + β + γ = 1. It is easy to check that $\sum_{v \in V} \Pr(v) = 1$. The time complexity of this procedure is O(m).
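The check that the probabilities sum to 1 can be written out as follows. The derivation assumes |E| > 0 and uses only the fact that every arc contributes exactly one unit to the sum of in-degrees and one unit to the sum of out-degrees.

```latex
\[
\sum_{v \in V} \Pr(v)
  = \frac{\alpha}{|E|} \sum_{v \in V} \mathrm{indeg}(v)
  + \frac{\beta}{|E|} \sum_{v \in V} \mathrm{outdeg}(v)
  + \frac{\gamma}{|V|} \sum_{v \in V} 1
  = \alpha \frac{|E|}{|E|} + \beta \frac{|E|}{|E|} + \gamma \frac{|V|}{|V|}
  = \alpha + \beta + \gamma = 1
\]
```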

3.7 2-mode networks

A 2-mode network is a structure N = (U, V, A, w), where U and V are disjoint sets of vertices, A is the set of arcs with the initial vertex in the set U and the terminal vertex in the set V, and $w : A \to \mathbb{R}$ is a weight. If no weight is defined we can assume a constant weight w(u, v) = 1 for all arcs (u, v) ∈ A. The set A can be viewed also as a relation A ⊆ U × V. A 2-mode network can be formally represented by a rectangular matrix $\mathbf{A} = [a_{uv}]_{U \times V}$:

\[ a_{uv} = \begin{cases} w(u, v) & (u, v) \in A \\ 0 & \text{otherwise} \end{cases} \]

For direct analysis of 2-mode networks we can use the eigenvector approach, clustering and blockmodeling. But most often we transform a 2-mode network into an ordinary (1-mode) network $N_1 = (U, E_1, w_1)$ and/or $N_2 = (V, E_2, w_2)$, where $E_1$ and $w_1$ are determined by the matrix $\mathbf{A}^{(1)} = \mathbf{A}\mathbf{A}^T$,

\[ a^{(1)}_{uv} = \sum_{z \in V} a_{uz} \cdot a^T_{zv} \]

Evidently $a^{(1)}_{uv} = a^{(1)}_{vu}$. There is an edge $\{u, v\} \in E_1$ in $N_1$ iff adj(u) ∩ adj(v) ≠ ∅. Its weight is $w_1(u, v) = a^{(1)}_{uv}$. The network $N_2$ is determined in a similar way by the matrix $\mathbf{A}^{(2)} = \mathbf{A}^T\mathbf{A}$. The networks $N_1$ and $N_2$ are analyzed using standard methods.
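The transformation into the 1-mode network on U can be sketched in Python as follows. This is a minimal illustration that stores the sparse matrix as dictionaries; the function name `project_on_rows` and the input format, a dictionary from arcs (u, z) to weights, are assumptions. The product is accumulated column by column, so only pairs of vertices from U that share a neighbour in V are touched.

```python
from collections import defaultdict


def project_on_rows(arcs):
    """Nonzero entries of A A^T for a 2-mode network.

    arcs maps (u, z), with u in U and z in V, to the weight w(u, z).
    Returns a dict of dicts a1 with a1[u][v] = sum over z of a_uz * a_vz;
    diagonal entries a1[u][u] are included.
    """
    columns = defaultdict(dict)           # z -> {u: a_uz}
    for (u, z), weight in arcs.items():
        columns[z][u] = weight

    a1 = defaultdict(dict)
    for column in columns.values():
        for u, a_uz in column.items():
            for v, a_vz in column.items():
                a1[u][v] = a1[u].get(v, 0) + a_uz * a_vz
    return {u: dict(row) for u, row in a1.items()}
```

For an (authors, papers) network in which a and b wrote p1 and b and c wrote p2, the off-diagonal entries give the edges {a, b} and {b, c} with weight 1, and no edge joins a and c because they share no paper.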

3.8 Normalizations

The normalization approach was developed for quick inspection of (1-mode) networks obtained from 2-mode networks [14,60] – a kind of network-based data-mining. In networks obtained from large 2-mode networks there are often huge differences in weights. Therefore it is not possible to compare the vertices according to the raw data. First we have to normalize the network to make the weights comparable. There exist several ways to do this. Some of them are presented in Table 1. They can be used also on other networks.
In the case of networks without loops we define the diagonal weights for undirected networks as the sum of off-diagonal elements in the row (or column)

\[ w_{vv} = \sum_u w_{vu} \]
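Fig. 9. GeoDeg normalization of Reuters terror news network

Table 1. Weight normalizations

\[ \mathrm{Geo}_{uv} = \frac{w_{uv}}{\sqrt{w_{uu} w_{vv}}} \qquad \mathrm{GeoDeg}_{uv} = \frac{w_{uv}}{\sqrt{\deg_u \deg_v}} \]

\[ \mathrm{Input}_{uv} = \frac{w_{uv}}{w_{vv}} \qquad \mathrm{Output}_{uv} = \frac{w_{uv}}{w_{uu}} \]

\[ \mathrm{Min}_{uv} = \frac{w_{uv}}{\min(w_{uu}, w_{vv})} \qquad \mathrm{Max}_{uv} = \frac{w_{uv}}{\max(w_{uu}, w_{vv})} \]

\[ \mathrm{MinDir}_{uv} = \begin{cases} \frac{w_{uv}}{w_{uu}} & w_{uu} \le w_{vv} \\ 0 & \text{otherwise} \end{cases} \qquad \mathrm{MaxDir}_{uv} = \begin{cases} \frac{w_{uv}}{w_{vv}} & w_{uu} \le w_{vv} \\ 0 & \text{otherwise} \end{cases} \]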

and for directed networks as some mean value of the row and column sum, for example

\[ w_{vv} = \frac{1}{2} \left( \sum_u w_{vu} + \sum_u w_{uv} \right) \]

Usually we assume that the network does not contain any isolated vertex.
After a selected normalization the important parts of the network are obtained by edge-cutting the normalized network at a selected level t and preserving components with at least k vertices.
In Figure 9 a part of ‘themes’ from the Reuters terror news network [14], determined by a cut of its GeoDeg normalization, is presented.
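The Geo normalization followed by an edge-cut can be sketched in Python as follows. This is a minimal illustration for an undirected network stored as a symmetric dictionary of dictionaries; the names `geo_normalize` and `edge_cut` are hypothetical. A missing diagonal weight is replaced by the row sum, as defined above for networks without loops, and the final selection of components with at least k vertices is omitted.

```python
import math


def geo_normalize(w):
    """Geo normalization w_uv / sqrt(w_uu * w_vv) of a symmetric table w[u][v].

    Every vertex is assumed to appear as a key of w and to have at least
    one positive weight, so that no diagonal weight is zero.
    """
    diag = {}
    for v, row in w.items():
        # Use the loop weight if present, otherwise the row sum.
        diag[v] = row[v] if v in row else sum(row.values())
    return {
        (u, v): x / math.sqrt(diag[u] * diag[v])
        for u, row in w.items()
        for v, x in row.items()
        if u != v
    }


def edge_cut(weights, t):
    """Keep only the lines whose normalized weight is at least t."""
    return {line: x for line, x in weights.items() if x >= t}
```

On the path a – b – c with both weights equal to 2, the diagonal weights are 2, 4 and 2, so both edges get the normalized weight 2/√8 ≈ 0.707.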

3.9 Blockmodeling

Fig. 10. Orderings (left: World trade - alphabetic order; right: World Trade (Snyder and Kick, 1979) - cores)

In Figure 10 Snyder and Kick’s world trade network is presented by its
matrix: on the left side the units (states) are ordered alphabetically by
name; on the right side they are ordered on the basis of clustering
results. Evidently a ‘proper’ ordering can reveal structure in the
network. Such orderings can be produced in different ways [44]. On
networks of moderate size (up to some hundreds of units) we can also use
blockmodeling methods.
The goal of blockmodeling is to reduce a large, potentially incoherent
network to a smaller comprehensible structure that can be interpreted more
readily [6,3,7]. One of the main procedural goals of blockmodeling is to
identify, in a given network $N = (U, R)$, $R \subseteq U \times U$,
clusters (classes) of units/vertices that share structural characteristics
defined in terms of $R$. The units within a cluster have the same or
similar connection patterns to other units. They form a clustering
$C = \{C_1, C_2, \ldots, C_k\}$, which is a partition of the set $U$. Each
partition determines an equivalence relation (and vice versa).
A clustering $C$ also partitions the relation $R$ into blocks

$$ R(C_i, C_j) = R \cap (C_i \times C_j) $$

Each such block consists of units belonging to clusters $C_i$ and $C_j$ and
all arcs leading from cluster $C_i$ to cluster $C_j$. If $i = j$, the block
$R(C_i, C_i)$ is called a diagonal block.
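
The definition of a block can be made concrete with a short Python sketch.
The function name `blocks` and the representation of R as a set of arcs
(u, v) and of the clustering as a list of sets are assumptions made for
this illustration only.

```python
from itertools import product

def blocks(R, clustering):
    """Split relation R (a set of arcs (u, v)) into blocks R(Ci, Cj).

    Returns a dict mapping the index pair (i, j) to R ∩ (Ci × Cj).
    """
    return {
        (i, j): {(u, v) for (u, v) in R if u in Ci and v in Cj}
        for (i, Ci), (j, Cj) in product(enumerate(clustering), repeat=2)
    }

# A hypothetical 4-unit network and a clustering into two clusters.
R = {("a", "b"), ("b", "a"), ("a", "c"), ("c", "d")}
C = [{"a", "b"}, {"c", "d"}]
B = blocks(R, C)
```

Because C is a partition of U, every arc of R falls into exactly one block;
the blocks B[(0, 0)] and B[(1, 1)] are the diagonal blocks.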

Fig. 11. Blockmodeling

A blockmodel consists of structures obtained by identifying all units from
the same cluster of the clustering C. For an exact definition of a
blockmodel we also have to be precise about which blocks produce an arc in
the reduced graph and which do not, and of what type. Some types of
connections are presented in Figure 12. The reduced graph can be
represented by a relational matrix, also called the image matrix.
Also, by reordering the network matrix so that the units from each cluster
of the optimal clustering are located together, we obtain a matrix
representation of the network with visible structure.
How to determine an appropriate blockmodel? Blockmodeling can be
formulated as a clustering problem $(\Phi, P)$ as follows:

Determine the clustering $C^\star \in \Phi$ for which

$$ P(C^\star) = \min_{C \in \Phi} P(C) $$

Since the set of units U is finite, the set of feasible clusterings Φ is
also finite. Therefore the set Min(Φ, P) of all solutions of the problem
(optimal clusterings) is not empty. In theory, the set Min(Φ, P) can be
determined by complete search, but it turns out that most cases of the
clustering problem are NP-hard. Blockmodeling problems are therefore
usually solved using local optimization methods based on moving a unit
from one cluster to another or interchanging two units between two
clusters.
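
A relocation-and-interchange local search of this kind might be sketched in
Python as follows. The sketch is generic in the criterion function P. The
toy criterion used in the example (the number of edges running between
different clusters) is only a stand-in chosen to keep the example short; it
is not one of the blockmodeling criteria discussed in this section. All
names are hypothetical, the number of clusters is kept fixed, and clusters
are not allowed to become empty.

```python
from itertools import combinations

def neighbours(clusters):
    """Yield clusterings reachable by one relocation or one interchange."""
    k = len(clusters)
    # Relocation: move one unit from cluster a to cluster b.
    for a in range(k):
        if len(clusters[a]) < 2:
            continue  # do not empty a cluster
        for u in clusters[a]:
            for b in range(k):
                if b != a:
                    new = [set(c) for c in clusters]
                    new[a].remove(u)
                    new[b].add(u)
                    yield new
    # Interchange: swap one unit of cluster a with one unit of cluster b.
    for a, b in combinations(range(k), 2):
        for u in clusters[a]:
            for v in clusters[b]:
                new = [set(c) for c in clusters]
                new[a].remove(u)
                new[a].add(v)
                new[b].remove(v)
                new[b].add(u)
                yield new

def local_search(clusters, P):
    """Move to the best neighbour until no neighbour improves P."""
    current = [set(c) for c in clusters]
    best = P(current)
    while True:
        candidate = min(neighbours(current), key=P, default=None)
        if candidate is None or P(candidate) >= best:
            return current, best
        current, best = candidate, P(candidate)

# Two triangles joined by the edge 3-4 (hypothetical data).
edges = [(1, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 6), (3, 4)]

def cut_size(clusters):
    """Toy criterion: number of edges whose endpoints lie in different clusters."""
    where = {u: i for i, c in enumerate(clusters) for u in c}
    return sum(where[u] != where[v] for u, v in edges)

result, error = local_search([{1, 2, 4}, {3, 5, 6}], cut_size)
```

Such a search only guarantees a local minimum, so in practice it is usually
repeated from several starting clusterings and the best result is kept.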
One of the possible ways of constructing a criterion function that directly
reflects the considered equivalence is to measure the fit of a clustering
to an ideal one with perfect relations within each cluster and between
clusters according to the considered equivalence.

Fig. 12. Block Types

Given a clustering $C = \{C_1, C_2, \ldots, C_k\}$, let
$\mathcal{B}(C_u, C_v)$ denote the set of all ideal blocks corresponding to
block $R(C_u, C_v)$. Then the global error of clustering $C$ can be
expressed as

$$ P(C) = \sum_{C_u, C_v \in C} \; \min_{B \in \mathcal{B}(C_u, C_v)} d(R(C_u, C_v), B) $$

where the term $d(R(C_u, C_v), B)$ measures the difference (error) between
the block $R(C_u, C_v)$ and the ideal block $B$. The function $d$ is
constructed on the basis of characterizations of types of blocks and has
to be compatible with the selected type of equivalence. In determining the
block error, we also determine the type of the best fitting ideal block
(the types are ordered).
The criterion function $P(C)$ is sensitive iff $P(C) = 0 \Leftrightarrow C$
determines an exact blockmodeling. For all presented block types sensitive
criterion functions can be constructed. Once a clustering $C$ and types of
blocks are determined, we can also compute the values of connections by
using averaging rules.
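
As an illustration of such a criterion function, the following Python
sketch computes the global error under several simplifying assumptions that
are not taken from the text: only two ideal block types are allowed, null
(no arcs) and complete (all arcs), as is usual for structural equivalence;
d counts the cells in which a block differs from the ideal block; diagonal
cells (u, u) are ignored; and ties between the two types are resolved in
favour of the null block. The function names are hypothetical.

```python
def block_error(R, Ci, Cj):
    """Return (error, type) of block R(Ci, Cj) against its best ideal block."""
    cells = [(u, v) for u in Ci for v in Cj if u != v]
    ones = sum((u, v) in R for (u, v) in cells)
    zeros = len(cells) - ones
    # Distance to the null block = number of arcs present;
    # distance to the complete block = number of arcs missing.
    return (ones, "null") if ones <= zeros else (zeros, "complete")

def global_error(R, clustering):
    """P(C): sum over all pairs of clusters of the best-fitting block error."""
    return sum(block_error(R, Ci, Cj)[0]
               for Ci in clustering for Cj in clustering)

# Hypothetical network: a and b are tied to each other and send arcs to c, d.
R = {("a", "b"), ("b", "a"), ("a", "c"), ("b", "c"), ("a", "d")}
C = [{"a", "b"}, {"c", "d"}]
P = global_error(R, C)
```

In this example the only deviation is the missing arc (b, d) in the
otherwise complete block from the first cluster to the second, so the
global error is 1. With these choices P(C) = 0 exactly when every block is
null or complete.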
In Figure 13 a symmetric acyclic blockmodel (edge-connected inside
clusters, acyclic reduced graph) [27] of the Student Government at the
University of Ljubljana [35] is presented. The obtained clustering into 4
clusters is almost exact. The only error is produced by the arc (a3, m5).

Fig. 13. A Symmetric Acyclic Blockmodel of Student Government

4 Implementation
4.1 Data structures
In Pajek, analysis and visualization are performed using six data types:
• network (graph),
• partition (nominal or ordinal properties of vertices),
• vector (numerical properties of vertices),
• cluster (subset of vertices),
• permutation (reordering of vertices, ordinal properties), and
• hierarchy (general tree structure on vertices).
In the near future we intend to extend this list with support for multiple
networks and partitions of edges.
The power of Pajek is based on several transformations that support
different transitions among these data structures. The menu structure of
Pajek’s main window (see Figure 14) is also based on them. The main window
uses a ‘calculator’ paradigm with a list-accumulator for each data type.
Operations are performed on the currently active (selected) data and
return their results through the accumulators.
The values of vectors can be used to determine several elements of the
network display, such as the X, Y, Z coordinates and the size of the vertex
shape. A partition can be represented graphically by the color and shape
of vertices. The values of edges can likewise be represented by line
thickness and/or color.
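
The ‘calculator’ paradigm can be pictured with a small Python sketch. It
is purely illustrative: Pajek itself is written in Delphi and its internal
data structures are not described here, so the class names `Network` and
`Workspace`, the field names and the two operations are all hypothetical,
and only three of the six data types are shown.

```python
from dataclasses import dataclass, field

@dataclass
class Network:
    n: int                                     # vertices are numbered 0..n-1
    arcs: list = field(default_factory=list)   # (u, v, value) triples

@dataclass
class Workspace:
    """One list-accumulator per data type; the last item is the active one."""
    networks: list = field(default_factory=list)
    partitions: list = field(default_factory=list)  # vertex -> class number
    vectors: list = field(default_factory=list)     # vertex -> numerical value

    def degree_partition(self):
        """Network -> partition: the class of a vertex is its degree."""
        net = self.networks[-1]
        degree = [0] * net.n
        for u, v, _ in net.arcs:
            degree[u] += 1
            degree[v] += 1
        self.partitions.append(degree)
        return degree

    def partition_to_vector(self):
        """Partition -> vector: reuse the class numbers as numerical values."""
        vector = [float(c) for c in self.partitions[-1]]
        self.vectors.append(vector)
        return vector

ws = Workspace()
ws.networks.append(Network(4, [(0, 1, 1.0), (0, 2, 1.0), (0, 3, 1.0)]))
ws.degree_partition()      # result lands in the partition accumulator
ws.partition_to_vector()   # result lands in the vector accumulator
```

Each operation reads the active object of one type and appends its result
to the accumulator of another, which is how transitions among the data
types chain into longer analyses.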

Fig. 14. Pajek’s Main Window

4.2 Implemented algorithms

In Pajek, besides the algorithms described in Section 3, several known
efficient algorithms are implemented, such as:
• simplifications and transformations: deleting loops, multiple edges, trans-
forming arcs to edges etc.;
• components: strong, weak, biconnected, symmetric;
• decompositions: symmetric-acyclic, hierarchical clustering;
• paths: shortest path(s), all paths between two vertices;
• flows: maximum flow between two vertices;
• neighborhood : k-neighbours;
• CPM – critical paths;
• social networks algorithms: centrality measures, hubs and authorities,
measures of prestige, brokerage roles, structural holes, diffusion parti-
tions;
• measures of dependencies among partitions / vectors: Cramer’s V, Spear-
man rank correlation coefficient, Pearson correlation coefficient, Rajski
coefficient;
• extracting subnetworks;
• shrinking clusters in a network (generalized blockmodeling);
• reordering: topological ordering, Richards’s numbering, Murtagh’s
seriation and clumping algorithms, depth/breadth first search.
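
To give a flavour of one item in this list, the k-neighbours of a vertex
can be found by a breadth-first search that stops at distance k. The
following Python sketch shows the general technique only, not Pajek's code;
the function name and the adjacency-dictionary representation are
assumptions.

```python
from collections import deque

def k_neighbours(adj, source, k):
    """Return {vertex: distance} for all vertices within distance k of source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == k:
            continue  # do not expand beyond distance k
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# Hypothetical directed network given by its successor lists.
adj = {1: [2, 3], 2: [4], 3: [4], 4: [5], 5: []}
near = k_neighbours(adj, 1, 2)
```

Each vertex and arc is visited at most once, so the running time is linear
in the size of the explored part of the network.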
Pajek also contains some data analysis procedures which have higher order
time complexities and can therefore be used only on smaller networks or on
selected parts of large networks: hierarchical clustering, generalized
blockmodeling, partitioning signed graphs [26], TSP (Traveling Salesman
Problem), computing geodesics matrices, etc.
The procedures are available through the main window menus. Frequently
used sequences of operations can be defined as macros. This also allows
Pajek to be adapted for specific tasks to groups of users from different
areas (social networks, chemistry, genealogy, computer science,
mathematics, ...).

4.3 Layout Algorithms and Layout Features

Special emphasis is given in Pajek to automatic generation of network
layouts. Several standard algorithms for automatic graph drawing are
implemented: spring embedders (Kamada-Kawai and Fruchterman-Reingold),
layouts determined by eigenvectors (Lanczos algorithm), drawing in layers
(genealogies and other acyclic structures), fish-eye views and block
(matrix) representation.
These algorithms were modified and extended to enable additional options:
drawing with constraints (optimization of the selected part of the
network, fixing some vertices to predefined positions, using values of
edges as similarities or dissimilarities) and drawing in 3D space. Pajek
also provides tools for manual editing of graph layout.
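
The idea of a spring embedder with fixed vertices can be sketched in
Python. This is a simplified variant of the Fruchterman-Reingold scheme
written for illustration, not Pajek's implementation: the function
signature, the `fixed` dictionary, the unit frame, the geometric cooling
schedule and all parameter defaults are assumptions.

```python
import math
import random

def fruchterman_reingold(n, edges, fixed=None, iterations=100, size=1.0, seed=0):
    """Return [x, y] positions for vertices 0..n-1 inside a size x size frame."""
    fixed = fixed or {}                      # vertex -> (x, y), never moved
    rng = random.Random(seed)
    pos = [list(fixed[v]) if v in fixed
           else [rng.uniform(0, size), rng.uniform(0, size)]
           for v in range(n)]
    k = size / math.sqrt(n)                  # ideal distance between vertices
    temp = size / 10                         # largest allowed move per step
    for _ in range(iterations):
        disp = [[0.0, 0.0] for _ in range(n)]

        def add_force(u, v, force):
            """Add a force of magnitude force(d) pushing u away from v."""
            dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
            d = math.hypot(dx, dy) or 1e-9
            f = force(d)
            disp[u][0] += dx / d * f
            disp[u][1] += dy / d * f
            disp[v][0] -= dx / d * f
            disp[v][1] -= dy / d * f

        for u in range(n):                   # repulsion between all pairs
            for v in range(u + 1, n):
                add_force(u, v, lambda d: k * k / d)
        for u, v in edges:                   # attraction along edges
            add_force(u, v, lambda d: -d * d / k)
        for v in range(n):
            if v in fixed:
                continue                     # constraint: keep predefined position
            d = math.hypot(*disp[v]) or 1e-9
            step = min(d, temp)
            for axis in (0, 1):
                moved = pos[v][axis] + disp[v][axis] / d * step
                pos[v][axis] = min(size, max(0.0, moved))
        temp *= 0.95                         # geometric cooling (an assumption)
    return pos

# A 4-cycle with vertex 0 pinned to the centre of the frame.
layout = fruchterman_reingold(4, [(0, 1), (1, 2), (2, 3), (3, 0)],
                              fixed={0: (0.5, 0.5)})
```

The all-pairs repulsion makes each iteration quadratic in the number of
vertices, which is one reason such layouts are typically applied to
smaller networks or to selected parts of large ones.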
Properties of vertices/edges (given as data or computed) can be repre-
sented using colors, sizes and/or shapes of vertices/edges.
Pajek also supports drawing sequences of networks in its Draw window, and
exports sequences of networks in suitable formats that can be examined
with special 2D or 3D viewers (e.g., SVG and Mage). Pictures in SVG can
be further controlled using support written in JavaScript.

4.4 Interfaces

Pajek also supports some non-native input formats: UCINET DL files [53];
Vega graph files [54]; chemical MDLMOL [41] and BS; and genealogical
GEDCOM [30].
The layouts can be exported in the following output graphic formats that
can be examined by special 2D and 3D viewers: Encapsulated PostScript
(EPS) [31], Scalable Vector Graphics (SVG) [1], VRML [24], MDLMOL/
chime [41], and Kinemages (Mage) [49].
The main window menu Tools provides export of Pajek’s data to the
statistical program R [48,21]. In the Tools menu, the user can also prepare
calls to her/his favorite viewers and other tools. It is also possible to
run Pajek (+macros) from other programs (R, Ucinet, and others).
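
A program that hands data to Pajek has to write a file Pajek can read. The
following Python sketch writes a small network in what we take to be the
basic layout of Pajek's native .net text format (a `*Vertices` section
followed by `*Arcs` and/or `*Edges` sections with 1-based vertex numbers).
The format is not described in this section, so the details should be
treated as an assumption and checked against the Pajek manual; the function
name is hypothetical.

```python
def to_pajek_net(labels, arcs=(), edges=()):
    """Return the text of a minimal Pajek .net file.

    labels: vertex labels; vertex i (1-based) gets labels[i - 1].
    arcs:   directed lines as (u, v, value) triples.
    edges:  undirected lines as (u, v, value) triples.
    """
    lines = [f"*Vertices {len(labels)}"]
    lines += [f'{i} "{label}"' for i, label in enumerate(labels, start=1)]
    if arcs:
        lines.append("*Arcs")
        lines += [f"{u} {v} {w}" for u, v, w in arcs]
    if edges:
        lines.append("*Edges")
        lines += [f"{u} {v} {w}" for u, v, w in edges]
    return "\n".join(lines) + "\n"

text = to_pajek_net(["a", "b", "c"], arcs=[(1, 2, 1)], edges=[(2, 3, 2)])
```

The resulting string can be saved to a file with the extension .net and
opened from Pajek's main window.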

5 Examples

Several examples of applications of Pajek were already presented as
illustrations while describing selected algorithms.
In Figure 15 a 3D layout of a graph obtained using eigenvectors is
presented.
In Figure 16 a snapshot of the 3D layout, displayed in a VRML viewer, of
our drawing of graph A from the Graph Drawing Contest 1997 [33] is
presented.

Fig. 15. 3D layout obtained using eigenvectors

6 Software

6.1 Architecture

Pajek is implemented in Delphi and runs on Windows operating systems.
Our to-do list includes support for the GraphML format, implementing
Pajek on Unix, and replacing macros by a JavaScript(?) based network
scripting language.

6.2 Availability

Pajek is still under development. The latest version is freely available,
for noncommercial use, at its home page:
http://vlado.fmf.uni-lj.si/pub/networks/pajek/

Fig. 16. GD’97 contest graph A in VRML

References

1. Adobe SVG Viewer (2002) http://www.adobe.com/svg/viewer/install/
2. Batagelj V. (1986) Graph – data structure and algorithms in pascal. Research
report.
3. Batagelj, V. (1997) Notes on blockmodeling. Social Networks 19, 143-155.
4. Batagelj V. (2002) Efficient Algorithms for Citation Network Analysis
5. Batagelj V., Brandes U. (2002) Fast generation of large sparse random graphs.
in preparation.
6. Batagelj, V., Doreian, P., and Ferligoj, A. (1992) An Optimizational Approach
to Regular Equivalence. Social Networks 14, 121-135.
7. Batagelj V., Ferligoj A. (2000) Clustering relational data. Data Analysis (ed.:
W. Gaul, O. Opitz, M. Schader), Springer, Berlin, 3-15.
8. Batagelj V., Mrvar A. (1995) Towards NetML Networks Markup Language.
Presented at International Social Network Conference, London, July 6-10, 1995.
http://www.ijp.si/ftp/pub/preprints/ps/95/trp9515.ps
9. Batagelj V., Mrvar A. (1991-94) Programs for Network Analysis.
http://vlado.fmf.uni-lj.si/pub/networks/
10. Batagelj V., Mrvar A. (1998) Pajek – A Program for Large Network Analysis.
Connections, 21 (2), 47-57
11. Batagelj V., Mrvar A. (2000) Some Analyses of Erdős Collaboration Graph.
Social Networks, 22, 173-186
12. Batagelj V., Mrvar A. (2001) A Subquadratic Triad Census Algorithm for Large
Sparse Networks with Small Maximum Degree. Social Networks, 23, 237-243
13. Batagelj V., Mrvar A. (2002) Pajek - Analysis and Visualization of Large Net-
works. In: Mutzel P., Jünger M., Leipert S. (Eds.) GD’01, Vienna, Austria.
September 23-26, 2001 LNCS 2265. Springer-Verlag, 477-478.
14. Batagelj V., Mrvar A. (2002) Density based approaches to Reuters terror news
network analysis. submitted.
15. Batagelj V., Pisanski T. (1989) Xgraph project documentation.
16. Batagelj V., Mrvar A., Zaveršnik M. (1999) Partitioning Approach to Visual-
ization of Large Graphs. In: Kratochvil J. (Ed.) GD’99, Štiřin Castle, Czech
Republic. LNCS 1731. Springer-Verlag, 90-97.
17. Batagelj V., Mrvar A., Zaveršnik M. (2002) Network analysis of texts. Language
Technologies, Ljubljana, p. 143-148.
18. Batagelj V., Zaveršnik M. (2001) An O(m) Algorithm for Cores Decomposition
of Networks. Submitted.
19. Batagelj V., Zaveršnik M. (2002) Generalized Cores. Submitted.
http://arxiv.org/abs/cs.DS/0202039
20. Batagelj, V. and Zaveršnik, M. (2002) Triangular connectivity and its general-
izations, in preparation.
21. Butts, C.T. (2002) sna: Tools for Social Network Analysis.
http://cran.at.r-project.org/src/contrib/PACKAGES.html#sna
22. Caida: Internet Visualization Tool Taxonomy.
http://www.caida.org/tools/taxonomy/visualization/
23. Cormen T.H., Leiserson C.E., Rivest R.L., Stein C. (2001) Introduction to
Algorithms, Second Edition. MIT Press.
24. Cosmo Player (2002) http://ca.com/cosmo/
25. de Nooy W., Mrvar A., Batagelj V. (2002) Exploratory Social Network Analysis
With Pajek. to be published by the Cambridge University Press.
26. Doreian P., Mrvar A. (1996) A Partitioning Approach to Structural Balance.
Social Networks, 18. 149-168
27. Doreian, P., Batagelj, V., Ferligoj, A. (2000) Symmetric-acyclic decompositions
of networks. J. classif., 17(1), 3-28.
28. Dremelj P., Mrvar A., Batagelj V. (2002) Analiza rodoslova dubrovačkog
vlasteoskog kruga pomoću programa Pajek. Anali Dubrovnik XL, HAZU, Zagreb,
Dubrovnik, 105-126 (in Croatian).
29. Garfield E., Sher I.H., Torpie R.J. (1964) The Use of Citation Data in
Writing the History of Science. Philadelphia: The Institute for Scientific
Information, December 1964.
http://www.garfield.library.upenn.edu/papers/useofcitdatawritinghistofsci.pdf
30. GEDCOM 5.5.
http://homepages.rootsweb.com/~pmcbride/gedcom/55gctoc.htm
31. Ghostscript, Ghostview and GSview http://www.cs.wisc.edu/~ghost/
32. Gibbons A. (1985) Algorithmic Graph Theory. Cambridge University Press.
33. Graph Drawing Contest 1997. http://vlado.fmf.uni-lj.si/pub/gd/gd97.htm
34. Grossman J. (2002) The Erdős Number Project.
http://www.oakland.edu/~grossman/erdoshp.html
35. Hlebec, V. (1993) Recall versus recognition: Comparison of two alternative
procedures for collecting social network data. Metodološki zvezki 9, Ljubljana:
FDV, 121-128.
36. Hummon, N.P. & Doreian, P. (1989) Connectivity in a citation network: The
development of DNA theory. Social Networks, 11, 39–63.
37. Jones B. (2002) Computational geometry database.
http://compgeom.cs.uiuc.edu/~jeffe/compgeom/biblios.html
38. Kleinberg J. (1998) Authoritative sources in a hyperlinked environment. In
Proc 9th ACM-SIAM Symposium on Discrete Algorithms, p. 668-677.
http://www.cs.cornell.edu/home/kleinber/auth.ps
http://citeseer.nj.nec.com/kleinberg97authoritative.html
39. Knuth, D. E. (1993) The Stanford GraphBase. Stanford University, ACM Press,
New York. ftp://labrea.stanford.edu/pub/sgb/
40. Mahnken, I. (1960) Dubrovački patricijat u XIV veku. Beograd, Naučno delo.
41. MDL Information Systems, Inc. (2002) http://www.mdli.com/
42. James Moody home page (2002) http://www.soc.sbs.ohio-state.edu/jwm/
43. Mrvar A., Batagelj V. (2000) Relational Calculator - a tool for analyzing social
networks. Metodološki zvezki 16, FDV, Ljubljana, 63-76.
44. Murtagh, F. (1985) Multidimensional Clustering Algorithms, Compstat lec-
tures, 4, Vienna: Physica-Verlag.
45. ODLIS (2002) Online dictionary of library and information science.
http://vax.wcsu.edu/library/odlis.html
46. Pajek’s datasets. http://vlado.fmf.uni-lj.si/pub/networks/data/
47. Pennock D.M. et al. (2002) Winners don’t take all, PNAS, 99/8, 5207-5211.
48. The R Project for Statistical Computing. http://www.r-project.org/
49. Richardson D.C., Richardson J.S. (2002) The Mage Page.
http://kinemage.biochem.duke.edu/index.html
50. Scott, J. (2000) Social Network Analysis: A Handbook, 2nd edition. London:
Sage Publications.
51. Seidman S. B. (1983) Network structure and minimum degree, Social Networks,
5, 269–287.
52. Tarjan, R. E. (1983) Data Structures and Network Algorithms. Society for
Industrial and Applied Mathematics Philadelphia, Pennsylvania.
53. UCINET (2002) http://www.analytictech.com/
54. Project Vega (2002) http://vega.ijp.si/
55. Wasserman S., Faust K. (1994) Social Network Analysis: Methods and Appli-
cations. Cambridge University Press, Cambridge.
56. Wasserman, S., and Pattison, P. (1996) Logit models and logistic regressions for
social networks: I. An introduction to Markov graphs and p*. Psychometrika,
60, 401-426. http://kentucky.psych.uiuc.edu/pstar/index.html
57. White D.R., Batagelj V., Mrvar A. (1999) Analyzing Large Kinship and Mar-
riage Networks with Pgraph and Pajek. Social Science Computer Review, 17
(3), 245-274
58. Wilson, R.J., Watkins, J.J. (1990) Graphs: An Introductory Approach. New
York: John Wiley and Sons.
59. Ho Y. et al. (2002) Systematic identification of protein complexes in
Saccharomyces cerevisiae by mass spectrometry. Nature, vol 415, 180-183.
http://www.mshri.on.ca/tyers/pdfs/proteome.pdf
60. Zaveršnik M., Batagelj V., Mrvar A. (2002) Analysis and visualization of 2-
mode networks. Proceedings of Sixth Austrian, Hungarian, Italian and Slove-
nian Meeting of Young Statisticians, October 5-7, 2001, Ossiach, Austria. Uni-
versity of Klagenfurt, p. 113-123.
