Social Network Analysis Report
Seminar Report Submitted in Partial Fulfilment of the Requirements for the Degree of
Bachelor of Engineering
in
Computer Science and Engineering
Submitted by
Ratnesh shah: (Roll No. 16cse33031)
CERTIFICATE
This is to certify that the work contained in this seminar report entitled “Social Network
Analysis” is submitted by Mr. Ratnesh shah (Roll. No: 16cse33031) to the Department of
Computer Science & Engineering, M.B.M. Engineering College, Jodhpur, for the partial
fulfilment of the requirements for the degree of Bachelor of Engineering in Computer
Science and Engineering.
He has carried out his work under my supervision. This work has not been submitted
elsewhere for the award of any other degree or diploma.
The seminar report, in my opinion, has reached the standard required for the degree of
Bachelor of Engineering in Computer Science and Engineering, in accordance with the
regulations of the Institute.
DECLARATION
I, Ratnesh shah, hereby declare that this seminar report titled “Social Network Analysis” is
a record of original work done by me under the supervision and guidance of Mr. Abhisek
Gour.
I further certify that this work has not formed the basis for the award of any
Degree/Diploma/Associateship/Fellowship or similar recognition to any candidate of any
university, and that no part of this report is reproduced as it is from any other source without
appropriate reference and permission.
SIGNATURE OF STUDENT
(Ratnesh shah)
VIIth Semester, CSE
Enroll. – 15R/0005584
Roll No. – 16cse33031
ACKNOWLEDGEMENT
I, Ratnesh shah, take immense pleasure in thanking Dr. Anil Gupta, Head of the Department of
Computer Science and Engineering, MBM Engineering College, Jodhpur, for permitting me to
carry out this seminar work and for the support and facilities made available.
I wish to express my deep sense of gratitude to Mr. Abhisek Gour, my mentor, for his able
guidance and useful suggestions, which helped me complete the seminar work in time.
His guidance, encouragement, suggestions and very constructive criticism have contributed
immensely to the evolution of my report. I have benefited greatly from this work and gained
a lot of knowledge.
Finally, yet most importantly, I would like to express my heartfelt thanks to my beloved
parents and family for their blessings, wishes and support for the successful completion of
this work.
ABSTRACT
A social network is defined as a social structure of individuals who are related to each other,
directly or indirectly, through a common relation of interest, e.g. friendship, trust, etc.
Social network analysis is the study of social networks to understand their structure and
behavior. Social network analysis has gained prominence due to its use in different
applications - from product marketing (e.g. viral marketing) to search engines and
organizational dynamics (e.g. management). Recently there has been a rapid increase in
interest regarding social network analysis in the data community. This report will provide an
up-to-date introduction to the increasingly important field of data in social network analysis,
and a brief overview of research directions in this field. We first provide an introduction to
social network analysis and then briefly survey the research in this field. Next, an overview
of network properties and link analysis for social network analysis is presented. Finally, we
present social network applications and future directions.
Contents
DECLARATION
ACKNOWLEDGEMENT
ABSTRACT
Chapter 1
INTRODUCTION
Chapter 2
Chapter 3
Chapter 4
Chapter 5
CONCLUSION
REFERENCES
List of Figures
Figure 1.9 : Sociomatrix of Directed Network of Friendship Ties Between Managers
Figure 3.14 : Degree Centrality for Zachary’s Karate Club Network
Figure 3.15 : Betweenness Centrality for Zachary’s Karate Club Network
Figure 3.16 : Closeness Centrality for Zachary’s Karate Club Network
Figure 3.17 : Eigenvector Centrality for Zachary’s Karate Club Network
Figure 3.22 : Structural Equivalence
Chapter 1
INTRODUCTION
Social network analysis (SNA) is the process of investigating social structures through the
use of networks and graph theory. It characterizes networked structures in terms
of nodes (individual actors, people, or things within the network) and the ties, edges,
or links (relationships or interactions) that connect them. Examples of social
structures commonly visualized through social network analysis include social media
networks, the spread of memes, information circulation, friendship and acquaintance networks,
business networks, collaboration graphs, kinship, disease transmission,
and sexual relationships. These networks are often visualized through sociograms in which
nodes are represented as points and ties are represented as lines. These visualizations provide
a means of qualitatively assessing networks by varying the visual representation of their
nodes and edges to reflect attributes of interest.
Social network analysis has emerged as a key technique in modern sociology. It has also
gained a significant following in anthropology, biology, demography, communication
studies, economics, geography, history, information and social science, organizational
studies, political science, social psychology, development studies, sociolinguistics,
and computer science and is now commonly available as a consumer tool.
Social networking sites (e.g., MySpace and Facebook) are popular online communication
forms among adolescents and emerging adults. Yet little is known about young people's
activities on these sites and how their networks of “friends” relate to their other online (e.g.,
instant messaging) and offline networks. In this study, college students responded, in person
and online, to questions about their online activities and closest friends in three contexts:
social networking sites, instant messaging, and face-to-face. Results showed that participants
often used the Internet, especially social networking sites, to connect and reconnect with
friends and family members. Hence, there was overlap between participants' online and
offline networks. However, the overlap was imperfect; the pattern suggested that emerging
adults may use different online contexts to strengthen different aspects of their offline
connections. Information from this survey is relevant to concerns about young people's life
online.
1.2 Metrics
1.2.1. Connections
Homophily: The extent to which actors form ties with similar versus dissimilar others.
Similarity can be defined by gender, race, age, occupation, educational achievement, status,
values or any other salient characteristic. Homophily is also referred to as assortativity.
Multiplexity: The number of content-forms contained in a tie. For example, two people who
are friends and also work together would have a multiplexity of 2. Multiplexity has been
associated with relationship strength.
Mutuality/Reciprocity: The extent to which two actors reciprocate each other's friendship or
other interaction.
1.2.2. Distributions
Bridge: An individual whose weak ties fill a structural hole, providing the only link between
two individuals or clusters. It also includes the shortest route when a longer one is unfeasible
due to a high risk of message distortion or delivery failure.
Centrality: Centrality refers to a group of metrics that aim to quantify the "importance" or
"influence" (in a variety of senses) of a particular node (or group) within a
network. Examples of common methods of measuring "centrality" include betweenness
centrality, closeness centrality, eigenvector centrality, alpha centrality, and degree centrality.
Density: The proportion of direct ties in a network relative to the total number possible.
Distance: The minimum number of ties required to connect two particular actors, as
popularized by Stanley Milgram's small world experiment and the idea of 'six degrees of
separation'.
Structural holes: The absence of ties between two parts of a network. Finding and exploiting
a structural hole can give an entrepreneur a competitive advantage. This concept was
developed by sociologist Ronald Burt, and is sometimes referred to as an alternate conception
of social capital.
Tie Strength: Defined by the linear combination of time, emotional intensity, intimacy and
reciprocity (i.e. mutuality). Strong ties are associated with homophily, propinquity and
transitivity, while weak ties are associated with bridges.
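Of the measures defined above, density is the most direct to compute. The following is a minimal sketch in Python, assuming ties are given as pairs of actor labels; the function name `density` and the four-actor example network are hypothetical and not taken from the report.

```python
def density(n_actors, ties, directed=False):
    """Proportion of possible ties that are actually present."""
    if n_actors < 2:
        return 0.0
    possible = n_actors * (n_actors - 1)
    if directed:
        present = set(ties)
    else:
        # An undirected tie (a, b) is the same tie as (b, a).
        possible //= 2
        present = {frozenset(tie) for tie in ties}
    return len(present) / possible


# Hypothetical undirected friendship network with four actors.
example_ties = [("A", "B"), ("B", "C"), ("C", "D")]
print(density(4, example_ties))
```

With four actors there are six possible undirected ties, so three observed ties give a density of one half.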
1.2.3. Segmentation
Groups are identified as 'cliques' if every individual is directly tied to every other individual,
'social circles' if there is less stringency of direct contact, which is imprecise, or
as structurally cohesive blocks if precision is wanted.
Actor:
An actor, also called a node or a vertex, refers to an individual that can have relationships with
other individuals; in this case, it is the individual or group of individuals we are choosing to
study.
Tie:
A tie, also called a relation or edge, describes a particular, well-specified relationship between
two actors. This could refer to a relationship like “went to the same school” or “likes potato
chips”, or something like “likes” or “trades with”. Ties can be undirected (like “went to the
same school”), when the relationship means the same thing to both actors.
Ties can also be directed (such as “looks up to”) and either one-directional or bidirectional.
Network:
A network, also called a graph, particularly in the physics and CS literature, refers to a
collection of actors and the ties between them. Figure 1.2 depicts a set of undirected
friendship relationships between members of a Karate club.
Multiplex networks:
Multiplex networks are networks where more than one kind of tie is present. For example, if
we were to collect information about several different kinds of relationships between bank
managers (goes to for advice, is friends with, works for, etc.) we essentially end up with a
network containing multiple tie types between actors.
Weighted Ties:
Just as networks can contain multiple different kinds of edges between actors,
they can also contain relationships of varying strength. For example, A might like B a whole
lot, but B and C only like each other moderately.
Group:
A group in a network is just a subset of the actors which share some characteristic in
common. If we were to look at an organizational network, one group could be made up of all
actors that work in the human resources department. The definition of groups as commonality
on some salient trait allows us to examine a number of network hypotheses and define
useful measures that are conditional on knowing the group membership of actors. For
example we might want to test a hypothesis about the number of friendship ties between
workers at a company who are part of different departments versus those in the same
departments.
Geodesic distance:
Geodesic distance is defined as the least number of connections (ties) that must be traversed to
get between any two nodes. For example, in the network depicted below, the geodesic
distance between actor A and actor D is 3, while the distance between actors B and C is only
1.
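Geodesic distance in an unweighted network can be found with a breadth-first search. The sketch below assumes the network is stored as a dictionary mapping each actor to a list of neighbours; the chain A - B - C - D is a hypothetical stand-in chosen only because it reproduces the distances quoted above, and is not the network from the report's figure.

```python
from collections import deque


def geodesic_distance(adjacency, source, target):
    """Least number of ties between source and target (None if unreachable)."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


# Hypothetical undirected chain A - B - C - D.
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
print(geodesic_distance(chain, "A", "D"))  # 3
print(geodesic_distance(chain, "B", "C"))  # 1
```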
Social Network Data:
There are two main kinds of social network data: edge lists and
sociomatrices. Each of these data formats has its own advantages and weaknesses, mainly
having to do with a trade-off between ease of entering and storing the data and ease of using
the data for analysis.
Each row in the sociomatrix represents the ties that actor i sends to all other actors (the j's). As
the sociomatrix figure shows, manager one sends a directed friendship tie to manager two, as indicated
by the value 1 in the [1,2] entry of the sociomatrix. The upside of this approach to
storing data about a network is that it naturally encodes the fact that some actors may not send
or receive any ties (something we call being a network isolate), and the format is ready
for many statistical analyses. The downside to using this data format is that it can take up a
lot of space and be difficult to enter data into by hand.
An edgelist is the other primary form of data storage for social network analysis. It only
captures information about existing ties, so it needs to be supplemented with knowledge of the
total number of actors in the network (even if they do not have any ties). In the example
edgelist, the directed friendship ties for the network shown in the corresponding figure are presented in
edgelist form, where the first number on each line denotes the actor sending a tie to the
second actor in the row.
This form of data entry is best for storing information about data that are collected by hand as
it is very efficient to store and relatively easy to enter, but one must be careful to use a
common naming system and keep track of any nodes that do not have any ties to them.
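The relationship between the two formats can be sketched in a few lines of Python. The example assumes actors are numbered 1 to n and that the total number of actors is supplied separately, as the text requires; the function names and the four-manager edgelist are hypothetical illustrations rather than the report's data.

```python
def edgelist_to_sociomatrix(n_actors, edgelist):
    """Build an n x n 0/1 sociomatrix; actors are numbered 1..n_actors."""
    matrix = [[0] * n_actors for _ in range(n_actors)]
    for sender, receiver in edgelist:
        # Row = sender, column = receiver, as in the sociomatrix description.
        matrix[sender - 1][receiver - 1] = 1
    return matrix


def isolates(matrix):
    """Actors (1-based) that neither send nor receive any tie."""
    n = len(matrix)
    return [
        i + 1
        for i in range(n)
        if not any(matrix[i]) and not any(matrix[j][i] for j in range(n))
    ]


# Hypothetical directed friendship ties among four managers; manager 4 has none.
edges = [(1, 2), (2, 1), (3, 1)]
m = edgelist_to_sociomatrix(4, edges)
for row in m:
    print(row)
print("isolates:", isolates(m))
```

The isolate only appears because the actor count was passed in explicitly; the edgelist alone carries no trace of manager 4.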
There are several key terms associated with social network analysis research in computer-
supported collaborative learning such as: density, centrality, indegree, outdegree,
and sociogram.
Density :
Centrality :
Chapter 2
SOCIAL NETWORK THEORIES
Social network analysis has its theoretical roots in the work of early sociologists such
as Georg Simmel and Émile Durkheim, who wrote about the importance of studying patterns
of relationships that connect social actors. Social scientists have used the concept of "social
networks" since early in the 20th century to connote complex sets of relationships between
members of social systems at all scales, from interpersonal to international. In the
1930s Jacob Moreno and Helen Jennings introduced basic analytical methods. In 1954, John
Arundel Barnes started using the term systematically to denote patterns of ties, encompassing
concepts traditionally used by the public and those used by social scientists:
bounded groups (e.g., tribes, families) and social categories (e.g., gender, ethnicity). Scholars
such as Ronald Burt, Kathleen Carley, Mark Granovetter, David Krackhardt, Edward
Laumann, Anatol Rapoport, Barry Wellman, Douglas R. White, and Harrison
White expanded the use of systematic social network analysis. Even in the study of literature,
network analysis has been applied by Anheier, Gerhards and Romo, Wouter De Nooy, and
Burgert Senekal. Indeed, social network analysis has found applications in various academic
disciplines, as well as practical applications such as countering money
laundering and terrorism.
One of the important observations made by social scientists is a tendency in social groups for
similar people to be connected together (after all ‘birds of a feather flock together’). It has a
significant impact on the value we get from social media (as often we hear similar voices and
interact with like-minded people).
This phenomenon is called homophily (meaning love of the same) (McPherson et al. 2001).
Homophily can be directly observed in virtual worlds using analytical techniques. For
example, Huang et al. (2009) showed that in the massively multiplayer online role-playing game
EverQuest, players tended to interact with other players of similar age and experience who
lived near them in the real world. This held across all sorts of interactions, from questing
together to trading in the in-game auction house. In fact, the only dimension in which they looked for
homophily and did not find it was gender, something that they put down to the fact that
32% of people play the game with a romantic partner.
Homophily has predictive power in social media, so much so that researchers looking at
last.fm could predict real-life friendships by examining on-line interaction, shared interests
and location (Bischoff, 2012).
On Twitter, De Choudhury (2011) has shown that different types of homophily hold for
different types of users (for example, normal users with roughly the same number of
followers and followed have location and sentiment homophily - i.e. they tend to live and
work near each other, and show similar reactions and views).
Homophily is a good example of where an existing social theory can now be explored
numerically, and be easily verified in a wide variety of different networks, because the data is
held digitally.
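One simple way to explore homophily numerically is to measure the share of ties that join actors with the same attribute value. The sketch below is only a crude indicator under stated assumptions: the function name `same_group_share`, the actor names and the "region" attribute are all hypothetical, and a proper test would compare the share with what random mixing would produce.

```python
def same_group_share(ties, attribute):
    """Fraction of ties whose two endpoints share the same attribute value."""
    if not ties:
        return 0.0
    same = sum(1 for a, b in ties if attribute[a] == attribute[b])
    return same / len(ties)


# Hypothetical actors labelled with a hypothetical "region" attribute.
region = {"ann": "north", "bob": "north", "cid": "south", "dee": "south"}
ties = [("ann", "bob"), ("bob", "cid"), ("cid", "dee"), ("ann", "dee")]
print(same_group_share(ties, region))
```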
Six degrees of separation is the theory that any person on the planet can be connected to any
other person on the planet through a chain of acquaintances that has no more than five
intermediaries. The concept of six degrees of separation is often represented by a graph
database, a type of NoSQL database that uses graph theory to store, map and query
relationships. Real-world applications of the theory include power grid mapping and analysis,
disease transmission mapping and analysis, computer circuitry design and search engine
ranking.
The six degrees of separation theory was first proposed in 1929 by the Hungarian writer
Frigyes Karinthy in a short story called "Chains." In the 1950s, Ithiel de Sola Pool (MIT) and
Manfred Kochen (IBM) set out to prove the theory mathematically. Although they were able
to phrase the question mathematically (given a set N of people, what is the probability that
each member of N is connected to another member via k_1, k_2, k_3...k_n links?), after
twenty years they were still unable to solve the problem to their satisfaction.
In 1967, American sociologist Stanley Milgram devised a new way to test the theory, which
he called "the small-world problem." Milgram randomly selected people in the Midwest to
send packages to a stranger located in Massachusetts. The senders knew the recipient's name,
occupation and general location. Each participant was instructed to send the package to a
person he knew on a first-name basis who was most likely, out of all the participant's friends,
to know the target personally. That person would do the same, and so on until the package
was personally delivered to its target recipient. Although participants expected the chain to
include at least a hundred intermediaries, it only took (on average) between five and seven
intermediaries for each package to be delivered successfully.
Milgram's findings were published in Psychology Today and inspired the phrase "six degrees
of separation." Playwright John Guare popularized the phrase when he chose it as the title for
his 1990 play. Although Milgram's findings were discounted after it was discovered that he
based his conclusion on a very small number of packages, six degrees of separation became
an accepted notion in pop culture after Brett C. Tjaden published a computer game on the
University of Virginia's Web site based on the small-world problem.
Tjaden used the Internet Movie Database (IMDB) to document connections between different
actors. The game, which asked web site visitors to guess the number of connections between
the actor Kevin Bacon and any other actor in the dataset, was called The Oracle of Bacon at
Virginia. Time magazine selected it as one of the "Ten Best Web Sites of 1996."
In 2001, Duncan Watts, a professor at Columbia University, continued his earlier research
into the phenomenon and recreated Milgram's experiment on the Internet. Watts used an
email message as the "package" that needed to be delivered, and surprisingly, after reviewing
the data collected by 48,000 senders and 19 targets (in 157 countries), Watts found that the
average number of intermediaries was indeed six.
In 2008, Microsoft attempted to validate the experiment by analyzing the minimum chain
length it would take to connect 180 billion different pairs of users in the Microsoft Messenger
database. According to Microsoft's finding, the average chain length was 6.6 hops. In 2016,
researchers at Facebook reported that the social networking site had reduced the chain length
of its members to three and a half degrees of separation. Dutch mathematician Edsger
Dijkstra is credited with developing the algorithm that made it possible for Facebook
researchers and others to find the shortest path between two nodes in a graph database.
Weak tie theory is the proposition that acquaintances are likely to be more influential than
close friends, particularly in social networks.
Weak tie theory derives from Mark Granovetter's 1973 article "The Strength of Weak Ties,"
which was about the spread of information through social networks. At that time, social
networking happened almost entirely in the physical world. However, many early social
network theories have since been demonstrated naturally through social media and in many
cases, sites such as Facebook, LinkedIn and Twitter have accelerated the processes involved.
Granovetter categorized interpersonal ties as strong, weak or absent. A strong tie is someone
within a close circle of family and friends. Strong ties are essential for real community but
they are typically groups with a great deal of similarity and, as such, less likely than more
tenuous connections to carry new information and perspectives to their groups.
Social media influencers are prime examples of weak ties. They typically have large groups
of followers and their impact is also distributed among the networks of those followers.
(Weak ties that connect social networks are sometimes called bridges.) Absent ties are
connections that might be expected to exist but don't. For example, it might be assumed that
two prominent writers in a given genre would be connected. An absent tie is the lack of a
connection between such people. As a rule, an absent tie can be transformed to a weak tie
fairly easily. Similarly, an absent tie or a weak tie could become a strong tie through
interaction.
Because networks of strong ties are self-limiting, they can lead to what is sometimes called
a filter bubble: A restriction of news, information and ideas that results from things like
search personalization and maintaining connections mostly within homogenous groups of
people. The limitation can stem from confirmation bias, which is the human tendency to seek
out sources of information that support our existing perspective and beliefs. A larger social
network including numerous weak ties, on the other hand, is likely to challenge that tendency
and support critical thinking.
Within the enterprise, a department or a project team could be considered a group of strong
ties. According to weak tie theory, encouraging intergroup communication
and collaboration is likely to increase the dissemination of ideas and information and promote
creativity and innovation. Creating more connections among employees will increase the
flow of ideas -- and that is especially true for employees that have no apparent need to
communicate. Promoting the creation of weak ties might yield revenue-generating
opportunities, cost-cutting strategies, recommendations for productivity enhancement and
improvements in product development, among other possibilities.
The phrase “social network” has been used in recent years to denote the virtual
communities of users of websites such as Facebook, Twitter and the like.
However, well before the advent of the Internet, quantitative sociology used this term to
designate sets of individuals connected by more traditional links, mostly personal contacts,
entailing physical proximity [16]. For instance, an adult living and working in a city has
social contacts, though with different frequencies, with his relatives, coworkers, friends,
public transportation users, and so on. Clearly, this is the kind of social network we are
interested in, to understand and model the processes involved in epidemics, most of which
spread by contact between individuals: some by simple proximity (for instance, sneezing or
shaking hands can transmit flu or measles), other ones by less casual contacts (as in
sexually transmitted diseases, or the spread of hepatitis or HIV through infected material).
The connection between epidemiology and mathematics goes back a long way. The first
significant use of a mathematical model dates back to 1927, when Kermack and
McKendrick suggested a system of differential equations in order to explain the evolution
data of a plague epidemic in India [7]. Their crucial assumption lay in describing the
population as a kind of perfectly mixed gas, in which every individual is equivalent to
every other and has, in a fixed time interval, the same probability of meeting any other
individual. It is through the meeting of a healthy and not immune individual
(called susceptible, S) and of an infected one (called infective, I) that the infection can be
transmitted, with a probability depending on the epidemic’s virulence: the susceptible
person becomes infected.
This modelling approach was the main one in mathematical epidemiology for about
70 years, with countless variations and extensions that made it possible, for instance, to
adapt the model to different pathogens or to refine it by distinguishing individuals
according to age or sex [1]. Let us see an example, one of the simplest: in the SIS model it
is assumed that each individual may get infected and recover, passing through the
transitions S → I → S··· an unlimited number of times. If we denote by y(t) the proportion
of infectives in the population at time t, so that s(t) = 1 − y(t) is the proportion of
susceptibles, we have:
\[ \dot{y}(t) = -\gamma\, y(t) + \beta\, y(t)\, s(t) = -\gamma\, y(t) + \beta\, y(t)\,\bigl(1 - y(t)\bigr) \]
The first summand on the right-hand side represents the flow I → S of recovered people
(1/γ is the average time after which an individual recovers), while the second summand is
the crucial one to describe the S → I process of infection: the probability of each S meeting
an I is proportional to the density y(t) of the latter, so the number of S → I transitions is
proportional to the product y(t)s(t). The transmission coefficient β depends on both the
social structure, which dictates the frequency of meetings, and the virulence of the
pathogen.
What does the SIS model predict? It is easy to verify that there exist two equilibrium
solutions (Fig. 1): y = 0, which corresponds trivially to the absence of infectives,
and y = 1 − γ/β, which is positive and asymptotically stable if β > β_cr = γ, while
for β < β_cr the solution y = 0 is the asymptotically stable one. We may deduce that the
number of infected people grows with β, as we could have guessed, but, more importantly,
that there exists a threshold value β_cr under which the epidemic is bound to die out. Indeed,
if β is too small, the few infected people in the population are not able to generate a
sufficient number of other infected individuals, due to the scarcity of contacts or the low
virulence of the pathogen. The existence of this threshold has always been one of the
cornerstones of epidemiological modelling [1].
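The verification can be sketched as follows. Here f(y) is simply a name for the right-hand side of the SIS equation, introduced only for this derivation; stability is judged by the sign of its derivative at each equilibrium.

```latex
\begin{align*}
f(y) &= -\gamma y + \beta y (1 - y) = y\,\bigl(\beta - \gamma - \beta y\bigr), \\
f(y) = 0 &\iff y = 0 \quad\text{or}\quad y = 1 - \frac{\gamma}{\beta}, \\
f'(y) &= \beta - \gamma - 2\beta y, \\
f'(0) &= \beta - \gamma < 0 \iff \beta < \gamma, \\
f'\!\left(1 - \frac{\gamma}{\beta}\right) &= \beta - \gamma - 2(\beta - \gamma) = \gamma - \beta < 0 \iff \beta > \gamma .
\end{align*}
```

So y = 0 is stable exactly when β < γ, and the positive equilibrium is stable exactly when β > γ, which gives the threshold β_cr = γ.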
Fig. 1: In the classical SIS model, the proportion of infectives at the equilibrium grows with the
transmission coefficient β, but the epidemic dies out (that is, y → 0) if β < β_cr.
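The threshold behaviour can also be checked numerically. The following is a minimal sketch that integrates the SIS equation with a forward-Euler scheme; the step size, the initial proportion of infectives and the particular values of β and γ are arbitrary assumptions chosen for illustration, not values from the report.

```python
def simulate_sis(beta, gamma, y0=0.01, dt=0.01, steps=20000):
    """Forward-Euler integration of dy/dt = -gamma*y + beta*y*(1 - y)."""
    y = y0
    for _ in range(steps):
        y += dt * (-gamma * y + beta * y * (1.0 - y))
    return y


gamma = 1.0  # assumed recovery rate, so the threshold is beta_cr = 1.0
for beta in (0.5, 2.0, 4.0):
    final = simulate_sis(beta, gamma)
    predicted = max(0.0, 1.0 - gamma / beta)
    print(f"beta={beta}: simulated {final:.4f}, predicted {predicted:.4f}")
```

For β below γ the simulated proportion decays towards zero, while for β above γ it settles at 1 − γ/β, in line with the equilibrium analysis.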
Now that we have some basic terminology down, we can get into the heart of actor level
properties that serve as the language for social network analysis. I am going to spend a
majority of my time in this section explaining how to conceptualize social phenomena and
hypotheses in a networks framework without going into too much detail on substantive
theories of relational phenomena. The goal is to help you be literate enough to interface with
and understand theories posed in the literature using a social networks/ relational framework.
Degree Centrality:
Degree centrality is the most basic network measure and captures the number of ties to a given actor. For
undirected ties this is simply a count of the number of ties for every actor. For directed
networks, actors can have both indegree and outdegree centrality scores. As the name
implies, centrality measures how central or well connected an actor is in a network. This
theoretically signals importance or power and increased access to information, or just general
activity level, and high degree centrality is generally considered to be an asset to an actor.
Degree centrality is depicted for the Karate club network in the accompanying figure, where each actor is
labeled with their undirected degree centrality score.
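Counting degree from a list of ties is straightforward. The sketch below assumes ties are given as pairs of actor labels; the function name and the small example network are hypothetical, not the Karate club data.

```python
from collections import Counter


def degree_centrality(ties, directed=False):
    """Undirected: one Counter of degrees. Directed: (indegree, outdegree)."""
    if directed:
        outdegree = Counter(sender for sender, _ in ties)
        indegree = Counter(receiver for _, receiver in ties)
        return indegree, outdegree
    degree = Counter()
    for a, b in ties:
        degree[a] += 1
        degree[b] += 1
    return degree


# Hypothetical ties among four actors.
ties = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
print(degree_centrality(ties))
indegree, outdegree = degree_centrality(ties, directed=True)
print(indegree["C"], outdegree["A"])
```

Note that an actor with no ties never appears in the tie list, so isolates have to be added from a separate list of actors.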
Betweenness Centrality:
Betweenness centrality is roughly defined as the number of shortest paths between alters that
go through a particular actor. More precisely, for every pair of alters we take the number of
shortest paths between them that pass through the actor we are calculating the
measure for, divide it by the total number of shortest paths between that pair (not necessarily
through the target actor), and sum these fractions over all pairs. This intuitively measures the
degree to which information or relationships have to flow through a particular actor and their
relative importance as an intermediary in the network. Betweenness scores for Zachary’s Karate
club network are displayed in the accompanying figure.
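The definition can be turned into code almost literally. The sketch below is a simple, unoptimised version for small undirected, unweighted networks, assuming the network is a dictionary that lists every actor as a key with a symmetric neighbour list. The function names and example networks are hypothetical; faster algorithms exist for large networks.

```python
from collections import deque
from itertools import combinations


def shortest_path_counts(adjacency, source):
    """BFS from source: distance to, and number of shortest paths to, each node."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adjacency[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                count[nb] = 0
                queue.append(nb)
            if dist[nb] == dist[node] + 1:
                # Every shortest path to node extends to a shortest path to nb.
                count[nb] += count[node]
    return dist, count


def betweenness(adjacency):
    """Sum over pairs of alters of the share of shortest paths through each actor."""
    info = {s: shortest_path_counts(adjacency, s) for s in adjacency}
    scores = {v: 0.0 for v in adjacency}
    for s, t in combinations(adjacency, 2):
        dist_s, count_s = info[s]
        dist_t, count_t = info[t]
        if t not in dist_s:
            continue  # s and t are not connected
        for v in adjacency:
            if v in (s, t) or v not in dist_s or v not in dist_t:
                continue
            if dist_s[v] + dist_t[v] == dist_s[t]:
                scores[v] += count_s[v] * count_t[v] / count_s[t]
    return scores


# Hypothetical star: C sits between every pair of leaves.
star = {"C": ["A", "B", "D"], "A": ["C"], "B": ["C"], "D": ["C"]}
print(betweenness(star))
# Hypothetical square A - B - C - D - A: opposite corners have two shortest paths.
square = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "A"]}
print(betweenness(square))
```

In the square, each actor lies on one of the two shortest paths between the opposite pair, so each receives a fractional score of one half.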
Closeness centrality:
Closeness centrality measures how many steps (ties) are required for a particular actor to
access every other actor in the network. This is measured as 1 divided by the sum of geodesic
distances from an actor to all alters in the network. The measure will reach its maximum for a
given network size when an actor is directly connected to all others in the network and its
minimum when an actor is not connected to any others. This captures the intuition that short
path lengths between actors signal that they are closer to each other. Note that this measure is
sensitive to network size and is decreasing in the number of actors in the network. This
makes intuitive sense in many situations, because it gets more difficult to maintain close
relationships with all members of the network as the network grows, but it can also be corrected
for by multiplying by the number of actors in the network. Closeness scores for Zachary’s
Karate club network are displayed in the accompanying figure.
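A minimal sketch of this calculation follows, again using breadth-first search for the geodesic distances. Two choices here are assumptions rather than part of the definition above: the size correction multiplies by the number of other actors (n − 1), a common convention, and actors that cannot be reached are simply left out of the sum. The function name and the star network are hypothetical.

```python
from collections import deque


def closeness(adjacency, actor, normalise=False):
    """1 / (sum of geodesic distances from actor to every reachable alter)."""
    dist = {actor: 0}
    queue = deque([actor])
    while queue:
        node = queue.popleft()
        for nb in adjacency[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    total = sum(dist.values())
    if total == 0:
        return 0.0  # an isolate reaches nobody
    score = 1.0 / total
    if normalise:
        # Assumed convention: scale by the number of other actors.
        score *= len(adjacency) - 1
    return score


# Hypothetical star: C is directly tied to everyone, leaves only to C.
star = {"C": ["A", "B", "D"], "A": ["C"], "B": ["C"], "D": ["C"]}
print(closeness(star, "C"), closeness(star, "A"))
print(closeness(star, "C", normalise=True), closeness(star, "A", normalise=True))
```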
Eigenvector centrality:
Eigenvector centrality measures the degree to which an actor is connected to other well
connected actors. It takes advantage of a mathematical property of networks (represented as
adjacency matrices) that allows for the easy calculation of how well connected an actor is to
other well connected actors. While we will not get into the details of its calculation, this
measure captures the value of having a lot of friends in high places. Eigenvector scores for
Zachary’s Karate club network are displayed in figure.
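For readers who do want a feel for the calculation, one common approach is power iteration: repeatedly replace each actor's score with the sum of its neighbours' scores and rescale. The sketch below is an illustration under assumptions, not the method used to produce the figure. The name `eigenvector_centrality` is hypothetical, and each actor's own score is added at every step (a shift by the identity matrix) so that the iteration also settles on bipartite networks such as stars.

```python
import math

def eigenvector_centrality(adj, iterations=200, tol=1e-10):
    """Power iteration on (A + I) for an undirected graph given as {actor: set}."""
    if not adj:
        return {}
    x = {v: 1.0 for v in adj}
    for _ in range(iterations):
        # New score = own score + sum of neighbours' scores, then rescale.
        new = {v: x[v] + sum(x[w] for w in adj[v]) for v in adj}
        norm = math.sqrt(sum(val * val for val in new.values()))
        new = {v: val / norm for v, val in new.items()}
        if max(abs(new[v] - x[v]) for v in adj) < tol:
            return new
        x = new
    return x
```

The scores are only meaningful relative to each other, and they assume the network is connected.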
Brokerage:
Brokerage describes an advantageous position from which an actor can broker interactions
between other actors in the network. Brokerage Centrality
is then a measure of the degree to which an actor occupies a brokerage position across all
pairs of alters. It is meant to capture the intuition that a broker serves as a go-between and
thus can gain benefits from their position as an intermediary. There are five kinds of
brokerage relationships, each of which we will discuss briefly below:
(a) A Coordinator is an actor in the same group as two alters who connects the two nodes.
An example might be a graduate student who makes sure that all of the rest of their cohort is
made aware of parties being hosted by anyone in their cohort.
(b) An Itinerant broker is a member of an outside group that connects two others who share
group membership.
(c) A Gatekeeper is a member of the same group as the target that a member of another group
hopes to connect with, and can control whether or not that outside actor is able to gain access
to the in-group member. An example might be a secretary or office manager.
(d) A Representative is a member of the same group as an actor that wishes to connect with
an actor outside of the group but has to go through an intermediary. An example is an
ambassador for a country.
(e) A Liaison is a member of a group that is distinct from those of two actors that wish to
connect but do not share group membership themselves. A delivery truck driver is a good example.
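The five roles above can be told apart purely from the group memberships of the three actors involved. The sketch below assumes directed ties of the form source -> broker -> target with no direct tie from source to target; the names `brokerage_role` and `brokerage_counts` and the `group` mapping are hypothetical.

```python
from collections import Counter

def brokerage_role(group, source, broker, target):
    """Classify one brokered path source -> broker -> target by group membership."""
    gs, gb, gt = group[source], group[broker], group[target]
    if gs == gb == gt:
        return "coordinator"      # (a) all three share a group
    if gs == gt:
        return "itinerant"        # (b) broker is the outsider
    if gb == gt:
        return "gatekeeper"       # (c) broker shares the target's group
    if gs == gb:
        return "representative"   # (d) broker shares the source's group
    return "liaison"              # (e) three different groups

def brokerage_counts(ties, group):
    """Count, per broker, how often it fills each role across all open two-paths."""
    tie_set = {(a, b) for a, b in ties if a != b}
    counts = {}
    for a, b in tie_set:
        for b2, c in tie_set:
            # b brokers only if a cannot already reach c directly
            if b2 == b and c != a and (a, c) not in tie_set:
                role = brokerage_role(group, a, b, c)
                counts.setdefault(b, Counter())[role] += 1
    return counts
```

Summing an actor's counts over all five roles gives one possible total brokerage score.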
Reciprocity:
Reciprocity is the tendency for directed ties from actor i to actor j to be reciprocated and sent
back from actor j to actor i. This captures the classic finding that feelings and actions tend to
be reciprocated.
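One simple way to quantify this tendency is the share of directed ties that are returned. The sketch below uses that tie-level definition as an assumption (other definitions count dyads instead); the function name `reciprocity` is illustrative.

```python
def reciprocity(ties):
    """Fraction of directed ties (i, j) for which the tie (j, i) also exists."""
    tie_set = {(i, j) for i, j in ties if i != j}   # drop self-ties and duplicates
    if not tie_set:
        return 0.0
    returned = sum(1 for i, j in tie_set if (j, i) in tie_set)
    return returned / len(tie_set)
```

A value of 1 means every tie is returned; a value of 0 means none are.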
Transitivity:
Transitivity is the tendency for friends of friends to be friends and enemies of enemies to be
enemies. More generally a transitive relationship is one where two nodes being connected to
a third increases the likelihood that they will connect themselves (Hoff et al., 2002; Carpenter
et al., 2004).
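A common summary of transitivity in an undirected network is the share of connected triples (two ties meeting at one actor) that are closed by a third tie. The sketch below computes that ratio; the function name `transitivity` and the adjacency format (a dict of neighbour sets) are assumptions for illustration.

```python
from itertools import combinations

def transitivity(adj):
    """Share of two-paths a - v - b whose end points a and b are also tied."""
    triples = 0
    closed = 0
    for v, nbrs in adj.items():
        for a, b in combinations(nbrs, 2):   # every pair of v's neighbours
            triples += 1
            if b in adj[a]:
                closed += 1
    return closed / triples if triples else 0.0
```

A triangle scores 1, while a chain of three actors scores 0.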
Structural Equivalence:
Structural equivalence is a concept that describes actors occupying the same position in the
network relative to all other actors (Lorrain and White, 1971). In the example figure below,
each grey circle contains a set of actors that are structurally equivalent to all others. This
concept is important in making comparisons between nodes about their relative importance
and position in a network. Check out the following web resources for more information:
Robert A. Hanneman’s Page on Structural Equivalence, Tom Schnijders Lecture on
Structural Equivalence.
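Under the strict definition, two actors are structurally equivalent when they have exactly the same ties to every other actor. A minimal sketch for an undirected network without self-ties follows; the names `structurally_equivalent` and `equivalence_classes` are hypothetical, and actor labels are assumed to be sortable.

```python
def structurally_equivalent(adj, u, v):
    """True if u and v have identical neighbours, ignoring a tie between them."""
    return adj[u] - {v} == adj[v] - {u}

def equivalence_classes(adj):
    """Group actors into sets of mutually structurally equivalent actors."""
    classes = []
    for actor in sorted(adj):
        for cls in classes:
            # Comparing with one member suffices: the relation is transitive.
            if structurally_equivalent(adj, cls[0], actor):
                cls.append(actor)
                break
        else:
            classes.append([actor])
    return classes
```

In real data exact equivalence is rare, so analysts often relax it to a similarity score between tie profiles.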
A Clique:
A Clique is a subset of actors in a network such that every two actors in the subset are
connected by a tie. This definition follows the common English language usage of the word
meaning a densely connected group. A large example clique is colored red in Figure.
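The definition translates directly into a check over all pairs in the candidate subset. This is a sketch only; `is_clique` is a hypothetical helper and the network is assumed to be a dict mapping each actor to its set of neighbours.

```python
from itertools import combinations

def is_clique(adj, members):
    """True if every two distinct actors in members are connected by a tie."""
    return all(v in adj[u] for u, v in combinations(set(members), 2))
```

Checking a given subset is cheap; finding the largest clique in a network is a much harder search problem.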
A Star:
A star is a network structure where all ties connect to one central node, making the shape of
a star.
All of the properties discussed above refer to individual actors or subsets of actors in a
network. While these are important characteristics to measure, we can also think about
properties that a network as a whole exhibits. These properties are important because they
impose structure on the entire space of interactions and relationships and can have profound
aggregate effects on how actors in the network behave and function as a whole.
Centralization:
Centralization (Degree, Betweenness, Closeness, Eigenvector, etc.) is a measure of the
unevenness of the centrality scores of actors in a network. It ranges from zero, when every
actor is just as central for whatever score we are interested in, to 1, when one node is
maximally central and all others are minimally central. This measure is a good way to express
the idea that there are a couple of very powerful or important actors in a network or that
power/importance is spread out evenly in one simple measure.
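For the degree version, a widely used formula (Freeman's) sums the gap between the most central actor's degree and every other actor's degree, then divides by the largest value that sum can take, (n - 1)(n - 2), which is reached by a star. The sketch below assumes an undirected network without self-ties; the name `degree_centralization` is illustrative.

```python
def degree_centralization(adj):
    """Freeman degree centralization for an undirected graph given as {actor: set}."""
    n = len(adj)
    if n < 3:                     # the measure is undefined for fewer than 3 actors
        return 0.0
    degrees = [len(nbrs) for nbrs in adj.values()]
    top = max(degrees)
    # A star on n actors attains the maximum possible sum of gaps.
    return sum(top - d for d in degrees) / ((n - 1) * (n - 2))
```

A star returns 1 and any network in which all actors have the same degree returns 0, matching the two extremes described above.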
Homophily:
Homophily is a process where actors who are similar on a particular trait are more likely to
form ties. This has been confirmed in over 100 empirical studies, with a few examples
including: (Ibarra, 1992; Straits, 1996; McPherson et al., 2001; Centola et al., 2007;
Goodreau et al., 2009; Kossinets and Watts, 2009; McDonald, 2011). This process is the
basis for the commonly used phrase “birds of a feather flock together”. A classic example of
a sociological study of homophily (by race) is provided in Figure. The opposite of
homophily is Heterophily, which refers to a process whereby actors who are different from
each other are more likely to form ties. An example of heterophily may be that of formal
academic advising relationships, with students being more likely to form ties to faculty for
advising than to other students.
Modularity:
Modularity is a measure of the degree to which a network displays Community Structure,
with clusters that are densely connected internally but only sparsely connected to each other.
Finding the grouping that maximises this measure is computationally very difficult, but it
provides a way to identify community structure on a network where one is unsure if such a
structure exists (Newman, 2006; Zhang et al., 2008; Karrer and Newman, 2011). However,
this measure is not consistent across
networks of different size and group size. Graph Compartmentalization – a related measure –
does allow for comparison between networks of arbitrary size and structure, but is not
designed for detecting communities (Denny, 2014). An example of community structure
between authors of papers about network analysis is presented in Figure.
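Scoring one given grouping is straightforward; the difficulty lies in searching over all possible groupings. The sketch below evaluates the standard modularity score Q for an undirected, unweighted network: for each community, the share of ties that fall inside it minus the share expected if ties were placed at random with the same degrees. The function name `modularity` and the `community` mapping are assumptions for illustration.

```python
def modularity(edges, community):
    """Modularity Q of a partition; community maps each actor to a group label."""
    m = len(edges)
    if m == 0:
        return 0.0
    inside = {}        # ties with both ends in the same community
    degree_sum = {}    # total degree of the actors in each community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            inside[cu] = inside.get(cu, 0) + 1
    return sum(inside.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())
```

Putting every actor in a single community always gives Q = 0, so positive values indicate more within-group ties than chance would suggest.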
The Diameter:
The diameter of a network is defined as the longest of all the calculated shortest paths
between actors. Network diameter gives us an idea about how easily reachable Actors are on
a network. A very large diameter means that even though there is theoretically a way for ties
to connect any two actors through a series of intermediaries, there is no guarantee that they
actually will be connected. Diameter is thus a signal about the ability for information or
disease to diffuse on the network. The diameter of Zachary’s Karate club network is
displayed graphically in Figure.
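Following the definition, the diameter can be found by running a breadth-first search from every actor and keeping the largest distance seen. The sketch below assumes an undirected, unweighted network; returning `None` for a disconnected network is a convention chosen here, and the name `diameter` is illustrative.

```python
from collections import deque

def diameter(adj):
    """Longest shortest path in the network, or None if it is empty or disconnected."""
    longest = None
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:                      # breadth-first search from s
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        if len(dist) < len(adj):          # some actor cannot be reached from s
            return None
        farthest = max(dist.values())
        if longest is None or farthest > longest:
            longest = farthest
    return longest
```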
There are a number of classic Network Types that can be used to characterize the
stereotypical social structure in different situations. Regular networks are characterized by all
actors having the same degree and are often a starting point for simulation studies of
networks (Centola et al., 2007). Small world networks are very efficient for information
transfer: most nodes are not directly connected to one another and clustering is high, yet
the average path length between actors is relatively short (Travers and Milgram, 1969; Watts
and Strogatz, 1998). Random networks are very robust to disruptions (Latora and Marchiori,
2001; Callaway et al., 2000) but may be difficult for people to maintain, especially if ties are
across long distances (Dodds et al., 2003; Aral et al., 2012). Examples of network types
originally discussed in Watts and Strogatz (1998) are shown in Figure.
Social Network Analysis (SNA) has gained importance over the last two decades, both as a
research program and toolbox for network analytical applications in various settings. As a
research stream, SNA has established a new paradigm in the social and behavioral sciences
that focuses both conceptually and methodologically on relational characteristics of social
phenomena and behavioral patterns. It distinguishes itself from traditional scientific
approaches that typically analyze – at least methodologically – the different objects of
investigation as independent from each other. In practical applications of network analytical
tools and techniques, SNA has established itself as a useful approach to study the
interconnectivity of individual or collective actors in social processes such as communication
flows or decision-making situations.
Due to the practical usefulness of its rather easily accessible basic concepts and techniques,
SNA has also attracted an increasing interest from practitioners, either as contractors or
sponsors of network studies or as users of network analytical tools and applications. The
international conference on Applications of Social Network Analysis (ASNA) provides a
yearly forum for academics and practitioners that do research on SNA-related concepts and
methods and/or apply SNA techniques and tools in their academic or practical work. With a
shared interest in network analytical concepts and techniques, ASNA brings together scholars
and practitioners with different disciplinary backgrounds and fields of activity. Well-
positioned particularly in the social sciences (sociology, political science, communication and
media science), ASNA has also attracted growing interest among natural and computer
scientists.
Therefore, ASNA has made an important contribution to the promotion and advancement of
SNA-related research in Switzerland and neighboring countries over the last years. Today,
ASNA is an internationally recognized conference with participants from all over the world.
Leading scholars have been enlisted as keynote speakers and workshop directors.
Question: Participants were asked which of the other listed actors (focusing on the institution,
not on individuals) they had been in contact with in a professional context.
Main findings were low density of the network (7.2%), low clustering, and high
centralization/core-periphery structure. Investigation of modularity (denser links among
group members than externally) of sectors and locations characterised ‘gastronomy’ and
‘activities’ as possible subgroups. Network density within each of the 3 locations was high
compared to overall density, indicating spatial integration, and sectoral integration was strong
among most sectors except entertainment (which lacked connections with 2 other
sectors). Concerning individual roles, the ten most central actors (for each of 3 centrality
measures) are listed and discussed. Various sectors are represented in these lists, and this is
taken as an indication of resilience of the network, although the lack of integration of
cableway companies could be a potential risk (related more to preparation for longer-term
changes requiring social learning, diversification and innovation than to response to
immediate shocks). Network analysis is carried out in the framework of climate change
adaptation and resilience of social systems.
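Density, as reported in the findings above, is the share of all possible ties that are actually present. A minimal sketch follows; the function name `density` and the example numbers in the test are illustrative and are not taken from the study.

```python
def density(n_actors, n_ties, directed=False):
    """Observed ties divided by the number of ties that could exist."""
    possible = n_actors * (n_actors - 1)   # ordered pairs of distinct actors
    if not directed:
        possible //= 2                     # each undirected tie covers two ordered pairs
    return n_ties / possible if possible else 0.0
```

Comparing the density inside a subgroup with the overall density is one way to see whether that subgroup is more tightly knit than the network as a whole.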
The authors surveyed the majority (145 or 80%) of households in the community of Habu,
speaking either to the head or another available adult to collect household network and other
data. Network data focused on exchanges during stress periods relating to disease in
livestock, crops or humans. Centrality of households was calculated, to explore the idea that
high degree centrality and high betweenness centrality can enhance resilience of households.
They carried out data processing steps to produce clearer networks by removing isolates and
pendants (those with low betweenness values); three actors with very highly
institutionalised roles (and almost 100% in-degree centrality) were also removed for the analysis.
Compared to the other SNA studies reviewed, this one has the largest dataset (which means
that network measures and other statistics may be more reliable).
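The pruning step can be sketched as follows, under the assumption that an isolate is an actor with no ties and a pendant is an actor with exactly one tie. The study's exact procedure is not described here, so this single-pass version and the name `drop_isolates_and_pendants` are hypothetical.

```python
def drop_isolates_and_pendants(adj):
    """Keep only actors with two or more ties in the original network (one pass)."""
    keep = {v for v, nbrs in adj.items() if len(nbrs) >= 2}
    # Ties to removed actors are dropped along with those actors.
    return {v: adj[v] & keep for v in keep}
```

Removing pendants can create new pendants, so an analyst would need to decide whether to repeat the pass; this sketch does not.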
SNA measures of connectivity are evaluated for correlations with indicators of resilience
(household capitals, other attributes of households such as size, gender and age of head, and
other connectivity-related variables such as attendance and membership in community
organisations). These tests were run both with the overall network (which may be obtained by
collapsing the four types of relations into one) and ones constructed on specific types of
exchange (labour, food, money and information).
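Collapsing several relation types into one overall network amounts to taking the union of their tie sets. The sketch below shows this for directed ties; the function name `collapse_layers` and the household labels in the usage example are invented for illustration.

```python
def collapse_layers(layers):
    """Union of the tie sets of all relation types: {relation: iterable of (i, j)}."""
    overall = set()
    for ties in layers.values():
        overall.update(ties)
    return overall

# Hypothetical example with made-up household labels
layers = {
    "labour": [("h1", "h2")],
    "food": [("h1", "h2"), ("h2", "h3")],
    "money": [],
    "information": [("h3", "h1")],
}
print(sorted(collapse_layers(layers)))
```

The collapsed network records only whether any exchange exists between two households, so information about how many kinds of exchange they share is lost.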
This is a good example of how to combine SNA with other statistical measurement and
testing for a very comprehensive analysis. However, concerning resilience, it should be noted
that because only the perspective of the household head was collected, the authors do not
assess differences in individual resilience among heads and other household members, which
may be important.
Question: With which three other households do you exchange information, labour, food, or
money in times of stress?
Main findings are that male-headed households have significantly higher degree centrality
and betweenness centrality, as have households with older heads and with larger size, but that
ethnicity and educational level are not highly correlated with network centrality. Looking at
specific types of relations, labour and food networks were found to be denser than
information and money; all networks are strongly correlated with measures of resilience
except the information network, which is significantly correlated. The study found
differentials among households, and by investigating the profiles of particular weakly
connected/strongly connected nodes provided further insight related to network position.
The authors suggest that high degree centrality can enhance resilience of households both by
providing redundancy (availability of alternatives) and by facilitating social learning.
However, they also found that betweenness centrality was the measure that was most often
associated with resilience, suggesting that indirect connectivity can also add to resilience
and social learning as it provides broader connectivity.
The authors surveyed household heads belonging to a forest community in the ejido Sierra
Morena, Chiapas, Mexico. Ejidos are a form of common property that provide a way of
controlling land access based on inheritance. The village is composed of ejidatario and
poblador households, with only the former able to own land, and the authors
surveyed all heads. This is a rare example of having highly complete network data, and the
authors mention very good researcher-interviewee relations there (based on previous
long-term field visits). The study is also characterised by a well-defined boundary.
Directional data were collected for five different labour-related networks: four specific ones
(coffee, palm, ecotourism and authorities) and one general network. It is not reported how
these specific types were suggested. Network centralization measures (categorical
core/periphery analysis) were calculated on the set of all relations, as was network hierarchy
(Krackhardt algorithm). Its sub-structures (Girvan-Newman clustering and factions) were
identified. The authors then used specific network information to assign each node to the
network in which it had highest indegree and reveal grouping activities in the all-relations
network. For each specific network, statistical tests for correlations between degree centrality
and group affiliation (tenure status and income category) were performed.
The second part of the paper concentrates in more detail on coffee networks and how they
have evolved over five time periods since 2000. This becomes a very interesting analysis
when combined with other fieldwork data, and the same statistical tests were made for coffee
networks as for specific networks. Methodologically, there could be a problem with recall for
past historical data, and it is not clear how to cater for actors leaving or coming into the system
(the authors analysed the same set of nodes over 2000 to 2011), which might lead to
incomplete data. However, because of the high quality of the Sierra Morena data and access
to interview respondents for cross-validation, this can be assumed not to be a problem here.
Question: To whom do you relate/work with for different productive activities? Which coffee
groups did you belong to in the past (2000/2004/2008/2010)?
The authors found evidence for network centralisation and hierarchy in the all-relations network,
and node assignment showed the presence of a separate ecotourism group/cluster (explained
by historical rivalries). Differences among networks, for example low transitivity of coffee
(2.2%, emphasising an open network) and high outdegree centralization of palm (50%,
emphasising within-group bonding ties), showed the differences in organisation and were well
explained through examining the marketing and regulatory factors. In this way the paper
demonstrates that SNA can help understand embeddedness and power structures. Statistical
tests showed ejidatarios (group) having significantly higher indegree but not outdegree. Some
misleading terminology was used, e.g. ‘factors determining the position in the network’ and
‘income and tenure affected particularly the indegree’, when only correlations were
found. In terms of consequences of the organisation of social networks, the environmental
sustainability of production and its resilience were alluded to rather than examined in depth.
A pre-study is used to produce a recall list - a complete list of all the actors involved. The
pre-study involves asking members of a group of informants with long experience to name
actors that could potentially influence management of the resource. In the Stein et al. study,
spokespersons of the organisations on the recall list were interviewed and asked to mark their
"regular/long-term" relations to other actors on the recall list, for each of the three types: i)
funding, ii) information and knowledge exchange, and iii) collaboration. The third type of tie
was only counted if reciprocal.
It was considered infeasible to interview/visit all actors involved in water management and
use, so 4 communities were selected. Then two criteria were used to determine actors'
inclusion/exclusion, ie. to draw the boundary of the SN. The first criterion was attributional,
the second criterion was a relational one - that they were mentioned more than twice by
respondents as actors to which they had relations. Hence respondents could define the
network boundaries - an approach known as 'expanding selection'. In fact only 2 new actors
met this second criterion.
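The relational criterion can be sketched as a simple count of nominations. In the snippet below, an actor outside the initial list joins the network if more than two respondents name it; the function name `expand_boundary`, the `threshold` parameter and the data layout are assumptions for illustration, and the attributional criterion is taken as already applied to produce `initial_actors`.

```python
from collections import Counter

def expand_boundary(initial_actors, nominations, threshold=2):
    """Add outside actors named by more than `threshold` respondents.

    nominations maps each respondent to the actors they reported relations with.
    """
    counts = Counter()
    for named in nominations.values():
        for actor in set(named):              # count each respondent once per actor
            if actor not in initial_actors:
                counts[actor] += 1
    added = {actor for actor, c in counts.items() if c > threshold}
    return set(initial_actors) | added
```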
For this paper, the authors only analysed collaborative (reciprocal) ties, investigating
centrality of actors as well as centralization of the overall network, density and sub-groups
(Newman-Girvan algorithm).
Question: What are your "regular/long-term" relations to other actors on the recall list, for
each of the three types: i) funding, ii) information and knowledge exchange, iii) collaboration?
Findings were presented at multiple scales of interaction and considered direct and indirect
actors separately and together. At the horizontal/local scale, 'direct' actors (e.g. CBOs) are
sparsely connected; however, adding the village leadership (indirect actors) improves this
picture. Subgroups were found to follow community boundaries, with 4 clusters. At the
vertical scale, although actors with a formal role in governance were found to be central
connectors overall, they were not well connected to local, direct users. This highlights the
top-down nature of governance that fails to consider informal structures which may be able to
inform and coordinate improved governance.
Social relations play an important role in our life. In fact, we are defined in terms of our
contacts and relations. Co-authorship is one of the most important relations for academics.
Systematic analysis of this relation can help unravel hidden trends and interesting facts about
individuals and institutions. There is a huge amount of co-authorship data from which
academic social networks can be extracted. The huge size and diversity of the data make it
imperative to design automatic methods for extraction and analysis of networks.
Social network analysis is important for extracting useful information from large volumes of
data and for producing useful results in applications such as recommendation, search and
spam detection. Social network analysis provides insights into social media that can help
individuals and organizations make informed decisions about online conversations.
REFERENCES
[1] Stanley Wasserman and Katherine Faust, “Social Network Analysis: Methods and
Applications” (Structural Analysis in the Social Sciences).
[3] Pooja Wadhwa, M.P.S. Bhatia, "An insight into properties of real world
networks", Advances in Computing Communications and Informatics (ICACCI)
2013 International Conference on, pp. 1930-1935, 2013.
[5] Bei Fan, Lu Liu, Ming Li, Yin Wu, "Knowledge Recommendation Based on
Social Network Theory", Advanced Management of Information for Globalized
Enterprises 2008. AMIGE 2008. IEEE Symposium on, pp. 1-3, 2008.
[10] T. Luthe, R Wyss, and M. Schuckert, 2012. Network governance and regional
resilience to climate change: empirical evidence from mountain tourism communities in the
Swiss Gotthard region, Regional Environmental Change 12(4) 839-854
DOI: 10.1007/s10113-012-0294-5
[11] Cassidy, L., and G. D. Barnes. 2012. Understanding household connectivity and
resilience in marginal rural communities through social network analysis in the village of
Habu, Botswana. Ecology and Society 17(4): 11. http://dx.doi.org/10.5751/ES-04963-170411
[13] C. Stein, H. Ernstson, J. Barron (2011) A social network approach to analyzing water
governance: The case of the Mkindo catchment, Tanzania, J. Phys. Chem. Earth (2011)