
Project Report

This document is a project report on the analysis of a co-authorship network. It was submitted by four students in partial fulfillment of the requirements for the Bachelor of Engineering degree in Computer Science at Malnad College of Engineering, under the guidance of Dr. Aruna Kumar. The project analyzes co-authorship networks built from published papers, using social network analysis tools and techniques to identify patterns and important authors.


VISVESVARAYA TECHNOLOGICAL UNIVERSITY

Jnana Sangama, Belagavi - 590018

A PROJECT REPORT

ON

“Co-Authorship Network Analysis”


Submitted in partial fulfillment of
the requirements for the award of the degree of

Bachelor of Engineering
in
Computer Science and Engineering

Submitted by

Arun 4MC18CS037
Kumar 4MC18CS046
RRRRRR 4MC18CS047
SSSSSS 4MC18CS050

Under the guidance of


Dr. Aruna Kumar S V
Associate Professor

Department of Computer Science and Engineering


Malnad College of Engineering
Hassan - 573201, Karnataka, India
2021-2022
Malnad College of Engineering
Department of Information Science and Engineering
Hassan - 573201, Karnataka, India

Certificate

This is to certify that the project work entitled “Co-Authorship Network Analysis”
is a bonafide work carried out by

Ra 4MC18CS037
QQQQQQ 4MC18CS046
RRRRRR 4MC18CS047
SSSSSS 4MC18CS050

in partial fulfillment for the award of Bachelor of Engineering in Information Science
and Engineering of the Visvesvaraya Technological University, Belagavi during the
year 2021-2022. It is certified that all corrections/suggestions indicated for Internal
Assessment have been incorporated in the report deposited in the departmental library.
The project report has been approved as it satisfies the academic requirements in
respect of project work prescribed for the Bachelor of Engineering Degree.

Signature of the Guide     Signature of the HOD     Signature of the Principal

Dr. Aruna Kumar S V        Dr. Geetha Kiran A       Dr. C V Venkatesh
Associate Professor        Prof. & HOD              Principal
Dept. of CSE, MCE          Dept. of CSE, MCE        MCE

Examiners
Name of the Examiner Signature of the Examiner

1.

2.
ABSTRACT
The effectiveness of scientific groups, in terms of innovation and productivity, depends
on the relationships among their members. Social Network Analysis (SNA) is a diagnostic
method for collecting and analysing data about the patterns of relationships among people
in groups. The study of social network analysis is based on network theory.
Co-authorship can be conceptualized as collaboration between two or more authors; this
collaboration and communication among authors gives rise to co-authorship networks. In a
co-authorship network, authors are represented as nodes, and a line connects two authors
when they have co-authored one or more papers together. This representation allows
researchers to apply graph theory to the analysis of what would otherwise be considered
an inherently elusive and poorly understood problem: the tangled web of our social
interactions. The structure of such networks turns out to reveal many interesting
features of academic communities.
Our study is motivated by social and technological factors. On the social side, we want
better discovery of people: identifying the central figure in the network and
understanding connections and communities. On the technological side, we want to explore
the methods and tools available for studying and visualizing these networks, and to
improve these algorithms and techniques.
In this project, we look in detail at particular networks of scientific collaboration
obtained from the Netscience data and from sample research publication data, and we
describe some of the patterns they reveal. The facts captured can help confirm
interesting observations.
The overall goal of our project work is the analysis of co-authorship networks using
social network analysis tools and techniques.
ACKNOWLEDGEMENTS

Salutations to our beloved and highly esteemed institute, “Malnad College of
Engineering”, for having well-qualified staff and labs furnished with the necessary
equipment.
The completion of any task would be incomplete without mentioning the people who
constantly guided and inspired us during the project and the preparation of the report.
We take this opportunity to thank all those who helped us throughout the completion of
this project and the report.
We would like to express our deepest gratitude to our mentor Dr. Aruna Kumar S V,
Department of Computer Science & Engineering, for the daily evaluation of the work and
for providing constant encouragement, unflinching support and valuable guidance
throughout this project. We are indebted to our mentor for considering us worthy of this
project and placing faith in us. It was a great privilege to work under such guidance,
and we will take away a lot more than just technical knowledge.

We would also like to express our special thanks to Dr. C V Venkatesh, Principal,
Malnad College of Engineering, for providing the opportunity and the facilities for
the completion of our project.

PPPPP

QQQQQ

RRRRR

SSSSS
TABLE OF CONTENTS

1 Introduction
1.1 Introduction to Social Network Analysis
1.1.1 Co-authorship Network
1.2 About Project
1.2.1 Problem Statement
1.2.2 Objective

2 Literature Survey
2.1 Co-authorship Network
2.2 Centrality Measures

3 Project Design
3.1 High Level Design

4 Implementation

5 Results
5.1 Comparison of Community Detection Algorithms

6 Conclusion

A Java Program

References
LIST OF FIGURES

1.1 A sample Co-authorship network

3.1 High Level Design Overview of Co-authorship Network Analysis

5.1 Community Size Variation
LIST OF TABLES

5.1 Comparison of Community Detection Algorithms
Chapter 1

Introduction

1.1 Introduction to Social Network Analysis


Social Network Analysis (SNA) is a diagnostic method for collecting
and analyzing data about the patterns of relationships among people
in groups. The study of social network analysis is based on network
theory.
Social network analysis is the methodical analysis of social networks.
It views social relationships in terms of network theory, consisting of
nodes (representing individual actors within the network) and links
(representing relationships between individuals, such as friendship or
kinship). These networks are often depicted in a social network diagram,
where nodes are represented as points and links are represented as lines.

1.1.1 Co-authorship Network

Co-authorship is conceptualized as collaboration between two or more
authors; this collaboration and communication among authors gives
rise to co-authorship networks. An example of a paper co-authorship
network is shown in Fig. 1.1. As shown in Fig. 1.1, author A and author
B are represented by node A and node B. When a paper is written by
author A, author B, and author C, these co-authorship relations among
authors are represented by undirected links between node A and node
B, between node A and node C, and between node B and node C. By
representing co-authorship relations as a network, it becomes possible
to analyze the structure of co-authorship relations.

Figure 1.1: A sample Co-authorship network
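
The mapping from a paper's author list to undirected links can be sketched in Java as follows. This is a minimal illustration, not the project's code: the class name CoauthorGraph and its methods are hypothetical, and the graph is stored as a simple map from each author to the set of co-authors.

```java
import java.util.*;

// Hypothetical helper class, used only for illustration.
class CoauthorGraph {
    // Undirected graph: each author maps to the set of that author's co-authors.
    private final Map<String, Set<String>> adjacency = new TreeMap<>();

    // Adds one paper: every pair of distinct authors gets an undirected link.
    void addPaper(List<String> authors) {
        for (String a : authors) {
            adjacency.computeIfAbsent(a, k -> new TreeSet<>());
        }
        for (int i = 0; i < authors.size(); i++) {
            for (int j = i + 1; j < authors.size(); j++) {
                String a = authors.get(i);
                String b = authors.get(j);
                if (a.equals(b)) {
                    continue; // ignore a name repeated within one paper
                }
                adjacency.get(a).add(b);
                adjacency.get(b).add(a);
            }
        }
    }

    // Returns the co-authors of the given author (empty if unknown).
    Set<String> neighbors(String author) {
        return adjacency.getOrDefault(author, Collections.emptySet());
    }

    // Each undirected link is stored twice, once per endpoint.
    int edgeCount() {
        int total = 0;
        for (Set<String> s : adjacency.values()) {
            total += s.size();
        }
        return total / 2;
    }

    public static void main(String[] args) {
        CoauthorGraph g = new CoauthorGraph();
        g.addPaper(Arrays.asList("A", "B", "C"));
        System.out.println("Co-authors of A: " + g.neighbors("A"));
        System.out.println("Number of links: " + g.edgeCount());
    }
}
```

Using a set per author means that two authors who write several papers together still share a single link, which matches the "one or more papers" definition used in this report.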

1.2 About Project

1.2.1 Problem Statement

The Netscience dataset, which represents a co-authorship network of
scientists working on network theory and experiment, is analyzed
to reveal hidden patterns. We also analyze live sample publication
data. The online sample publication data we have considered
is that of Balaraman Ravindran, Associate Professor at the Department of
Computer Science and Engineering at the Indian Institute of Technology
Madras. Both data sets are analyzed using social network analysis tools
and techniques to extract the hidden patterns. The facts captured can
be used by various funding agencies, and to award the most deserving
person among the co-authors of the papers.

1.2.2 Objective

The main objective of our project is the analysis of co-authorship
network data. As an initial step, we derive the exact author names from
the publication list; the social network analysis tools Gephi
and NodeXL are then used to perform the analysis. Firstly, centrality
measure analysis is carried out on the Netscience data and then on the
sample publication data to identify the most influential author using
different centrality metrics. Secondly, different community detection
algorithms are applied to both data sets in order to detect the
communities. Finally, the different community detection algorithms
are compared based on certain metrics.

Chapter 2

Literature Survey

A social network is a set of people or groups of people with some
patterns of contacts or interactions between them [1]. These interactions
can be friendship, business, scientific collaboration, or media, among
others. Formal analysis of these networks is called social network
analysis.

2.1 Co-authorship Network


Co-authorship is a formal manifestation of intellectual collaboration
in scientific research. Ideally, it represents the participation of two or
more authors in the production of a publishable study [2]. In the first
half of the twentieth century, scientific papers written by more than
one author were rare. In recent decades, there has been a growing trend
towards co-authorship in scientific publication. The trend has attracted
much attention from researchers, and there are many studies exploring the
incidence and causes of the phenomenon.

2.2 Centrality Measures


Eigenvector centrality is a measure of the influence a node has in a
network. It assigns relative scores to all nodes in the network based on the
well-known principle that connections to high-scoring nodes contribute
more to the score of the node in question than equal connections to
low-scoring nodes [3]. In general, connections to people who are themselves
influential will lend a person more influence than connections to
less influential people.
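
The principle can be sketched with power iteration, a standard way to approximate eigenvector centrality. The sketch below is an illustration under stated assumptions and is not the computation performed by Gephi or NodeXL in this project: the class name EigenvectorCentrality, the adjacency-list input format and the fixed iteration count are all hypothetical choices. Each node's own score is added to the sum over its neighbours (iterating with A + I instead of A), which leaves the ranking unchanged but avoids oscillation on bipartite graphs.

```java
import java.util.Arrays;

// Hypothetical helper class, used only for illustration.
class EigenvectorCentrality {
    // adj[i] lists the neighbours of node i in an undirected graph.
    static double[] compute(int[][] adj, int iterations) {
        int n = adj.length;
        double[] score = new double[n];
        Arrays.fill(score, 1.0); // start with equal scores

        for (int it = 0; it < iterations; it++) {
            double[] next = new double[n];
            for (int i = 0; i < n; i++) {
                // Own score plus neighbours' scores: neighbours with high
                // scores contribute more than neighbours with low scores.
                next[i] = score[i];
                for (int j : adj[i]) {
                    next[i] += score[j];
                }
            }
            // Normalise to unit length so the values stay bounded.
            double norm = 0.0;
            for (double v : next) {
                norm += v * v;
            }
            norm = Math.sqrt(norm);
            if (norm == 0.0) {
                return next; // empty graph
            }
            for (int i = 0; i < n; i++) {
                next[i] /= norm;
            }
            score = next;
        }
        return score;
    }

    public static void main(String[] args) {
        // Nodes 0, 1, 2 form a triangle; node 3 is linked only to node 0.
        int[][] adj = { {1, 2, 3}, {0, 2}, {0, 1}, {0} };
        System.out.println(Arrays.toString(compute(adj, 100)));
    }
}
```

In the small example, node 0 receives the highest score and node 3 the lowest, even though node 3 is attached to the most central node.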
There is ample research going on in the field of social network
analysis, and especially in the field of co-authorship networks. Many
types of analysis are being done on different networks of different sizes.
In this project we perform centrality analysis on the Netscience
data and also on the publication data [4], extracting the author names
from the publication list. This extraction process can be
extended to larger networks for further analysis.
Chapter 3

Project Design

The purpose of the design phase is to plan the solution to the problem
specified by the requirements document. This phase is the first step in
moving from the problem domain to the solution domain. In other words,
starting with what is needed, design takes us towards how to satisfy those
needs. The design of the system is perhaps the most critical factor
affecting the quality of the output and has a major impact on the later
phases.
System design aims to identify the modules that should be in the
system, their functions, and their interactions with each other to produce
the desired results. This chapter presents the High Level Design and a
brief description of the modules.

3.1 High Level Design


The block diagram shown in Fig. 3.1 is the high level design architecture
of the co-authorship network analysis.

Figure 3.1: High Level Design Overview of Co-authorship Network Analysis

Chapter 4

Implementation

The extraction of author names from the publication data is
implemented in Java. The Java program is given in
Appendix A. The implementation process is as follows:

• The process begins by reading a line from the publication file.

• The data up to the year is extracted. This extracted data is matched
against the name pattern.

• Once the names of the authors are fetched, the permutation of the
names is carried out.
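
The steps above can be sketched as follows. This is a hedged illustration and not the program in Appendix A: the class AuthorExtractor, both regular expressions, the assumed line format (author names such as "A. Kumar" placed before a four-digit year) and the "name;name" output format are all assumptions, and the permutation step is interpreted here as listing every pair of co-authors of a paper.

```java
import java.util.*;
import java.util.regex.*;

// Hypothetical helper class, used only for illustration.
class AuthorExtractor {
    // Assumption: the author list precedes the first four-digit year.
    private static final Pattern YEAR = Pattern.compile("\\b(?:19|20)\\d{2}\\b");
    // Assumed name pattern: optional initials followed by capitalised words.
    private static final Pattern NAME =
            Pattern.compile("(?:[A-Z]\\.\\s*)*[A-Z][a-z]+(?:\\s+[A-Z][a-z]+)*");

    // Returns the author names found before the year in one publication line.
    static List<String> extractAuthors(String line) {
        Matcher year = YEAR.matcher(line);
        // Without a year the whole line is scanned, so title words may match.
        String prefix = year.find() ? line.substring(0, year.start()) : line;
        List<String> authors = new ArrayList<>();
        Matcher name = NAME.matcher(prefix);
        while (name.find()) {
            authors.add(name.group().trim());
        }
        return authors;
    }

    // Lists every unordered pair of co-authors as "first;second".
    static List<String> coauthorPairs(List<String> authors) {
        List<String> pairs = new ArrayList<>();
        for (int i = 0; i < authors.size(); i++) {
            for (int j = i + 1; j < authors.size(); j++) {
                pairs.add(authors.get(i) + ";" + authors.get(j));
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Invented sample line; real publication lists may be formatted differently.
        String line = "A. Kumar, B. S. Rao and C. Das (2012). A sample title.";
        List<String> authors = extractAuthors(line);
        System.out.println(authors);
        for (String pair : coauthorPairs(authors)) {
            System.out.println(pair);
        }
    }
}
```

A pair list of this kind can be saved as a text file and loaded as an edge list by network analysis tools; the exact import format expected by a given tool should be checked against its documentation.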

Chapter 5

Results

5.1 Comparison of Community Detection Algorithms


Table 5.1 summarizes the metrics compared across the different community
detection algorithms.

Method                     Size    Internal Density
Spectral Clustering        410     1.99
Modularity Maximization    405     2.50

Table 5.1: Comparison of Community Detection Algorithms

As observed in Table 5.1, the Spectral Clustering algorithm generates
the largest number of communities. The internal density is highest
for the Modularity Maximization algorithm.
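
For context, the quantity that modularity-based methods try to maximize can be sketched as follows. The report does not state how the tools computed it, so this is an illustration of the commonly used definition, Q = sum over communities c of (e_c / m - (d_c / 2m)^2), where m is the number of links, e_c the number of links inside community c and d_c the total degree of its nodes. The class name Modularity and the input format are hypothetical. Note that this is not the internal density reported in Table 5.1.

```java
// Hypothetical helper class, used only for illustration.
class Modularity {
    // edges[k] = {u, v} is an undirected link between nodes u and v;
    // community[i] is the community label (0, 1, 2, ...) of node i.
    static double compute(int[][] edges, int[] community) {
        int m = edges.length;
        if (m == 0) {
            return 0.0;
        }
        int labels = 0;
        for (int c : community) {
            labels = Math.max(labels, c + 1);
        }
        double[] internal = new double[labels]; // links inside each community
        double[] degree = new double[labels];   // total degree of each community
        for (int[] e : edges) {
            int cu = community[e[0]];
            int cv = community[e[1]];
            degree[cu]++;
            degree[cv]++;
            if (cu == cv) {
                internal[cu]++;
            }
        }
        double q = 0.0;
        for (int c = 0; c < labels; c++) {
            double expected = degree[c] / (2.0 * m);
            q += internal[c] / m - expected * expected;
        }
        return q;
    }

    public static void main(String[] args) {
        // Two triangles (0,1,2) and (3,4,5) joined by the single link 2-3.
        int[][] edges = { {0, 1}, {1, 2}, {0, 2}, {3, 4}, {4, 5}, {3, 5}, {2, 3} };
        System.out.println(compute(edges, new int[] {0, 0, 0, 1, 1, 1}));
        System.out.println(compute(edges, new int[] {0, 0, 0, 0, 0, 0}));
    }
}
```

Placing each triangle in its own community gives a positive score, while placing all nodes in one community gives zero, which is why a modularity-maximizing method prefers the first partition.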
Fig. 5.1 shows the community size trend obtained with the CNM
(Clauset-Newman-Moore) algorithm. The graph shows that the community size
decreases as the community number increases: it is highest for the first
community, drops sharply at the second community, and decreases
gradually thereafter.

Figure 5.1: Community Size Variation

Chapter 6

Conclusion

In this project, we performed two of the most important tasks in social
network analysis: centrality analysis and community detection. The results
of our study can be used to enhance research collaborations. In the
initial phase, we analysed the Netscience data and the Balaraman
Ravindran publication data. Our findings are summarised as follows:

• The structural analysis of social networks is an effective way of
finding essential groups and authors.

• The open-source tools Gephi and NodeXL were used for
visualization of the network.

APPENDIX A

Java Program

The following is the code used to extract the author names and write
the result into a file.

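import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.regex.*;

public class dup
{
    public static void main(String[] args)
    {
        BufferedReader br = null;
        // Output file that receives the extracted author names.
        try (BufferedWriter bw = new BufferedWriter(new FileWriter("N.txt"))) {
            // ... (the remainder of the listing is not included in this copy)
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}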

References

[1] John Scott and Peter J. Carrington. The SAGE Handbook of Social Network Analysis.
SAGE Publications, 2011.

[2] Francisco José Acedo, Carmen Barroso, Cristóbal Casanueva, and José Luis Galán.
Co-authorship in management and organizational studies: An empirical and network
analysis. Journal of Management Studies, 43(5):957–983, June 2006.

[3] Amir Noori. On the relation between centrality measures and consensus algorithms.
In IEEE International Conference on High Performance Computing and Simulation
(HPCS), pages 225–232, Istanbul, July 2011.

[4] Balaraman Ravindran, Department of Computer Science & Engineering, IIT Madras.
http://www.cse.iitm.ac.in/~ravi/publications.html, 2013. [Online; accessed
April 2013].
