Team 3
A PROJECT REPORT
Submitted by
IN
INFORMATION TECHNOLOGY
LOYOLA-ICAM
COLLEGE OF ENGINEERING AND TECHNOLOGY
CHENNAI-600034
BONAFIDE CERTIFICATE
Certified that this project “GIT VISUALIZER” is the bonafide work of DANIEL
THOMAS J(311120205020), VISHAL M(311120205061), MOHAMED
RAMEEZ N(311120205038) who carried out the project work under my
supervision.
SIGNATURE
SIGNATURE
First of all, we are grateful to God for granting us the opportunity and the capability
to proceed successfully. Working on this project has been a rewarding experience.
We would like to express our sincere thanks and gratitude to our Director, Rev. Dr
Maria Wenisch SJ, and the Dean of Students, Rev. Dr Justine SJ, for
making the right resources available at the right time.
We would also like to thank the project coordinator, Dr. A Janani, M.S., Ph.D., and the
faculty and technical staff of our department for their critical advice and guidance, which
were helpful in completing the project.
Git is the most commonly used version control system. Git tracks the changes you
make to files, so you have a record of what has been done, and you can revert to
specific versions should you ever need to. Git also makes collaboration easier.
However, there is a lack of proper tools to efficiently analyze and visualize Git commits, which makes it hard to
backtrack and debug complex projects with an enormous number of commits. We propose a web
application that asks for the URL of the user's GitHub repository. When the URL is provided, it
renders a visualization of the Git history in an organized, tree-like structure.
The Web App offers features like Searching for a commit or a branch, customizing
LIST OF FIGURES
1 BULL DIAGRAM
2 OCTOPUS DIAGRAM
3 DFD 0
4 DFD 1
5 DFD 2
7 CLASS DIAGRAM
8 SEQUENCE DIAGRAM
9 COLLABORATION DIAGRAM
11 ACTIVITY DIAGRAM
12 COMPONENT DIAGRAM
13 DEPLOYMENT DIAGRAM
14 PACKAGE DIAGRAM
15 SYSTEM ARCHITECTURE
TABLE OF CONTENTS
ABSTRACT
LIST OF FIGURES
LIST OF ABBREVIATIONS
1 INTRODUCTION
1.1 Introduction to Domain
1.2 Overview of the Project
1.3 Purpose of the Project
1.4 Project Plan
1.5 Scope of the Project
1.6 Summary
2 LITERATURE SURVEY
2.1 Literature Survey
2.2 Summary
3 SYSTEM ANALYSIS AND DESIGN
3.1 Problem Definition
3.4.3 DFD-2
3.5 UML Diagram
4 SYSTEM REQUIREMENTS
SAMPLE CODE
REFERENCE
CHAPTER 1
1. INTRODUCTION
1.1 INTRODUCTION TO DOMAIN
This project deals with visualizing the history and changes of Git repositories. Git is a popular version
control system used by software developers to manage code changes, collaborate with other developers,
and track project progress. The project's domain involves understanding Git concepts such as commits,
branches, and file changes, as well as how to parse and process Git log files. It also involves
understanding web development concepts such as creating an API, transforming and filtering data, and
using D3.js for data visualization.
The project's target audience is software developers and anyone else who
needs to understand the history and changes of a Git repository. This may include project managers,
stakeholders, or anyone involved in the development process.
1.2 OVERVIEW OF THE PROJECT
The Git repository visualization tool is a web application that allows users to visualize the history and
changes of a Git repository. The tool consists of three main components: data processing, web API, and
visualization.
The data processing component is responsible for parsing and processing the Git log file. It
extracts information such as commit messages, authors, dates, and file changes, and performs data
cleaning and transformation as needed. The output of this component is a JSON data file that contains
the information needed for visualization.
The web API component provides access to the JSON
data file. It exposes endpoints that let users retrieve the data they need, for example a filtered
subset of commits.
The visualization component uses D3.js, a data visualization library, to render
interactive visualizations of the Git repository data.
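As a rough illustration of the data processing step described above, the following Go sketch parses log text into commit records and serializes them to JSON. It assumes the log was produced with a custom format such as `git log --pretty=format:%H%x1f%an%x1f%aI%x1f%s`, which puts one commit on each line with fields separated by the unit separator byte (0x1f). The `Commit` type and the `parseLog` function are hypothetical names chosen for this example and are not the project's actual parser.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// Commit is a hypothetical record for one parsed log entry.
type Commit struct {
	ID      string `json:"commit_id"`
	Author  string `json:"author"`
	Date    string `json:"date"`
	Message string `json:"message"`
}

// parseLog reads one commit per line, splitting on the unit separator
// (0x1f). Lines that do not have exactly four fields are skipped.
func parseLog(log string) []Commit {
	commits := []Commit{}
	scanner := bufio.NewScanner(strings.NewReader(log))
	for scanner.Scan() {
		fields := strings.Split(scanner.Text(), "\x1f")
		if len(fields) != 4 {
			continue
		}
		commits = append(commits, Commit{
			ID:      fields[0],
			Author:  fields[1],
			Date:    fields[2],
			Message: fields[3],
		})
	}
	return commits
}

func main() {
	// Sample input in the assumed format; real input would come from git.
	sample := "a1b2c3\x1fAlice\x1f2023-04-01T10:00:00+05:30\x1fInitial commit\n" +
		"d4e5f6\x1fBob\x1f2023-04-02T11:30:00+05:30\x1fAdd log parser\n"
	out, err := json.MarshalIndent(parseLog(sample), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The `commit_id` JSON key is chosen to match the accessor used by the graph rendering code in the sample code section; the other keys are illustrative.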
1.4 PROJECT PLAN
The project plan starts with a requirement analysis stage, in which the Git repository is
analyzed, followed by a project overview that determines the total scope.
The back-end design is done in parallel with the API and wireframe design. The
front-end design comes near the end of the project cycle and is followed by
integration and deployment.
1.5 SCOPE OF THE PROJECT
The scope of this project is to create a visualization tool for Git repositories that allows users to explore
the commit history and file changes of the repository. The tool takes a Git log file as input,
processes it in the back-end, and visualizes it in the front-end. It consists of the following components:
Git log parser - a back-end component that parses the Git log file and extracts information about
commits, files, and changes.
Data transformation and filtering - a back-end component that transforms the parsed data into a JSON
format that can be used for visualization and filters the data based on user input.
Web API - a back-end component that provides a web API for the front-end to interact with and retrieve
the transformed data.
Visualization component - a front-end component that uses D3.js to create interactive visualizations of
the Git repository data.
1.6 SUMMARY
The Git repository visualization project aims to provide an interactive and intuitive way for users to
explore the history and changes of a Git repository. The project includes the development of a data
processing tool, web API, and front-end visualization using D3.js.
CHAPTER 2
"CodeFlower: An Interactive Visualization of Code Repositories" by Peter Van Dijck: This paper
describes CodeFlower, a tool that visualizes code repositories as interactive flowers. The tool uses
color-coded petals to represent different files and branches, and allows users to navigate and explore the
repository.
"Gitana: Interactive Mining and Visualization of Git Repositories" by Nicolas Bettenburg, Sergii Shkliar,
and Alberto Bacchelli: This paper describes Gitana, a tool for mining and visualizing Git repositories.
The tool allows users to explore the repository's history, visualize code changes, and identify patterns
and trends.
"GitDiver: Interactive Differencing and Merging of Git Repositories" by John O'Donovan, James
Hennessey, and Brian Mac Namee: This paper describes GitDiver, a tool for interactively diffing and
merging Git repositories. The tool provides a visual interface for comparing and merging code changes,
and allows users to identify conflicts and resolve them.
"Gitcharts: A Visualization Tool for Git Repository Commits" by Jitendra Singh and D. Durga Bhavani:
This paper describes Gitcharts, a tool that visualizes Git repository commits as charts. The tool provides
an overview of the repository's history, and allows users to drill down into individual commits to view
details and associated files.
2.2 SUMMARY
The literature survey focused on existing tools and techniques for visualizing Git repositories. Several
papers were identified that described tools for visualizing repository history, code changes, and patterns
and trends. Some of these tools used interactive visualizations, such as flowers, charts, timelines, and
trees, to enable users to explore the data and identify important information.
CHAPTER 3
3.2 NEED ANALYSIS
3.2.1 BULL DIAGRAM
3.3 FUNCTIONAL ANALYSIS
3.3.1 OCTOPUS DIAGRAM
Principal Functions
PF1: Render the output as a website for the user
PF2: Allow the user to view the log up to a particular time
Constraint Functions
CF1: To be user friendly
CF2: To create an interactive UI
CF3: To allow customization of rendered output
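Principal function PF2, viewing the log up to a particular time, amounts to a cutoff filter over commit dates. The sketch below shows one way this might look in Go, assuming commit dates are available in RFC 3339 form; the `Commit` type and the helper names are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// Commit is a hypothetical record holding a commit ID and its date.
type Commit struct {
	ID   string
	Date time.Time
}

// commitsUntil keeps the commits whose date is not after the cutoff,
// so a commit made exactly at the cutoff is included.
func commitsUntil(commits []Commit, cutoff time.Time) []Commit {
	out := []Commit{}
	for _, c := range commits {
		if !c.Date.After(cutoff) {
			out = append(out, c)
		}
	}
	return out
}

// mustParse parses an RFC 3339 timestamp and panics on bad input.
// It is only meant for the fixed sample values in this sketch.
func mustParse(s string) time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return t
}

func main() {
	commits := []Commit{
		{ID: "a1b2c3", Date: mustParse("2023-04-01T10:00:00Z")},
		{ID: "d4e5f6", Date: mustParse("2023-04-02T11:30:00Z")},
		{ID: "0789ab", Date: mustParse("2023-04-05T09:15:00Z")},
	}
	cutoff := mustParse("2023-04-03T00:00:00Z")
	for _, c := range commitsUntil(commits, cutoff) {
		fmt.Println(c.ID, c.Date.Format(time.RFC3339))
	}
}
```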
3.4.1 DFD-0
3.4.2 DFD-1
3.4.3 DFD-2
3.5 UML DIAGRAM
3.5.1 USE CASE DIAGRAM
Figure 6
A use case diagram is a representation of a user's interaction with the system that shows the relationship between the user and the
different use cases in which the user is involved.
3.5.2 CLASS DIAGRAM
Figure 7
A class diagram is a type of static structure diagram that describes the structure of a system by showing the system's
classes, their attributes, operations (or methods), and the relationships among objects.
3.5.3 SEQUENCE DIAGRAM
Figure 8
A sequence diagram shows object interactions arranged in time sequence. It depicts the sequence of messages exchanged
between the objects to carry out the functionality of the scenario.
3.5.4 COLLABORATION DIAGRAM
Figure 9
A Collaboration diagram models the interactions between objects or parts in terms of sequenced messages.
3.5.5 STATE CHART DIAGRAM
Figure 10
A state diagram shows the behaviour of classes in response to external stimuli. Specifically, it describes the
behaviour of a single object in response to a series of events in a system.
3.5.6 ACTIVITY DIAGRAM
Figure 11
Activity diagrams are graphical representations of workflows of stepwise activities and actions with support for choice,
iteration and concurrency.
3.5.7 COMPONENT DIAGRAM
Figure 12
A component diagram depicts how components are wired together to form larger components or software systems.
They are used to illustrate the structure of arbitrarily complex systems.
3.5.8 DEPLOYMENT DIAGRAM
Figure 13
A deployment diagram models the physical deployment of artifacts on nodes. It shows what hardware
components exist, what software components run on each node, and how the different pieces are connected.
3.5.9 PACKAGE DIAGRAM
Figure 14
3.6 SUMMARY
All the UML diagrams for the system are drawn displaying various objects, functions and classes
of the system.
CHAPTER 4
SYSTEM REQUIREMENTS
Software Requirements
Version Control System: A Git repository should be available for the project so that the tool can access
and process the data.
Backend Framework: The project requires a backend framework to process the data and create the
JSON output that is passed to the frontend.
Database: A database is required to store and retrieve the processed data. A relational database like
PostgreSQL, MySQL or SQLite could be used.
Frontend Framework: The project requires a frontend framework to create the visualizations. Some
possible frameworks include D3.js and React.
Text Editor or IDE: Developers require a text editor or IDE to edit the source code.
Git Client: A Git client such as GitKraken or GitHub Desktop can be used to clone repositories and to
check out and switch branches.
Operating System: The project can be developed and deployed on any operating system; the choice
depends on the developer's preference and familiarity. Possible operating systems include
Windows, macOS, and Linux.
Web Server: A web server is required to serve the frontend and backend parts of the application.
Popular web servers include Apache and Nginx.
Functional Requirements:
Import Git repository data: The tool should be able to import data from Git repositories, including commit
history, file changes, and branch information.
Visualize repository history: The tool should provide a visualization of the repository's history, showing the
branching and merging of code, and allowing users to drill down to view individual commits.
Visualize code changes: The tool should enable users to view changes to specific files over time,
highlighting additions, deletions, and modifications.
Identify patterns and trends: The tool should enable users to identify patterns and trends in the repository's
history, such as the frequency of commits, the most active contributors, and the most common types of
changes.
Filter and sort data: The tool should allow users to filter and sort data based on various criteria, such as
time period, contributor, file type, or keyword.
Export data: The tool should enable users to export the visualizations and underlying data in a variety of
formats, such as JSON or CSV.
User management: The tool should provide user management functionality, such as authentication and
authorization, to ensure that only authorized users can access the repository data.
Responsive design: The tool should have a responsive design that is optimized for desktop and mobile
devices, allowing users to access the visualizations from anywhere.
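Two of the requirements above, identifying the most active contributors and exporting data as CSV, can be sketched together. The Go example below counts commits per author, sorts the result by commit count, and renders it as CSV with the standard `encoding/csv` package. The types, function names, and column headers are assumptions for illustration only.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// Commit is a hypothetical record with just the fields needed here.
type Commit struct {
	ID     string
	Author string
}

// AuthorCount pairs an author with the number of commits they made.
type AuthorCount struct {
	Author  string
	Commits int
}

// commitsPerAuthor counts commits per author and sorts the result by
// descending commit count, breaking ties alphabetically by author.
func commitsPerAuthor(commits []Commit) []AuthorCount {
	counts := map[string]int{}
	for _, c := range commits {
		counts[c.Author]++
	}
	result := make([]AuthorCount, 0, len(counts))
	for author, n := range counts {
		result = append(result, AuthorCount{Author: author, Commits: n})
	}
	sort.Slice(result, func(i, j int) bool {
		if result[i].Commits != result[j].Commits {
			return result[i].Commits > result[j].Commits
		}
		return result[i].Author < result[j].Author
	})
	return result
}

// toCSV renders the counts as CSV text with a header row.
func toCSV(rows []AuthorCount) (string, error) {
	var sb strings.Builder
	w := csv.NewWriter(&sb)
	if err := w.Write([]string{"author", "commits"}); err != nil {
		return "", err
	}
	for _, r := range rows {
		if err := w.Write([]string{r.Author, strconv.Itoa(r.Commits)}); err != nil {
			return "", err
		}
	}
	w.Flush()
	return sb.String(), w.Error()
}

func main() {
	commits := []Commit{
		{ID: "a1", Author: "Bob"},
		{ID: "b2", Author: "Alice"},
		{ID: "c3", Author: "Alice"},
	}
	out, err := toCSV(commitsPerAuthor(commits))
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

Sorting with an explicit tie-break keeps the output stable, since Go map iteration order is not deterministic.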
4.2 NON FUNCTIONAL REQUIREMENTS
Performance: The tool should be able to handle large amounts of data and provide visualizations in a
timely manner, even for complex repositories with many files and branches.
Usability: The tool should be intuitive and user-friendly, with clear instructions and help documentation
to guide users through the visualization process.
Accessibility: The tool should be accessible to users with disabilities, including those who use screen
readers or other assistive technologies.
Security: The tool should use industry-standard security protocols, such as HTTPS (HTTP over TLS), to protect
user data and prevent unauthorized access.
Compatibility: The tool should be compatible with a range of operating systems, browsers, and
devices, to ensure that users can access the visualizations from anywhere.
Scalability: The tool should be scalable, allowing it to grow and adapt to changing user needs and
evolving technologies.
Reliability: The tool should be reliable and stable, with minimal downtime or errors, to ensure that
users can access the visualizations when they need them.
Customizability: The tool should be customizable, allowing users to tailor the visualizations to their
specific needs and preferences.
CHAPTER 5
SYSTEM IMPLEMENTATION
The system consists of three main components:
Git Repository: This component represents the Git repository that the tool will analyze and visualize. It
contains all of the code, commits, and other data that the tool will use to generate the visualizations.
Backend: This component represents the backend of the system, which processes the data from the Git
repository and generates the JSON output that is passed to the frontend. It includes a web server, a
database, and a backend framework (e.g., Django, Flask, or Express.js) to handle data processing and
management.
Frontend: This component represents the frontend of the system, which generates the visualizations that
users will see. It includes a frontend framework (e.g., D3.js, React) to handle data visualization and user
interface design.
The components communicate with each other through an API, which allows the frontend to request
data from the backend and receive JSON responses. The web server serves both the backend and
frontend components to users, and the database stores processed data for quick retrieval.
Overall, this system architecture is designed to provide a scalable, reliable, and customizable tool for
visualizing Git repositories.
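To make the JSON exchange between backend and frontend more concrete, the sketch below builds a node-and-link payload of the kind a D3 force-directed graph consumes. It assumes each commit record carries the IDs of its parent commits (which `git log` can emit with the `%P` placeholder); the `Graph` shape and the `buildGraph` function are hypothetical. The `source` and `target` keys are the properties D3's `forceLink` reads from each link, and `commit_id` matches the ID accessor used in the graph rendering sample code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Commit is a hypothetical node: one commit and the IDs of its parents.
type Commit struct {
	ID      string   `json:"commit_id"`
	Parents []string `json:"parents"`
}

// Link is an edge from a parent commit to a child commit.
type Link struct {
	Source string `json:"source"`
	Target string `json:"target"`
}

// Graph is the assumed payload sent to the frontend.
type Graph struct {
	Nodes []Commit `json:"nodes"`
	Links []Link   `json:"links"`
}

// buildGraph creates one link per parent relationship. Parents that
// are not present in the commit list are skipped, so every link
// refers to a node that exists in the payload.
func buildGraph(commits []Commit) Graph {
	known := make(map[string]bool, len(commits))
	for _, c := range commits {
		known[c.ID] = true
	}
	links := []Link{}
	for _, c := range commits {
		for _, p := range c.Parents {
			if known[p] {
				links = append(links, Link{Source: p, Target: c.ID})
			}
		}
	}
	return Graph{Nodes: commits, Links: links}
}

func main() {
	commits := []Commit{
		{ID: "c1"},
		{ID: "c2", Parents: []string{"c1"}},
		{ID: "c3", Parents: []string{"c1"}},
		{ID: "c4", Parents: []string{"c2", "c3"}}, // merge commit
	}
	out, err := json.Marshal(buildGraph(commits))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

A merge commit simply has two parents and therefore two incoming links, which is what lets the frontend draw branching and merging.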
Code reviewers: Code reviewers could use the tool to quickly identify changes made to code files over
time, as well as the authors of those changes. This would help them identify potential issues or conflicts
in the code and provide more targeted feedback to developers.
Project managers: Project managers could use the tool to gain insights into the development process,
such as how many commits were made during a given time period or which files were modified the
most. This information could help them better allocate resources, set realistic timelines, and identify
potential risks.
Developers: Developers could use the tool to gain a deeper understanding of the code they are working
on, such as which files are most frequently modified or which branches have the most changes. This
would help them make more informed decisions about how to approach their work and prioritize their
tasks.
Technical writers: Technical writers could use the tool to gain a better understanding of the
documentation associated with a project. By analyzing the commit history and file changes, they could
identify when documentation was added, updated, or deleted, and ensure that the documentation stays
up-to-date with the code changes.
Researchers: Researchers could use the tool to analyze code repositories and gain insights into software
development practices, such as how frequently developers commit changes or which files are most
commonly modified. This information could be used to improve software development processes or
inform academic research.
5.3 SCREENSHOTS
5.4 SUMMARY
This chapter described the implementation of the Git repository visualization tool: the architecture
built from the Git repository, the backend, and the frontend, and the ways in which different kinds of users can benefit from the tool.
CHAPTER 6
6.1 CONCLUSION
The Git repository visualization project aims to provide a user-friendly and informative way to
visualize the commit history and file changes in a Git repository. The system consists of three main
components: the Git repository, the backend, and the frontend. The Git repository provides the necessary
data, which is then processed by the backend component and stored in a database. The frontend
component generates interactive visualizations based on the retrieved data.
Overall, the Git repository visualization project provides a useful tool for developers and project
managers to better understand the history and evolution of their Git repositories. With the ability to
visualize commit history and file changes, developers can more easily track progress, identify trends,
and make informed decisions about the development of their projects.
Integration with more Git hosting platforms: Currently, the system only supports local Git
repositories, but it could be enhanced to support Git hosting platforms such as GitHub, GitLab, or
Bitbucket.
Improved file diff visualization: The current visualization shows file changes over time, but it
could be enhanced to provide a more detailed view of changes made to specific lines of code.
Support for more file types: Currently, the system only supports text-based files, but it could be
enhanced to support other file types such as images, videos, and audio files.
Real-time updates: The system could be enhanced to provide real-time updates as new commits
are made to the repository, instead of relying on periodic updates.
Collaboration features: The system could be enhanced to support collaboration features such as
commenting on commits, suggesting changes, and assigning tasks to team members.
SAMPLE CODE
UI DESIGN
import { Avatar, Box, ChakraProvider, HStack } from "@chakra-ui/react";
import {
Center,
Image,
VStack,
Text,
InputGroup,
Input,
InputLeftElement,
Button,
Spinner,
} from "@chakra-ui/react";
import { LinkIcon } from "@chakra-ui/icons";
import logo from "./github_icon.png";
import { useState } from "react";
import RenderPage from "./RenderPage";
import Login from "./Login";
import ScrollToBottomButton from "./ScrollToBottomButton";
// Assumption: the opening of this click-handler factory is not shown in the listing;
// the wrapper below is inferred from the call onClickFactory(setHome, setData) in App.
const onClickFactory = (setHome, setData) => {
const onClick = () => {
// Assumed lookup of the <Input id="url" /> field rendered by App.
const target = document.getElementById("url");
if (target.value) {
// Assumed: hide the home screen so the spinner shows while the log loads.
setHome(true);
fetch("http://localhost:8080/log?url=" + encodeURIComponent(target.value), {
method: "GET",
headers: {
Accept: "application/json",
},
})
.then((response) => response.json())
.then((res) => {
setData(res);
console.dir(res);
});
// setTimeout(()=>setData([]), 2000)
}
}
return onClick;
};
function App() {
const [value, setValue] = useState("");
const [hideHome, setHome] = useState(false);
const [data, setData] = useState(undefined);
return (
<ChakraProvider>
{!value ? (
<Login value={value} setValue={setValue} />
):(
<>
<HStack
position="absolute"
right="25px"
top="25px"
border="1px solid gray"
borderRadius="25px"
padding="10px"
>
<Avatar
name={localStorage.getItem("name")}
size="sm"
src="https://bit.ly/broken-link"
/>
<Text>{localStorage.getItem("name")}</Text>
</HStack>
<ScrollToBottomButton />
<Center h="100vh" w="100vw" hidden={hideHome}>
<VStack spacing="3">
<Image boxSize="100px" src={logo} />
<Text width="100%" fontSize="large">
Enter the GitHub URL
</Text>
<InputGroup width="300px">
<InputLeftElement
pointerEvents="none"
children={<LinkIcon color="gray.300" />}
/>
<Input id="url" />
</InputGroup>
<Button
colorScheme="teal"
variant="outline"
alignSelf="start"
onClick={onClickFactory(setHome, setData)}
>
Go
</Button>
</VStack>
</Center>
<Box hidden={!hideHome}>
{!data ? (
<Center h="100vh">
<Spinner
thickness="4px"
speed="0.65s"
emptyColor="gray.200"
color="blue.500"
size="xl"
/>
</Center>
):(
<RenderPage data={[...data].reverse()}></RenderPage>
)}
</Box>
</>
)}
</ChakraProvider>
);
}
GRAPH RENDER
import * as d3 from "d3";
import { useEffect, useRef } from "react";
let data;
const GitGraph = ({ datum }) => {
const svgRef = useRef(null);
useEffect(() => {
data = datum; // keep the commits passed in by the parent for the helpers below
}, []);
useEffect(() => {
const margin = { top: 0, right: 20, bottom: 30, left: 40 };
const width = 800 - margin.left - margin.right;
const height = 600 - margin.top - margin.bottom;
const svg = d3
.select(svgRef.current)
.attr("width", width + margin.left + margin.right)
.attr("height", height + margin.top + margin.bottom)
.append("g")
.attr("transform", "translate(" + margin.left + "," + margin.top + ")");
const simulation = d3
.forceSimulation(data)
.force(
"link",
d3
.forceLink()
.id((d) => d.commit_id)
.distance(60)
)
.force("charge", d3.forceManyBody().strength(-150))
.force("center", d3.forceCenter(width / 2, height / 2));
function getNodeById(id) {
return data.find((node) => node.commit_id === id);
}
function drag(simulation) {
function dragstarted(event) {
if (!event.active) simulation.alphaTarget(0.3).restart();
event.subject.fx = event.subject.x;
event.subject.fy = event.subject.y;
}
function dragged(event) {
event.subject.fx = event.x;
event.subject.fy = event.y;
}
function dragended(event) {
if (!event.active) simulation.alphaTarget(0);
event.subject.fx = null;
event.subject.fy = null;
}
return d3
.drag()
.on("start", dragstarted)
.on("drag", dragged)
.on("end", dragended);
}
}, [svgRef]);
// console.log(data);
return <svg width="800" height="600" ref={svgRef}></svg>;
};
LOG PARSING
package main
import (
"bufio"
"bytes"
"strings"
"encoding/json"
"fmt"
"net/http"
"os"
"os/exec"
"path"
"github.com/gin-contrib/cors"
"github.com/gin-gonic/gin"
)
func main() {
router := gin.Default()
router.Use(cors.Default())
router.GET("/log", getLogs)
router.Run(":8080")
// url := "https://github.com/lppedd/idea-conventional-commit"
// jsonStr := retrieveLogFromRepo((url))
}
fmt.Println("Download success...")
fmt.Println("Changing working directory")
fmt.Println("Extracting log")
return jsonLog
}
parsedLog := []map[string]string{}
tmpData := make(map[string]string)
buffer := []byte{}
scanner := bufio.NewScanner(bytes.NewReader(logP))
newLine := []byte("\n")
currentIndex := -1
// i:=0
for scanner.Scan() {
line := scanner.Bytes()
// if i<50 {
// fmt.Printf("%v\n",string(line))
// i++
// }
if !bytes.Equal(line, byteKeys[currentIndex+1]) {
buffer = append(buffer, line...)
buffer = append(buffer, newLine...)
continue
}
currentIndex += 1
if currentIndex == 0 {
continue
}
if currentIndex == 7 {
parsedLog = append(parsedLog, tmpData)
tmpData = make(map[string]string)
currentIndex = 0
}
}
tmpData["body"] = string(buffer)
parsedLog = append(parsedLog, tmpData)
return string(jsonStr)
}
REFERENCE
[1] "GitStory: A Visualization Tool for Git Repositories" by A. Wahid, S. Saeed, and S. Saeed, in the
Proceedings of the 9th International Conference on Computer and Automation Engineering (ICCAE 2017).
[2] "Visualizing Git: A Study of Commits and Branching Strategies" by R. Oechsle and L. Schmieder, in the
Proceedings of the 18th International Conference on Software Engineering and Knowledge Engineering (SEKE
2006).
[3] "Visualizing Git Repositories Using PyGit2 and D3.js" by J. Boner, in the Journal of Open Source
[4] "GitVisual: A Web-Based Visualization Tool for Git Repositories" by D. Talbot, A. Shtukaturov, and C.
Bird, in the Proceedings of the 10th IEEE Working Conference on Mining Software Repositories (MSR 2013).