Group File
Bachelor of Technology
in
Computer Science & Engineering
Name of the Student(s):
Mahesh Sharma, 2005110100061 (CSE)
Gargi Teotia, 2005110100036 (CSE)
Anjali Gautam, 2105111529002 (CSE-AI)
Mukesh, 2005110100070 (CSE)
Name of the Supervisor:
Mrs. Nandini Sharma, Assistant Professor (CSE)
Session: 2023-2024
Declaration
We hereby certify that the work presented in the B.Tech. Project Report entitled
“Chat Lens”, in partial fulfillment of the requirements for the degree of Bachelor of
is an authentic record of our own work carried out during a period from 13/05/2024 to
25/05/2024 under the supervision of Mrs. Nandini Sharma, Assistant Professor in the
The matter presented in this project report, in full or in part, has not been submitted by us for
the award of any other degree elsewhere and is free from plagiarism.
Gargi Teotia
2005110100036
Mahesh Sharma
2005110100061
Anjali Gautam
2105111529002
Mukesh
2005110100070
Certificate
This is to certify that the Project report entitled “Chat Lens” done by Mahesh
Institutions, Mathura under my guidance. The matter embodied in this project work has
not been submitted earlier for the award of any degree or diploma to the best of my
Place: Mathura
Signature Signature
Acknowledgement
The merciful guidance bestowed on us by the Almighty helped us see this project through
to a successful end. We humbly and sincerely pray for this guidance to continue
forever.
We thank our project guide, Mrs. Nandini Sharma, who gave us guidance
and direction during this project. Her versatile knowledge helped us at critical
times over the span of this project.
We offer special thanks to our Head of Department, Dr. Ramakant Baghel, who has
always been present as a support and helped us in every possible way during this project.
We also take this opportunity to express our gratitude to all those who were
directly or indirectly with us during the completion of the project.
We want to thank our friends, who always encouraged us during this project.
Last but not least, we thank all the faculty of the CSE and CSE-AI departments, who
provided valuable suggestions during the project.
Abstract
Chat Lens is a web-based service that collects and analyzes chat histories from the
mobile messaging application WhatsApp; it can also analyze bank data to extract
insights from it. It leverages the e-mail export feature of WhatsApp to obtain the
chat histories, which cannot be accessed otherwise due to encrypted storage on the
Thus, the major asset of the service is that real communication data can be collected
Analysis of the collected communication data provides valuable insights into
the communication in WhatsApp and the resulting network traffic. To incentivize users
communication data.
Moreover, it provides valuable insights into various patterns, such as: Which job types of
customers are likely to subscribe to a term deposit? Are single or married people more
likely to subscribe to a term deposit? Does education have any effect on subscription to a term
deposit?
TABLE OF CONTENTS
Declaration
Certificate
Acknowledgement
Abstract
Table of Contents
List of Figures
Chapter 1. Introduction
1.1 Preliminaries
1.2 Motivation
1.3 Problem Statement
1.4 Aim and Objectives
Chapter 4. Implementation
4.1 Introduction
4.2 Implementation Strategy
4.3 Tools/Hardware/Software Requirements
4.4 Expected Outcome (Performance metrics with details)
5.1 Result
5.2 Discussion
Chapter 6. Conclusion & Future Scope
6.1 Conclusion
6.2 Future Scope
References
LIST OF FIGURES
Chapter 1
INTRODUCTION
1.1 PRELIMINARIES
WhatsApp is one of the most widely used messaging applications in the
world. Group chats have become an essential tool for
communication, with people using them for personal, educational, and
business purposes. The amount of data generated by these group chats,
and the data produced in a bank, can be overwhelming, making it difficult
to extract meaningful insights and patterns. To overcome this challenge,
we have created “Chat Lens”, which uses data processing to extract valuable
information from these conversations and data sheets. The tool can
provide insights into the topics discussed, frequently used keywords, which
people are more or less likely to subscribe to a term deposit, and the
sentiment of the messages exchanged. Chat Lens can be useful in
various domains, such as education, business, and social settings. In
education, instructors can analyze student group chats to identify topics
of interest and monitor engagement. In business, managers can analyze
group chats to identify areas for improvement and evaluate team
communication. In social settings, individuals can use the tool to analyze
their chat history and gain insights into their communication patterns.
This report presents a comprehensive overview of Chat Lens and its
applications. It provides an in-depth analysis of the WhatsApp chats and bank
data used to extract insights, and of the challenges associated with analyzing
WhatsApp group chats and bank data.
1.2 MOTIVATION
The development of a Chat Lens stems from a profound motivation
rooted in the recognition of the significance of communication and data
in both personal and professional spheres. Understanding communication
patterns, sentiments, and collaboration dynamics is crucial for personal
development, team efficiency, and overall well-being. The motivations
behind this project can be categorized as follows:
The authors aim to develop a complete interface in which users can
choose whether to analyze a WhatsApp chat or
bank data. After choosing between the two, the user uploads either the
WhatsApp chat in text format, obtained by exporting the chat from WhatsApp, or
the bank’s CSV file. When the user selects ‘social media chat
analyzer’, the interface provides two options for studying the chats. On
submitting the chat, the engine displays the complete report with
interactive graphs that are easy to understand. When the user selects
‘Bank Data Analysis’, the user gets an in-depth view of which groups of
people (educated or not, married or single, etc.) are more or less
likely to subscribe to a term deposit. The report will
include the following analyses:
1. Top Statistics
2. Activity Timelines and Maps
3. Word Cloud
4. Most Common Words
5. Emoji Analysis
6. Sentiment Analysis
7. Does education have any effect on subscription to a term deposit?
8. Are single or married people more likely to subscribe to a term deposit?
9. Which job types of customers are likely to subscribe to a term deposit?
1.4.1 AIM
The primary aim of Chat Lens is to give users a tool that
not only extracts and organizes chat data and employee data
but also offers meaningful insights into their
communication and financial habits. By leveraging data
analytics and natural language processing, the analyzer seeks to
give users a deeper understanding of the emotional
nuances, engagement dynamics, and prevalent topics within their
conversations.
1.4.2 OBJECTIVE
In this decade, emerging technologies depend mainly on
data. Such data can be obtained only through research
focused on the requirements of the tool at hand. Many
machine learning enthusiasts develop models to solve a wide
range of problems, so the demand for appropriate data is very
large. This project aims to provide a better understanding
of various types of chats. The resulting analysis can serve as better
input to machine learning models that explore chat
data, since these models need proper learning instances
to achieve good accuracy. Our project
provides an in-depth exploratory data analysis of various types of
chats.
Chapter 2
LITERATURE SURVEY
2.1 INTRODUCTION
A survey on WhatsApp Chat Exploratory Data Analysis [1] was
reviewed, along with related studies and analyses. These
studies note that WhatsApp is the most used mode of
communication and an efficient one. It hosts many
conversations, both in groups and between individuals, which may contain hidden
facts. This project takes those chats and provides a deep analysis
of the data. Whatever the topic of the chats, it provides the analysis in an
efficient and accurate way.
Another survey, WhatsApp Chat Analyzer [2], states that the most used and
efficient method of communication in recent times is the application
WhatsApp. WhatsApp chats consist of various kinds of
conversations held among groups of people, covering various
topics. This information can provide a large amount of data for recent technologies
such as machine learning.
Chinthapanti Bharath Sai Reddy and co-authors, in their research paper
Analysing and Predicting the Emotion of WhatsApp Chats [3], observe that
everyone is curious about what the other person thinks of them
during a conversation, and that judging the other person cannot be done
perfectly. Their paper therefore offers an approach based on sentiment analysis
of the conversation.
While chatting with another person, we often wonder about our
image in the other person’s mind. The process preprocesses
the data obtained from a WhatsApp chat that is exported to a server,
applies sentiment analysis to each message, normalizes the
sentiment of all messages using a proposed method, and derives the overall
sentiment.
Chapter 3
PROPOSED METHODOLOGY
3.2 SYSTEM ANALYSIS AND DESIGN
3.2.2.2 ACTIVITY DIAGRAM
In the activity diagram, the flow begins when the user
uploads the file as input. In the
next action, the time format is selected.
The decision box “check chat format” represents the
validation of the selected time format against the file.
If the time format is correct, the analysis is performed
and the process ends.
If the time format is wrong, the user has to
check again for the correct format.
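
The format check in the decision box can be sketched roughly as follows. This is only an illustration: the two line prefixes below are assumed examples of 24-hour and 12-hour exports (real exports vary by device and locale), and the names PATTERNS and chat_matches_format are hypothetical.

```python
import re

# Assumed line prefixes for an exported chat; real exports vary by device and locale.
PATTERNS = {
    "24-hour": re.compile(r"^\d{1,2}/\d{1,2}/\d{2,4}, \d{1,2}:\d{2} - "),
    "12-hour": re.compile(r"^\d{1,2}/\d{1,2}/\d{2,4}, \d{1,2}:\d{2}\s?[APap][Mm] - "),
}


def chat_matches_format(chat_text: str, time_format: str) -> bool:
    """Return True if at least one line starts with a timestamp in the selected format."""
    pattern = PATTERNS[time_format]
    return any(pattern.match(line) for line in chat_text.splitlines())


# Hypothetical two-line chat written in the assumed 24-hour layout.
sample = "12/05/2024, 21:45 - Asha: Hello\n12/05/2024, 21:46 - Ravi: Hi"
print(chat_matches_format(sample, "24-hour"))  # True
print(chat_matches_format(sample, "12-hour"))  # False
```

When the check returns False, the application would send the user back to choose the other format, as in the diagram.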
Fig. 3.2.2 Activity Diagram (Social Media Chat)
Chapter 4
IMPLEMENTATION
4.1 INTRODUCTION
This project is a social media chat analyzer built with Python and Streamlit.
The application provides various analyses on a chat log, including top
statistics, activity timelines, activity maps, word cloud, most common
words, emoji analysis, and sentiment analysis. The analysis can be done for
a specific user or for the overall chat.
Steps:
4.2.1.1. User Initiates Chat Upload:
The user interacts with the front-end interface to upload a
chat file or a CSV file, depending on the analysis selected.
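
As a minimal sketch of this step, the helper below decides which analysis an uploaded file belongs to and decodes it to text. The function name classify_upload and the extension-based rule are assumptions for illustration, not the project's code. In a Streamlit front end, the file name and bytes would typically come from the object returned by st.file_uploader (its name attribute and getvalue() method).

```python
from pathlib import PurePath


def classify_upload(filename: str, raw: bytes) -> tuple[str, str]:
    """Decode an uploaded file and decide which analysis it belongs to."""
    suffix = PurePath(filename).suffix.lower()
    if suffix == ".txt":
        kind = "chat"  # exported WhatsApp chat
    elif suffix == ".csv":
        kind = "bank"  # bank data sheet
    else:
        raise ValueError(f"Unsupported file type: {suffix or 'none'}")
    # utf-8-sig drops a leading byte-order mark if the export has one.
    return kind, raw.decode("utf-8-sig")


# Hypothetical upload: a one-line chat export.
kind, text = classify_upload(
    "group_chat.txt", "12/05/2024, 21:45 - Asha: Hello".encode("utf-8")
)
print(kind)  # chat
```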
4.2.1.8. Front-end Displays Results:
The front-end receives the analysis results from the server
and displays them to the user on the interface.
4.2.2.1. States:
File Upload: Initial state where the user uploads the chat file.
Select Time Format: State where the user selects the
desired time format for analysis.
Analysis: State where the analysis of the chat data is performed.
Display Overall Result: State where the overall analysis
result is displayed on the user interface.
Select User for Specific Analysis: State where the user
selects whose analysis they want to see.
Display User-Specific Result: State where the analysis
result for the selected user is displayed on the user interface.
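
The states listed above can be written down as an enumeration, for example as below. The transition table is an assumption inferred only from the order in which the states are listed, and the names State, TRANSITIONS and can_move are hypothetical.

```python
from enum import Enum, auto


class State(Enum):
    FILE_UPLOAD = auto()
    SELECT_TIME_FORMAT = auto()
    ANALYSIS = auto()
    DISPLAY_OVERALL_RESULT = auto()
    SELECT_USER = auto()
    DISPLAY_USER_RESULT = auto()


# Assumed transitions, inferred from the order of the states listed above.
TRANSITIONS = {
    State.FILE_UPLOAD: {State.SELECT_TIME_FORMAT},
    State.SELECT_TIME_FORMAT: {State.ANALYSIS},
    State.ANALYSIS: {State.DISPLAY_OVERALL_RESULT},
    State.DISPLAY_OVERALL_RESULT: {State.SELECT_USER},
    State.SELECT_USER: {State.DISPLAY_USER_RESULT},
    State.DISPLAY_USER_RESULT: {State.SELECT_USER},
}


def can_move(current: State, target: State) -> bool:
    """Return True if the assumed table allows moving from current to target."""
    return target in TRANSITIONS.get(current, set())


print(can_move(State.FILE_UPLOAD, State.SELECT_TIME_FORMAT))  # True
print(can_move(State.FILE_UPLOAD, State.ANALYSIS))            # False
```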
4.2.2.2. Transitions:
4.2.4 ALGORITHM
Steps:
1. Initialize Regular Expression Patterns:
A pattern to identify timestamps and delimiters in the chat log.
2. Split the Data:
Use the timestamp pattern to split the data into individual
messages, excluding the very first split which does not contain a
message.
Find all instances of timestamps using the same pattern to
create a list of dates.
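
The two steps above can be sketched with the standard re module. The pattern below assumes a 24-hour export prefix such as "12/05/2024, 21:45 - "; the actual pattern depends on the selected time format, and the sample chat is hypothetical.

```python
import re

# Assumed 24-hour export prefix such as "12/05/2024, 21:45 - "; adjust for other formats.
TIMESTAMP = r"\d{1,2}/\d{1,2}/\d{2,4},\s\d{1,2}:\d{2}\s-\s"

data = (
    "12/05/2024, 21:45 - Asha: Hello everyone\n"
    "12/05/2024, 21:46 - Ravi: Hi\n"
)

# re.split leaves an empty string before the first timestamp, so drop it with [1:].
messages = re.split(TIMESTAMP, data)[1:]
# The same pattern collects the timestamps, one per message.
dates = re.findall(TIMESTAMP, data)

print(messages)  # ['Asha: Hello everyone\n', 'Ravi: Hi\n']
print(dates)     # ['12/05/2024, 21:45 - ', '12/05/2024, 21:46 - ']
```

Because both lists come from the same pattern, they have the same length and can be paired row by row.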
6. Update DataFrame:
Add new columns for user and message to the DataFrame, and
remove the original User_message column.
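
A rough sketch of this split is shown below, using plain dictionaries in place of DataFrame rows so that it runs without extra libraries. The helper name split_user_message and the "group_notification" label for entries without a sender are assumptions for illustration.

```python
import re


def split_user_message(entry: str) -> tuple[str, str]:
    """Split 'Name: text' into (user, message); entries without a sender are system notices."""
    parts = re.split(r"^([^:]+):\s", entry, maxsplit=1)
    if len(parts) == 3:
        # parts is ['', user, message] when the entry starts with 'Name: '.
        return parts[1], parts[2].strip()
    return "group_notification", entry.strip()


# Hypothetical entries, as they would appear in the combined user-and-message column.
rows = []
for entry in ["Asha: Hello everyone\n", "Ravi joined using this group's invite link\n"]:
    user, message = split_user_message(entry)
    rows.append({"user": user, "message": message})

print(rows[0])  # {'user': 'Asha', 'message': 'Hello everyone'}
print(rows[1]["user"])  # group_notification
```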
7. Display Plots:
Utilize Streamlit's st.pyplot() function to render each plot in
the web application, allowing users to visually analyze the
messaging activity over time, both on a monthly and daily
basis.
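
The counts behind the monthly and daily timelines can be computed as in the sketch below. The timestamps are hypothetical sample values. In the application the resulting counts would be drawn on a Matplotlib figure, and that figure would be passed to st.pyplot().

```python
from collections import Counter
from datetime import datetime

# Hypothetical parsed timestamps; in the application these would come from the date column.
timestamps = [
    datetime(2024, 4, 30, 9, 15),
    datetime(2024, 5, 1, 10, 0),
    datetime(2024, 5, 1, 18, 30),
    datetime(2024, 5, 2, 8, 5),
]

# Messages per month ("YYYY-MM") and per calendar day ("YYYY-MM-DD").
monthly = Counter(ts.strftime("%Y-%m") for ts in timestamps)
daily = Counter(ts.date().isoformat() for ts in timestamps)

print(sorted(monthly.items()))  # [('2024-04', 1), ('2024-05', 3)]
print(sorted(daily.items()))
```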
3. Preprocess Messages:
Convert all messages in the DataFrame to lowercase and strip
leading and trailing whitespace. This standardization helps in
counting word frequencies accurately, because the same
word in different cases is treated as identical.
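
The effect of this standardization on word counts can be illustrated as follows. The messages are hypothetical, and a plain list stands in for the DataFrame column.

```python
from collections import Counter

# Hypothetical messages with mixed case and stray whitespace.
messages = ["  Hello everyone ", "hello Ravi", "See you EVERYONE"]

# Lowercase and strip each message before counting words.
cleaned = [m.lower().strip() for m in messages]
word_counts = Counter(word for m in cleaned for word in m.split())

print(word_counts["hello"])     # 2
print(word_counts["everyone"])  # 2
```

Without the lowercase step, "Hello" and "hello" would be counted as two different words.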
6. Plot bar charts to analyze the relationship between 'deposit' and other
categorical variables:
'job'
'marital'
'education'
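
The numbers behind such bar charts can be sketched as below. The CSV rows are made-up sample data, the function name deposit_counts is hypothetical, and the 'deposit' column is assumed to hold "yes" or "no".

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows; the column names follow the ones named in the step above.
CSV_TEXT = """job,marital,education,deposit
admin.,married,secondary,yes
technician,single,tertiary,no
admin.,single,secondary,no
student,single,tertiary,yes
"""


def deposit_counts(csv_text: str, column: str) -> dict[str, dict[str, int]]:
    """Count 'yes' and 'no' deposit values for each category in the given column."""
    counts: dict[str, dict[str, int]] = defaultdict(lambda: {"yes": 0, "no": 0})
    for row in csv.DictReader(io.StringIO(csv_text)):
        counts[row[column]][row["deposit"]] += 1
    return dict(counts)


for column in ("job", "marital", "education"):
    print(column, deposit_counts(CSV_TEXT, column))
```

Each inner dictionary gives the heights of the "yes" and "no" bars for one category.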
Fig.4.7 Emoji Analysis
Fig.4.10 Weekly Activity Map
Fig.4.12 Sentiment Analysis
Fig.4.14 Analysis 1
Fig.4.15 Analysis 2
Fig.4.16 Analysis 3
Chapter 5
RESULT AND DISCUSSION
The Chat Lens developed as part of this project successfully enables users to
upload chat files, analyze chat data based on selected time formats, perform
sentiment analysis, and visualize results. The system allows users to view
overall analysis results and also drill down to see individual user-specific
analysis.
Discussion:
User Experience: The system enhances user experience by providing a
user-friendly interface for uploading chat files, selecting analysis
options, and visualizing results. Interactive features such as
user-specific analysis let users customize their analysis
experience.
Insights for Communication Analysis: The chat analyzer offers
valuable insights for communication analysis in various contexts, such
as group projects, team collaborations, and social interactions, while
the bank data analysis indicates who is likely to subscribe to a term
deposit. Understanding communication patterns,
sentiment trends, and individual contributions can inform
decision-making and improve communication strategies.
Potential Applications: Beyond personal use, the chat or data analyzer
can find applications in academic projects, research studies, and
organizational analyses. For instance, in a college project report, the
analyzer can be used to analyze group chat conversations among
project members to assess communication effectiveness, identify key
topics of discussion, and evaluate team dynamics.
Chapter 6
CONCLUSION AND FUTURE WORK
6.1 CONCLUSION
In conclusion, the capabilities of the WhatsApp
application, the data collected in a bank, and the power of the Python
programming language for implementing the intended data analysis
are considerable. This work discussed the WhatsApp
application and Python libraries in order to build an analysis of a WhatsApp chat.
We employed dataset manipulation techniques to gain a better
understanding of the WhatsApp chats present on our phones. The analysis shows the most
used emoji and which word was repeated most often. It tracks our
conversations and analyzes how much time we spend on them. The system
was built with Python, using libraries
that include NumPy, Pandas, Matplotlib and Seaborn. At the end of the
work, the expected results were obtained, and the analysis was able to show the
level of participation of the various individuals in the given group chat.
The system is able to analyze any WhatsApp chat
or CSV data file supplied to it.
6.2.9. Cross-Platform Compatibility:
Ensure cross-platform compatibility to support analysis of chat
data from various messaging platforms beyond WhatsApp. This
would make the analyzer more versatile and adaptable to
different communication ecosystems.
REFERENCES
1. Anurag Kumar Singh, Rishabh Bhatia, Dr. Praveena Akki, “WhatsApp
Chat Exploratory Data Analysis”, Volume 11, Issue V, May 2023.
Available at www.ijraset.com.
2. Ravishankara K, Dhanush, Vaisakh, Srajan I S, “WhatsApp Chat
Analyzer”, Volume 09, Issue 05 (May 2020).