NLP Report
By
Supervisor
Prof. Mahesh Maurya
University of Mumbai
(BE 2023-24)
CERTIFICATE
This is to certify that the report entitled “Text Summarization using NLP” is a bona fide work of Aneesh Vinod Panchal (Roll No: 07), Gaurang Rajam (Roll No: 10) and Sachin Satam (Roll No: 15), submitted to the University of Mumbai in partial fulfillment of the requirements for the degree.
Supervisor
Prof. Mahesh Maurya
This report by Aneesh Vinod Panchal (Roll No: 07), Gaurang Rajam (Roll No: 10) and Sachin Satam (Roll No: 15) is approved for the degree.
Examiners
1………………………………………
(Internal Examiner Name & Sign)
2…………………………………………
(External Examiner name & Sign)
Date:
Place: Thane
Contents
Abstract
Acknowledgements
List of Abbreviations
List of Figures
List of Tables
1 Introduction
1.1 Introduction
1.2 Motivation
1.3 Problem Statement
3 Proposed System
3.1 Introduction
3.2 Algorithm
3.3 Details of Software and Hardware Requirements
3.4 Experiments and Results
3.5 Conclusion and Future Work
4 References
Abstract
Text summarization is the process of creating a condensed form of a text document that retains the significant information and general meaning of the source text. Automatic text summarization has become an important way of finding relevant information precisely in large texts, in a short time and with little effort. Text summarization approaches are classified into two categories, extractive and abstractive, and this report presents a survey of both. The main motivation is the challenge of making a computer understand a document of any extension and generate its summary; reducing the time and effort a user spends reading through an entire document to learn what it is about is also a driving force behind this work, since summarizing large text documents manually is difficult for human beings. An extractive summarization method concatenates important sentences or paragraphs without understanding their meaning, whereas an abstractive summarization method generates a meaningful summary in new words. The system described here combines statistical and linguistic analysis of the text document, so the generated summary is better than that of mere statistical summarizers based on word-frequency counts alone. The addition of plural resolution and abbreviation resolution adds further precision to the summary. The concept of normalization introduced here lets sentences receive weights based purely on the value of their content words rather than on the number of words they contain; therefore, even a short but important sentence earns its place based on the value of its words. Adding linguistic features to the algorithm fine-tunes the summary to a higher level.
Acknowledgement
No project is ever complete without the guidance of experts who have trodden the path before, become masters of it and, as a result, our mentors. We would like to take this opportunity to thank all the individuals who have helped us in visualizing this project. The guidance of Keerti Kharatmol played a great role in our research work and helped us find relevant information about our topic. We are grateful for the opportunity to present our work. We would like to express our gratitude to K.C. College of Engineering and Management Studies & Research, as well as to our Head of Department, Prof. Mahesh Maurya, for encouraging students to express their ideas and research. Our sincere thanks go to our college Principal, Dr. Vilas Nitnaware, for believing in the work of the students and pushing our limits to do better in our field of study.
List of Abbreviations
List of Tables
List of Symbols
1. Introduction
1.1 Introduction
To reduce the length and complexity of a document while retaining its essential qualities, we turn to a summarizer. Titles, keywords, tables of contents and abstracts may all be considered forms of summary. In a full-text document, the abstract plays the role of a summary of that particular document. Such summaries are intermediates between a document’s title and its full text, and are useful for rapid relevance judgments and quick assessment of the document. Auto-summarization is a technique that generates a summary of any document, provides briefs of large documents, and so on. There is an abundance of text material available on the Internet; however, the Internet usually provides more information than is needed, and it is very difficult for human beings to manually summarize large documents of text. A twofold problem is therefore encountered: searching for relevant documents through an overwhelming number of available documents, and absorbing a large quantity of relevant information. The goal of automatic text summarization is to address this problem by condensing the source text into a shorter version that preserves its essential information.
Microsoft Word’s AutoSummarize function is a simple example of text summarization. Text summarization methods can be classified into extractive and abstractive summarization. An extractive summarization [1] method consists of selecting important sentences, paragraphs, etc. from the original document and concatenating them into a shorter form; the importance of sentences is decided based on their statistical and linguistic features. An abstractive summarization [2] method attempts to develop an understanding of the main concepts in a document and then express those concepts in clear natural language. It uses linguistic methods [3] to examine and interpret the text, and then finds new concepts and expressions to best describe it, generating a new, shorter text that conveys the most important information from the original document.
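As an illustration of the extractive approach described above, a minimal frequency-based extractive summarizer might look like the following sketch. The sentence splitter and scoring scheme here are simplified assumptions for illustration, not the system's actual implementation:

```python
from collections import Counter
import re

def extractive_summary(text, num_sentences=2):
    """Score sentences by the frequency of their words and keep the top ones."""
    # Naive sentence split on terminal punctuation (illustration only)
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"\w+", text.lower()))
    # A sentence's score is the sum of the corpus frequencies of its words
    scores = [(sum(freq[w] for w in re.findall(r"\w+", s.lower())), i, s)
              for i, s in enumerate(sentences)]
    top = sorted(scores, reverse=True)[:num_sentences]
    # Restore original sentence order for readability
    return " ".join(s for _, _, s in sorted(top, key=lambda t: t[1]))
```

An abstractive system, by contrast, would paraphrase rather than copy whole sentences, which is why it requires deeper linguistic interpretation.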
1.2 Motivation
Every machine learning pipeline is a set of operations executed to produce a model. An ML model is roughly defined as a mathematical representation of a real-world process; we can think of it as a function that takes some input data and produces an output (a classification, a sentiment, a recommendation, or clusters). The performance of each model is evaluated using evaluation metrics such as precision and recall, or accuracy.
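The precision and recall metrics mentioned above can be computed directly from binary predictions; the following is a generic illustration, not tied to any particular model in this report:

```python
def precision_recall(predicted, actual):
    """Precision and recall for binary labels (1 = positive class)."""
    tp = sum(p == 1 and a == 1 for p, a in zip(predicted, actual))  # true positives
    fp = sum(p == 1 and a == 0 for p, a in zip(predicted, actual))  # false positives
    fn = sum(p == 0 and a == 1 for p, a in zip(predicted, actual))  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```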
A. Kukkar [9] introduced an effective approach to produce flexible and useful bug-report summaries and to reduce developers' workload. The authors used particle swarm optimization (PSO) to search for effective semantic text, addressing four central points: extractive bug-report summarization, increasing the ROUGE score by selecting effective semantic text, data sparsity, and information reduction. The proposed methodology used a collection of comments and several feature-extraction methods to produce the bug-report summary. Multiple candidate summary subsets were produced, and the optimal subset was selected by the PSO optimization technique. The authors compared the proposed approach with the existing Email Classifier (EC) and Bug Report Classifier (BRC); the ROUGE score was selected as one of the evaluation criteria and was calculated for all approaches. The ROUGE score was also compared against three human-generated summaries of 10 bug reports from the Rastkar dataset. As a result, the PSO summary subset was less redundant and included all the important points that need to be present in a bug report.
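Since ROUGE is used as the evaluation criterion above, a minimal ROUGE-1 recall computation can be sketched as follows; real ROUGE implementations add stemming, stopword options and further variants (ROUGE-2, ROUGE-L), so this is a simplified illustration:

```python
from collections import Counter

def rouge1_recall(candidate, reference):
    """ROUGE-1 recall: fraction of reference unigrams covered by the candidate."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Clipped overlap: each reference word counts at most as often as it appears
    overlap = sum(min(cand[w], ref[w]) for w in ref)
    return overlap / sum(ref.values())
```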
Beibei Huai and team [10] proposed a new intention-based bug-report summarization approach, called IBRS, which is built on an intention taxonomy. This work considered sentence intentions in order to generate the summary report. Sentence intentions were classified into seven categories: bug description, fix solution, opinion expressed, information seeking, information giving, meta/code, and emotion expressed. Sentences are assigned to a specific intention with the help of pattern matching and a machine-learning model, and the bug-report summary is then produced. This summary was compared with BRC (Bug Report Classifier) and found better in terms of precision (5% improvement), recall (3% improvement), F-score (3% improvement) and pyramid precision (5% improvement).
Creating a summary involves selecting the important topics of sentences as well as recognizing the relevant relationships among the concepts mentioned in the text. The key problem is generalization, as identified in the automatic text summarization (ATS) task: for example, summarizing financial or medical reports is conceptually different from summarizing news articles. To address this issue and achieve more relevant summaries, Hernández-Castañeda, Ángel [11] proposes EATS, which is based on a clustering technique driven by a genetic algorithm (GA) to find the relevant topics in the given document. To identify the key sentences in the clusters, the method includes a topic-modeling algorithm (LDA) based on automatically generated keywords. The clustering technique uses LDA and Doc2Vec, along with tf-idf and n-grams, to map text to numeric vectors. The method was tested on the DUC02 dataset, with the goal of producing summaries as close as possible to human-generated ones.
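The tf-idf mapping from text to numeric vectors mentioned above can be sketched in a few lines. This is a simplified, smoothed variant for illustration; EATS itself combines such features with LDA and Doc2Vec representations:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Map each document to a {word: tf-idf} dict (smoothed idf, illustration only)."""
    tokenized = [d.lower().split() for d in docs]
    n = len(docs)
    # Document frequency: in how many documents each word appears
    df = Counter(w for toks in tokenized for w in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({w: (tf[w] / len(toks)) * math.log((1 + n) / (1 + df[w]))
                        for w in tf})
    return vectors
```

Words that appear in every document (here, "a") get an idf of zero and thus carry no weight, while rarer words dominate the vector.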
3. Proposed System
3.1 Introduction
As proposed earlier, we have used the NLTK, NumPy and NetworkX libraries for machine learning.
A. NLTK :
The Natural Language Toolkit (NLTK) is a platform used for building Python programs that work with human
language data for applying in statistical natural language processing (NLP). It contains text processing
libraries for tokenization, parsing, classification, stemming, tagging and semantic reasoning. It also includes
graphical demonstrations and sample data sets as well as accompanied by a cook book and a book which
explains the principles behind the underlying language processing tasks that NLTK supports. The Natural
Language Toolkit is an open source library for the Python programming language originally written by Steven
Bird, Edward Loper and Ewan Klein for use in development and education. It comes with a hands-on guide
that introduces topics in computational linguistics as well as programming fundamentals for Python which
makes it suitable for linguists who have no deep knowledge in programming, engineers and researchers that
need to delve into computational linguistics, students and educators. NLTK includes more than 50 corpora and
lexical sources such as the Penn Treebank Corpus, Open Multilingual Wordnet, Problem Report Corpus, and
Lin’s Dependency Thesaurus. Natural Language Processing with Python provides a practical introduction to
programming for language processing. Written by the creators of NLTK, it guides the reader through the
fundamentals of writing Python programs, working with corpora, categorizing text, analyzing linguistic
structure, and more. The online version of the book has been updated for Python 3 and NLTK 3.
B. Numpy :
NumPy is the fundamental package for scientific computing in Python. It is a Python library that provides a
multidimensional array object, various derived objects (such as masked arrays and matrices), and an
assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation,
sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random
simulation and much more. At the core of the NumPy package, is the ndarray object. This encapsulates n-
dimensional arrays of homogeneous data types, with many operations being performed in compiled code for
performance. There are several important differences between NumPy arrays and the standard Python
sequences: NumPy arrays have a fixed size at creation, all of their elements must be of the same data type, and operations on them execute far more efficiently than the equivalent operations on Python lists.
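These differences can be seen in a short example: arrays are homogeneous, support element-wise arithmetic without explicit loops, and change shape cheaply:

```python
import numpy as np

# ndarray: homogeneous, fixed-size, with fast vectorized operations
a = np.array([1, 2, 3, 4], dtype=np.float64)
b = a * 2 + 1            # element-wise arithmetic, no explicit Python loop
m = a.reshape(2, 2)      # shape manipulation without copying the data
print(b)                 # [3. 5. 7. 9.]
print(m.sum(axis=0))     # column sums: [4. 6.]
print(a.dtype, a.shape)
```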
C. NetworkX :
NetworkX is a Python language software package for the creation, manipulation, and study of the structure,
dynamics, and function of complex networks. It is used to study large complex networks represented as
graphs with nodes and edges. Using NetworkX we can load and store complex networks. We can generate
many types of random and classic networks, analyze network structure, build network models, design new
network algorithms and draw networks. NetworkX is suitable for operation on large real-world graphs,
e.g., graphs of more than 10 million nodes and 100 million edges. Due to its reliance on a pure-Python
"dictionary of dictionaries" data structure, NetworkX is a reasonably efficient, very scalable, highly
portable framework for network and social-network analysis.
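A small NetworkX example showing graph construction and PageRank, the kind of ranking a TextRank-style summarizer applies to a sentence-similarity graph; the node names here are arbitrary:

```python
import networkx as nx

# Build a small undirected graph of four "sentences"
G = nx.Graph()
G.add_edges_from([("s1", "s2"), ("s1", "s3"), ("s2", "s3"), ("s3", "s4")])

# PageRank assigns higher scores to better-connected nodes
scores = nx.pagerank(G)
best = max(scores, key=scores.get)   # s3 has the most connections
print(best, scores[best])
```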
3.3 Details of Software and Hardware Requirements
Software Requirements:
1. Operating system: Windows 10 or any browser-compatible OS
2. Web browser
3. Any mobile phone with a compatible Android/iOS version
Hardware Requirements:
1. All the hardware required to connect to the Internet, e.g., modem, WAN/LAN, Ethernet cable
2. Storage: size of the web browser
3. RAM: 4 GB
4. Processor: Intel Core i3
5. Any browser-compatible mobile phone with Internet access
3.4 Experiments and Results
Program
import nltk
from nltk.tokenize import sent_tokenize, word_tokenize
from nltk.corpus import stopwords
from nltk.cluster.util import cosine_distance
import numpy as np
import networkx as nx

def sentence_similarity(sent1, sent2, stop_words):
    # Bag-of-words cosine similarity between two tokenized sentences
    sent1 = [w.lower() for w in sent1 if w.lower() not in stop_words]
    sent2 = [w.lower() for w in sent2 if w.lower() not in stop_words]
    all_words = list(set(sent1 + sent2))
    vector1 = [sent1.count(w) for w in all_words]
    vector2 = [sent2.count(w) for w in all_words]
    return 1 - cosine_distance(vector1, vector2)

def generate_summary(file_name, num_sentences=5):
    stop_words = set(stopwords.words("english"))
    with open(file_name) as f:
        sentences = sent_tokenize(f.read())
    word_tokens = [word_tokenize(s) for s in sentences]
    # Build the pairwise sentence-similarity matrix
    sentence_similarity_matrix = np.zeros((len(sentences), len(sentences)))
    for i in range(len(sentences)):
        for j in range(len(sentences)):
            if i != j:
                sentence_similarity_matrix[i][j] = sentence_similarity(
                    word_tokens[i], word_tokens[j], stop_words)
    # Rank sentences with PageRank over the similarity graph (TextRank)
    scores = nx.pagerank(nx.from_numpy_array(sentence_similarity_matrix))
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    # Keep the top sentences, restored to their original order
    summary = [sentences[i] for i in sorted(ranked[:num_sentences])]
    return "\n".join(summary)

# Example usage
if __name__ == "__main__":
    input_text_file = "/content/input_text5.txt"
    num_summary_sentences = 5
    summary = generate_summary(input_text_file, num_sentences=num_summary_sentences)
    print("Let's summarize the given file.....")
    print()
    print(summary)
    print()
Input Text :
As the G20 concluded on Sunday and Prime Minister Narendra Modi handed over the presidency to Brazil,
there was recognition of the efforts India made to arrive at a consensus on a joint communique. The theme
for India’s G20 presidency was Vasudhaiva Kutumbakam: One Earth, One Family, One Future. In his
speeches over the years, Prime Minister Narendra Modi has spoken about India taking on a leadership role
in global affairs as “vishwa guru”, given its population and scale of economy, and this was on display in the
last two days. In his opening remarks at the Summit, the PM said, At the place where we are gathered today,
just a few kilometres away from here, stands a pillar that is nearly two-and-a-half thousand years old.
Inscribed on this pillar in the Prakrit language are the words: ‘Hevam loksa hitmukhe ti, atha iyam natisu
hevam’. Meaning, the welfare and happiness of humanity should always be ensured. Two-and-a-half
thousand years ago, the land of India gave this message to the entire world. Let us begin this G20 Summit
by remembering this message.
Output Text :
Two-and-a-half thousand years ago, the land of India gave this message to the entire world. As the G20
concluded on Sunday and Prime Minister Narendra Modi handed over the presidency to Brazil, there was
recognition of the efforts India made to arrive at a consensus on a joint communique. In his speeches over
the years, Prime Minister Narendra Modi has spoken about India taking on a leadership role in global affairs
as “vishwa guru”, given its population and scale of economy, and this was on display in the last two days. In
his opening remarks at the Summit, the PM said, At the place where we are gathered today, just a few
kilometres away from here, stands a pillar that is nearly two-and-a-half thousand years old. Let us begin this
G20 Summit by remembering this message.
3.5.1 Conclusion
A text summarizer is a valuable tool that helps condense lengthy or complex documents into concise and coherent
summaries. It can save time and effort for readers who need to quickly grasp the main points of a text, making it
especially useful in fields like journalism, research, and education. While text summarizers have their advantages,
it's important to remember that they are not infallible and may not always capture the nuances of a text. Therefore,
human judgment and editing are often necessary to ensure the accuracy and clarity of the summary. As technology
continues to advance, text summarizers are likely to become even more sophisticated and play an increasingly
important role in information processing and knowledge dissemination.
3.5.2 Future Work
1. Text summarization algorithms will continue to advance, leading to greater accuracy in capturing the essence of
a text. Machine learning models, such as transformer-based models like GPT-4, are likely to further enhance
summarization capabilities.
2. An option to switch between dark mode and light mode will be provided in the application interface.
3. Future text summarizers will become more proficient at summarizing content in multiple languages, breaking
down language barriers and enabling more accessible information sharing globally.
4. Text summarization will integrate with visual data, creating summaries that include images, charts, and graphs.
This will be especially useful for summarizing data-rich documents.
4 References
• https://iopscience.iop.org/article/10.1088/1742-6596/2040/1/012044/pdf
• https://www.ijert.org/research/text-summarizer-using-abstractive-and-extractive-method-IJERTV3IS050821.pdf
• https://journals.sagepub.com/home/thr
• https://www.freecodecamp.org/learn/scientific-computing-with-python
• https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9623462