Laboratory Practice VI Natural Language Processing
Mini Project
Sentence Autocompletion
Group Members
Omkar Jagtap(19CO033)
Shreya Jagtap(19CO035)
Abhishek Mulik(19CO051)
Tanvi Paigude(19CO066)
Class: BE Computer A
Faculty:
Aim:
To develop a sentence autocompletion model.
Abstract:
Imagine that you are a representative replying to customers online and you find
yourself asking the same questions over and over. Would you like to get
automatic suggestions instead of typing the same thing again and again?
Introduction:
Autocomplete is a user interface function in which an application predicts a
word or phrase that the user needs to type without the user having to type it
entirely.
In modern applications, word completion (also called autocomplete or
autosuggest) is a popular user interface feature. Its aim is to predict what the
user wants to type and complete parts of the text automatically.
By providing available options, the aim is to speed up typing, assist those with
typing difficulties, correct or prevent spelling errors, and support information
retrieval. Darragh and Witten's work on the Reactive Keyboard from 1983 may
be the earliest example of the concept. Several other methods have been
proposed since then, but the basic concept has remained the same.
Word processors (MS Word, OpenOffice.org), programming editors (Emacs,
Eclipse), desktop applications (web browsers, e-mail clients), HTML form
elements on websites, web applications (Google Suggest, web-based e-mail
clients), mobile phone interfaces, Unix terminals, and so on all have the
feature.
Whenever you search for something on Google, after typing two or three letters it
suggests possible search terms. And if you search for something with typos, it
corrects them and still finds relevant results for you. Isn't it amazing?
It is something we use every day but rarely pay much attention to. It is an
important application of natural language processing and a good illustration of
what NLP means for a great many people all over the world, including you and me.
Search autocomplete and autocorrect both help us find the right results more
efficiently. Nowadays, many websites, such as Facebook and Quora, have also
started using this feature.
Dataset
The file contains 22K conversations between a customer and a representative.
For the purpose of this project, we are only interested in completing the
threads of the representative.
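As a concrete illustration, the dataset can be loaded with pandas and filtered down to the representative's messages. The file name and column names below are assumptions for the sake of the sketch; the actual layout of the file may differ.

    import pandas as pd

    # Load the conversation dump; file name and column names are assumed for illustration.
    df = pd.read_csv("customer_support_conversations.csv")
    print(len(df))  # roughly 22K conversations

    # Keep only the representative's side of each thread.
    rep = df[df["author"] == "representative"]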
Data Selection and Cleaning:
The processing will separate the threads of the customer and the
representative, split the text into sentences based on punctuation (we are
going to keep the punctuation), clean up the resulting text with a few light
regex rules, and keep only the sentences longer than one word.
Finally, since the representative tends to ask the same questions over
and over again, autocompletion is extremely valuable because it can propose a
complete sentence. In our case, we count the number of occurrences of
each sentence so that we can use it as a feature later, and then remove the
duplicates.
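A minimal sketch of this cleaning step is shown below, using standard Python regex and pandas. The example messages and the exact regex rules are illustrative assumptions rather than the report's final pipeline.

    import re
    import pandas as pd

    # Example representative messages standing in for the real dataset threads.
    rep_messages = [
        "Hi! How can I help you today?",
        "Could you send me your order number? Thanks!",
        "Could you send me your order number? Thanks!",
    ]

    sentences = []
    for text in rep_messages:
        # Split into sentences on punctuation, keeping the punctuation itself.
        for sent in re.findall(r"[^.!?]+[.!?]?", text):
            sent = re.sub(r"\s+", " ", sent).strip()  # light regex cleanup
            if len(sent.split()) > 1:                 # keep only sentences longer than one word
                sentences.append(sent)

    # Count the occurrences of each sentence (kept as a feature) and drop duplicates.
    counts = (pd.Series(sentences)
                .value_counts()
                .rename_axis("sentence")
                .reset_index(name="n_occurrences"))
    print(counts)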
Implementation:
Generate TF-IDF vectorizer:
In information retrieval, tf-idf or TFIDF, short for term frequency-inverse
document frequency, is a numerical statistic intended to reflect how important
a word is to a document in a collection or corpus. It is one of the most
commonly used weighting schemes in information retrieval and text mining.
The tf-idf value increases in direct proportion to the number of times a term
appears in the document and is offset by the number of documents in the corpus
that contain the word, which compensates for the fact that some words appear
more frequently in general. The TF-IDF weight represents the relative
importance of a term within the document and the whole corpus.
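A minimal sketch of building such a vectorizer with scikit-learn's TfidfVectorizer is given below. The character n-gram settings are assumptions chosen because they handle partially typed words well; they are not necessarily the project's final parameters.

    from sklearn.feature_extraction.text import TfidfVectorizer

    # Candidate sentences produced by the cleaning step (illustrative stand-ins).
    sentences = [
        "Could you send me your order number?",
        "How can I help you today?",
        "Your refund has been processed.",
    ]

    # Character n-grams inside word boundaries match prefixes of words the user is typing.
    vectorizer = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))
    tfidf_matrix = vectorizer.fit_transform(sentences)  # one TF-IDF row per candidate sentence

    print(tfidf_matrix.shape)  # (number of sentences, vocabulary size)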
TF stands for Term Frequency:
It measures how frequently a term appears in a document. Since document sizes
vary, a term may appear more often in a long document than in a short one.
Therefore, term frequency is often normalized by the length of the document.
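One common normalized form of term frequency, consistent with the description above, is:

    TF(t, d) = f_{t,d} / \sum_{t' \in d} f_{t',d}

where f_{t,d} is the raw count of term t in document d and the denominator is the total number of terms in d.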