Application NLP

Uploaded by

dayanand

Application of NLP in Information Retrieval
Presentation Outline

 Overview of current IR Systems

 Problems with NLP in IR
 Major applications of NLP in IR
Motivation

 The most successful general-purpose retrieval methods are statistical methods.

 Sophisticated linguistic processing often degrades retrieval performance.
What is IR?
 “An information retrieval system is one that searches a collection of natural language documents with the goal of retrieving exactly the set of documents that pertain to a user's question.”
 IR systems have their origins in library systems.
 They do not attempt to deduce or generate answers.
The problem of IR
 Goal: find the documents relevant to an information need from a large document set.
[Figure: an information need is expressed as a query; the IR system runs retrieval over the document collection and returns an answer list.]
Basics of IR Systems
 Indexing the collection of documents.
 Transforming the query in the same way as the document content is represented.
 Comparing the description of each document with that of the query.
 Listing the results in order of relevancy.

Basics of IR Systems (contd…)

 Retrieval systems consist mainly of two processes:
   Indexing
   Matching
Indexing
 Indexing is the process of selecting terms to represent a text.
 Indexing involves:
   Tokenization of the text string
   Removing frequent words (stop words)
   Stemming (removing suffixes such as -ing and -ed)
 Two common indexing techniques:
   Boolean model
   Vector space model
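
The three indexing steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stop-word list `STOP_WORDS` is a tiny hypothetical sample, and the suffix stripping in `strip_suffix` is a crude stand-in for a real stemmer, not the method used by any particular system.

```python
import re

# Assumption: a tiny illustrative stop-word list; real systems use larger ones.
STOP_WORDS = {"a", "an", "and", "the", "is", "of", "to", "in", "for", "my", "i"}

# Assumption: crude suffix stripping stands in for a real stemming algorithm.
SUFFIXES = ("ing", "ed", "s")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def strip_suffix(token):
    """Remove the first matching suffix if at least three characters remain."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def index_terms(text):
    """Tokenize, drop stop words, then stem the remaining tokens."""
    return [strip_suffix(t) for t in tokenize(text) if t not in STOP_WORDS]


print(index_terms("Indexing the retrieved documents"))
```

The terms returned by `index_terms` are what a document (or a query) would be represented by in the models that follow.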
Information Retrieval Models
 A retrieval model consists of:
   D: a representation for documents
   Q: a representation for queries
   F: a modeling framework for D and Q
   R(q, d_i): a ranking or similarity function that orders the documents with respect to a query q
Boolean Model
 In this model, each term is recorded as a 1 or a 0, according to whether it is present in or absent from a document.
 Queries are represented as Boolean combinations of terms.
 The set of documents that satisfy the Boolean expression is retrieved in response to the query.
 Drawback:
   The user is given no indication as to whether some documents in the retrieved set are likely to be better than others in the set.
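
A minimal sketch of Boolean retrieval, assuming a toy collection of three made-up documents and whitespace tokenization. The helper names (`build_inverted_index`, `boolean_and`, `boolean_or`, `boolean_not`) are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def boolean_and(index, *terms):
    """Documents that contain every one of the given terms."""
    sets = [index.get(t, set()) for t in terms]
    return set.intersection(*sets) if sets else set()


def boolean_or(index, *terms):
    """Documents that contain at least one of the given terms."""
    return set().union(*(index.get(t, set()) for t in terms))


def boolean_not(index, all_ids, term):
    """Documents that do not contain the given term."""
    return set(all_ids) - index.get(term, set())


# Assumption: a made-up three-document collection.
docs = {
    1: "audi a8 gas mileage",
    2: "audi a8 price",
    3: "honda civic gas mileage",
}
index = build_inverted_index(docs)

hits = boolean_and(index, "audi", "mileage")  # audi AND mileage
others = boolean_and(index, "mileage") & boolean_not(index, docs, "audi")  # mileage AND NOT audi
print(hits, others)
```

The results are plain sets of document ids with no ordering among them, which is the drawback noted above.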
Vector Space Model
 In this model, documents and queries are represented by vectors in a T-dimensional space.
 T is the number of distinct terms used in the documents.
 Each axis corresponds to one term.
 The result is a ranked list of documents ordered by similarity to the query, where the similarity between a query and a document is computed using a metric on the respective vectors.
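
As an illustration of ranking in the vector space model, the sketch below uses raw term counts as vector components and cosine similarity as the metric. Both are assumptions made for the example (the slides only say "a metric"), and the function names are hypothetical.

```python
import math
from collections import Counter


def to_vector(text, vocabulary):
    """Term-count vector of the text, one component per vocabulary term."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Document ids ordered by decreasing similarity to the query."""
    # T distinct terms from the documents define the axes of the space.
    vocabulary = sorted({t for text in docs.values() for t in text.lower().split()})
    q = to_vector(query, vocabulary)
    scored = [
        (cosine_similarity(q, to_vector(text, vocabulary)), doc_id)
        for doc_id, text in docs.items()
    ]
    # Highest score first; ties are broken by document id.
    return [doc_id for _, doc_id in sorted(scored, key=lambda p: (-p[0], p[1]))]
```

Unlike a Boolean match, every document receives a score, so a partially matching document can still appear in the list, ranked below better matches.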
Matching
 Matching is the process of computing a measure of similarity between two text representations.
 The relevance of a document is computed from the following parameters:
   tf (term frequency): how often a given term appears in a document, normalized by the document's length.
    $\mathrm{tf}_{i,j} = \dfrac{\text{count of term } i \text{ in document } j}{\text{total number of terms in document } j}$
   idf (inverse document frequency): a measure of the general importance of the term across the collection.
    $\mathrm{idf}_i = \dfrac{\text{total number of documents}}{\text{number of documents containing term } i}$
    In practice, the logarithm of this ratio is usually taken.
   tf-idf score: $\mathrm{tfidf}_{i,j} = \mathrm{tf}_{i,j} \times \mathrm{idf}_i$
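
The tf and idf definitions above can be worked through with a short sketch. It follows the slide's plain-ratio idf by default; the `use_log` flag, the function names, and the toy corpus in the usage note are assumptions added for illustration.

```python
import math
from collections import Counter


def tf(term, doc_tokens):
    """Count of the term in the document divided by the document length."""
    if not doc_tokens:
        return 0.0
    return Counter(doc_tokens)[term] / len(doc_tokens)


def idf(term, corpus, use_log=False):
    """Number of documents divided by the number containing the term.

    With use_log=True the logarithm of that ratio is returned instead.
    Returns 0.0 for a term that appears in no document.
    """
    n_containing = sum(1 for tokens in corpus if term in tokens)
    if n_containing == 0:
        return 0.0
    ratio = len(corpus) / n_containing
    return math.log(ratio) if use_log else ratio


def tfidf(term, doc_tokens, corpus, use_log=False):
    """Product of the term's tf in the document and its idf in the corpus."""
    return tf(term, doc_tokens) * idf(term, corpus, use_log)
```

For a made-up corpus `[["audi", "a8", "gas", "mileage"], ["audi", "a8", "price"], ["honda", "civic", "gas", "mileage"]]`, the term "audi" in the first document has tf = 1/4 and idf = 3/2, so its tf-idf score is 0.375.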
Evaluation of IR Systems
 Two common effectiveness measures are:
   Precision: the proportion of retrieved documents that are relevant (a measure of how accurate the retrieved set is).
    Precision = (number of retrieved relevant documents) / (total number of retrieved documents)
   Recall: the proportion of relevant documents that are retrieved.
    Recall = (number of retrieved relevant documents) / (total number of relevant documents)
 Ideally, both precision and recall should be 1.
 In practice, the two tend to be inversely related: raising one usually lowers the other.
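
A small sketch of the two measures, using the standard definitions: precision is taken over the retrieved set and recall over the relevant set. The function name `precision_recall` and the document ids in the usage note are made up for the example.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for collections of document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # retrieved documents that are relevant
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

For example, if a system retrieves documents 1, 2, 3, 4 and the relevant documents are 1, 3, 5, then two retrieved documents are relevant, giving a precision of 2/4 and a recall of 2/3.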
Case Study
Query: I need to know the gas mileage for my audi a8 2004 model

Source: Yahoo search (search.yahoo.com)

Case Study (contd…)
Query: I need to know the gas mileage for my audi a8 2004 model

Source: Y!Q search (yq.search.yahoo.com)

Case Study (contd…)
Query: I need to know the gas mileage for my audi a8 2004 model

Source: Google search (www.google.com)

Case Study (contd…)
 Yahoo Search
   Pure text-based search.
   The results are documents that contain instances of the same text as the query.
 Y!Q Search
   Uses semantics, but not efficiently.
   Attempts to generate an answer, but does so less efficiently.
 Google Search
   Efficient use of NLP to deduce an answer from the given question.
   A step towards question answering!
Conclusion

 Research efforts to address appropriate tasks, e.g. document summarization and answer generation, are underway.

 Achieving extremely efficient NLP techniques is an idealization.
References
 Voorhees, E. M., "Natural Language Processing and Information Retrieval," in Pazienza, M. T. (ed.), Information Extraction: Towards Scalable, Adaptable Systems, New York: Springer, 1999.
 Salton, G., Wong, A., and Yang, C. S., "A Vector Space Model for Automatic Indexing," Communications of the ACM, 1975, pp. 613-620.
 Vallez, M. and Pedraza-Jimenez, R., "Natural Language Processing in Textual Information Retrieval and Related Topics," Hipertext.net, num. 5, 2007.
 Khaitan, S., Verma, K., and Bhattacharyya, P., "Exploiting Semantic Proximity for Information Retrieval," IJCAI 2007 Workshop on Cross Lingual Information Access, Hyderabad, India, Jan. 2007.
 Wikipedia
Questions?
Thank you!
