0% found this document useful (0 votes)

10 views72 pages

Chapter 8 - Applications of NLP-3

Chapter 8 discusses various applications of Natural Language Processing (NLP) in Information Retrieval (IR), detailing classic IR models such as Boolean, Vector, and Probabilistic models. It highlights the evolution of IR systems from handcrafted features to deep learning approaches, which enhance document ranking, indexing, and query processing. Additionally, the chapter covers the importance of NLP in improving IR performance through techniques like lexical analysis, stop word removal, and named entity recognition.

Uploaded by

Haylisha

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PPTX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

10 views72 pages

Chapter 8 - Applications of NLP-3

Uploaded by

Haylisha

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PPTX, PDF, TXT or read online on Scribd

You are on page 1/ 72

Chapter – 8

Applications of NLP – Part I

Department of Computer Science

School of Computing
Dire Dawa Institute of Technology
Dire Dawa University

I
Information
The Retrieval
Retrieval
Process Classic IR
Information
Models
Extraction Machine
IR Performance
Translation
Evaluation NLP in IR
Question-Answering and Dialogue
The Retrieval S ystem s
Text Summarization
Process

 Information Retrieval (IR) provides a list of potentially relevant documents in response

to users query.

User Interface
user need
Text Operations text text

user feedback logical view logical view

Query Operations Indexing DB Manager Module
query inverted file

Searching Index
Text
ranked docs retrieved
Database
docs
Ranking

The process of retrieving information [Baeza-Yates & Ribeiro-Neto]

Department of Computer Science, SC, DDIT, DDU Applications of NLP 2/59

Information The Retrieval Process
Retrieval Classic IR Models
Information Deep Learning approaches
Extraction Machine IR Performance Evaluation
Translation NLP in IR
Question-Answering and Dialogue
Classic IR S ystem s
Text Summarization
Models

 Classic IR models consider that each document is described by a set of representative

keywords called index terms.
 Index terms have the following characteristics

 (Document) words whose semantics help in remembering the documents’ main

themes.
 Used to index and summarize the documents.

 Mainly nouns because nouns have meaning by themselves.

 Depending on how index terms are treated, there are three classic IR models: Boolean,
Vector and Probabilistic models.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 3/59

 Based on set theoretic (set theory and Boolean algebra) concepts.

 Documents and queries are represented as a set of index terms.

 Queries are specified as Boolean expressions which have precise semantics.

 Similarity of documents to queries is measured with exact matching.

 Retrieval strategy is based on binary decisions (relevant or non-relevant).

 Considers that index terms are present or absent.

 Index terms are linked by three connectives (AND, OR, NOT).

 Simple but not efficient.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 4/59

 Based on algebraic concepts.

 Documents and queries are represented as vectors in a t-dimensional vector space.

 Recognizes that the use of binary weights is too limiting.

 Assigns non-binary weights to index terms in queries and documents.

 These term weights are used to compute the degree of similarity each
document stored in the systems and the user query.
 Retrieved documents can be sorted in decreasing order to get ranked list of
documents.
 Advantages:

 Its term-weighting scheme improves retrieval performance.

 Its partial matching strategy allows retrieval of documents that approximate
query conditions.
 Ranking of retrieved documents.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 5/59

 Based on probabilistic concepts.

 Captures the IR problem with the assumption that for a given user query there is a set
of documents which contains exactly the relevant documents and no other.
 This set of documents is the ideal answer set.
 Given the description of this ideal answer set, there would be no problem in
retrieving its documents.
 The querying process can then be thought of as a process of specifying the properties
of an ideal answer set.
 Initially guess the properties (we can start by using index terms).
 User feed back is then initiated and taken to improve the probability that the
user will find document d with query q.
 Measure of document similarity to the query:
P(d relevant-to q)
P(d non-relevant-to q)
 The main advantage is that documents are ranked in decreasing order of their
probability of being relevant.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 6/59

 The journey of an IR process begins with a user query sent to the IR system which
encodes the query, compares the query with the available resources, and returns the
most relevant pieces of information. Thus, the system is equipped with the ability to
store, retrieve and maintain information.
 In the early era of IR, the whole process was completed using handcrafted features and
ad-hoc relevance measures.
 Later, principled frameworks for relevance measure were developed with statistical
learning as a basis.
 Recently, deep learning has proven essential to the introduction of more opportunities to
IR. This is because data-driven features combined with data-driven relevance measures
can effectively eliminate the human bias in either feature or relevance measure design.
 Deep Learning is used for IR for all components.

 Document Ranking
 Document Indexing
 Query Processing
 Document Searching

Department of Computer Science, SC, DDIT, DDU Applications of NLP 7/59

 For example: Document Ranking

 Traditional IR models include basic handcrafted retrieval models, semantic-based

models, term dependency-based models, and learning to rank models.
 The deep learning approaches, on the other hand, involve methods of
representation learning, methods of matching function learning, and methods of
relevance learning.
 The capability of neural ranking models to extract features directly from raw text
inputs overcomes many limitations of traditional IR models that rely on
handcrafted features.
 Moreover, the deep learning methods manage to capture complicated matching
patterns for document ranking.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 8/59

 The performance of IR systems can be evaluated by using two commonly used metrics:
precision and recall.
 Recall is the fraction of the relevant documents which has been retrieved.
relevant  retrieved
R ecall =
relevant

 Precision is the fraction of the retrieved documents

which is relevant.
relevant  retrieved
Precision =
retrieved

Department of Computer Science, SC, DDIT, DDU Applications of NLP 9/59

 NLP is widely used to improve the performance of IR systems.

 Most commonly used applications are:

 Lexical analysis: with the objective of treating digits, hyphens, punctuation

marks, and the case of letters.

 Stop word removal: with the objective of filtering out words with very low
discrimination values for retrieval purpose.
 Stemming: with the objective of removing affixes.
 Automatic indexing: with the objective of determining representative words

(or groups of words (usually noun groups).

 Document clustering: with the objective of building the relationships
between
documents.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 10/59

 Construction of lexical relationships: with the objective of building term

categorization structures such as thesaurus which allows
the expansion of the original query with related terms.
 Improves recall and precision.

 Synonymy: absence of synonymy relationship yields poor recall from

missing synonymous documents.
 Polysemy/homonymy: terms yield poor precision results from search
terms that have multiple meanings leading to retrieval of non-
relevant documents.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 11/59

Information Components of Information
Retrieval Extraction Na m ed Entity
Information Recognition
Extraction Machine Relation Detection and
Translation Classification Temporal and
Question-Answering and Dialogue Event Processing Template
Components of Information S ystem s Filling
Text Summarization
Extraction

 Information Extraction (IE) focuses on the recognition, tagging, and extraction of certain
key elements of information (e.g. persons, companies, locations, organizations, etc.)
from large collections of text into a structured representation.

Example: Firm XYZ is a full service advertising agency specializing in direct

and
interactive marketing. Located in Bole, Addis Ababa, Firm XYZ is
Text: looking for an Assistant Account Manager to help manage
and coordinate
interactive marketing initiatives. Experience in online
marketing and/or the
advertising field is a plus. Depending on the experiences
of the applicants,
the company pays an attractive salary of Birr 3,000- Birr 5,000 per
month.

Extracted Information:

INDUSTRY: Advertising
POSITION: Assistant
Account Manager LOCATION:
Bole, Addis Ababa.
Department of Computer Science, SC, DDIT, DDU Applications of NLP 12/59
Information Components of Information
Retrieval Extraction Na m ed Entity
Information Recognition
Extraction Machine Relation Detection and
Translation Classification Temporal and
Question-Answering and Dialogue Event Processing Template
Components of Information S ystem s Filling
Text Summarization
Extraction

 IE is applied to a narrowly restricted domain.

 It has the following subtasks:

 Named Entity Recognition: recognition of entity names.

 Relation Detection and Classification: identification of relations between entities.

 Coreference and Anaphoric Resolution: resolving links to previously named

entities.
 Temporal and Event Processing: recognizing temporal expressions and analyzing
events.
 Template Filling: filling in the extracted information.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 13/59

Information Components of Information
Retrieval Extraction Na m ed Entity
Information Recognition
Extraction Machine Relation Detection and
Translation Classification Temporal and
Question-Answering and Dialogue Event Processing Template
Named Entity S ystem s Filling
Text Summarization
Recognition

 Named Entity Recognition is the process of recognition of entity names such as:

 People: Abebe, Kebede, አበበ, ከበደ, etc.

 Organization: Ministry of Education, ABC Company, ትምህርት ሚኒስቴር, ሀለመ ኩባንያ, etc.

 Place: Addis Ababa, Megenagna, አዲስ አበባ, መገናኛ, etc.

 Time expression: Tuesday, February 14, ማክሰኞ, የካቲት 6, etc.

 Quantities: three quintals of teff, 3000 Birr, ሶስት ኩንታል ጤፍ, 3ሺ ብር, etc.

 etc.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 14/59

Information Components of Information
Retrieval Extraction Na m ed Entity
Information Recognition
Extraction Machine Relation Detection and
Translation Classification Temporal and
Question-Answering and Dialogue Event Processing Template
Relation Detection and S ystem s Filling
Text Summarization
Classification

 Relation Detection and Classification involves identification of relations between entities.

 Examples:

 PERSON works for ORGANIZATION

 ORGANIZATION located in PLACE
 PERSON lives in PLACE
 SALARY amounts to QUANTITY
 PERSON is paid SALARY
 PERSON works for ORGANIZATION since DATE

Department of Computer Science, SC, DDIT, DDU Applications of NLP 15/59

Information Components of Information
Retrieval Extraction Na m ed Entity
Information Recognition
Extraction Machine Relation Detection and
Translation Classification Temporal and
Question-Answering and Dialogue Event Processing Template
Temporal and Event S ystem s Filling
Text Summarization
Processing

 Temporal and Event Processing recognizes and normalizes temporal expressions and
analyzes events.
 It has three major components:

 Temporal Expression Recognition

Examples:
He was born on October
2, 1938.
He was born in the
middle of the Second
Italo-Ethiopian War.
He was born two years after the Second Italo-Ethiopian War broke.
The pump circulates the water every 2 hours.
 Temporal Normalization

 Event Detection and Analysis

Department of Computer Science, SC, DDIT, DDU Applications of NLP 16/59

Information Components of Information
Retrieval Extraction Na m ed Entity
Information Recognition
Extraction Machine Relation Detection and
Translation Classification Temporal and
Question-Answering and Dialogue Event Processing Template
Temporal and Event S ystem s Filling
Text Summarization
Processing

 Template Filling is the final task of information extraction systems where structured
data is to be filled in the template slots.
 Example:

Department of Computer Science, SC, DDIT, DDU Applications of NLP 17/59

 Machine Translation (MT) refers to a translation of texts from one natural language to
another by means of a computerized system.
 MT is one of the earliest studied applications of NLP.

 Most important applications of MT are:

 Web-based translation services

 Spoken language translation services
 Although MT for resource poor/under resourced languages is currently under
intensive research and development, it far from being a solved problem.
 Commonly used approaches to MT are:

 Rule-based (involving direct, transfer, and Interlingua) translations

 Statistical translations
 Example-based translations
 Hybrid approaches
 Deep learning approaches

Department of Computer Science, SC, DDIT, DDU Applications of NLP 18/59

 Direct (dictionary-based) translation uses a large bilingual dictionary and translates the
source language text word-by-word.
 The process of direct translation involves morphological analysis, lexical transfer, local
reordering, and morphological generation.

Abebe { 3Pe r+M as c + S i n g} አበበ {3Per+Masc+Sing} አበበ {3Per+Masc+Sing} Aበበ

PA S T (break) PAST(ስብር) መስኮት {Object} መስኮት[ኡ][ን]
{3Per+Masc+Sing}
the ኡ/ው ኡ/ው
window መስኮት PAST(ስብር) ሰበር[ኧ]
[ው]

Morphologic Lexica Local Morphologic

al l real
analysis transf ordering generation
er
Source Target text
text
Abebe broke the አበበ መስኮቱን ሰበረው
window

Department of Computer Science, SC, DDIT, DDU Applications of NLP 17/5

Pros and Cons

 Pros:

 Fast
 Simple
 Inexpensive
 No translation rules hidden in lexicon
 Cons:

 Unreliable
 Not powerful
 Rule proliferation
 Requires too much context
 Major restructuring after lexical substitution

Department of Computer Science, SC, DDIT, DDU Applications of NLP 20/59

 Transfer-based translation uses an intermediate representation that captures the

structure of the original text in order to generate the correct translation.
 The process of transfer-based translation involves analysis, structural transfer, and
generation.
 The structural transfer can be made at two levels:

 Superficial (syntactic) transfer

 Transfers syntactic structures between the source and target
languages.
 Suitable for translation between closely related languages
 Deep (semantic) transfer
 Transfers semantic structures between the source and target
languages.
 Used for translation between distantly related languages.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 21/59

Superficial (Syntactic) Transfer

NP VP S

N V NP NP VP
Syntactic Syntactic Syntactic
Abebe broke Det N Structure Transfer Structure N N V

the Abebe the window

window broke
Word Lexica Word
Structu l Structu
re Transfe re
r

Source text Target text

Abebe broke the አበበ መስኮቱን ሰበረው
window

Department of Computer Science, SC, DDIT, DDU Applications of NLP 22/59

Deep (Semantic) Transfer

Semanti Semanti Semanti
broke(Abebe, c
broke(Abebe,
c c
window) Transfe window)
Structur Structur
e r e

Syntacti Syntacti Syntacti

c c c
Structu Transfe Structu
r
re re

Syntacti Lexica Syntacti

c l c
Structu Transfe Structu
re r re
Source Target text
text
Abebe broke the አበበ መስኮቱን ሰበረው
window

Department of Computer Science, SC, DDIT, DDU Applications of NLP 23/59

Pros and Cons of Transfer-Based Translation

 Pros:

 Offers the ability to deal with more complex source language phenomena than
the direct approach.
 High quality translations can be achieved as compared to direct translation.
 Relatively fast as compared to Interlingual translation.

 Cons:

 O(N2) sets transfer rules in multilingual machine translation.

 Proliferation of language-specific rules in lexicon and syntax.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 24/59

 Interlingual translation uses a language-independent, 'universal', and abstract

representation of the original text in order to generate the target text.
 The process of Interlingual translation involves analysis and generation.

EVENT:
breaking TENSE:
past
AGENT: Abebe
PATIENT:
window
Interlingu
DEFINI
a
Analysi TENES Generatio
s S: n
Source definite Target text
Abebetext
broke the አበበ መስኮቱን ሰበረው
window
 Interlingual translation is suitable for multilingual machine translation, and its main
drawback is that the definition of an Interlingua is difficult and maybe even impossible
for a wider domain.

of Computer Science, Addis Ababa University Applications of NLP 25/59

Analysi Interlingu Generatio

s a n
Conceptual Analysis Conceptual
Semanti
Generation Semanti Semanti
c c c
Structur Transfe Structur
e r e
Semantic Semantic
Analysis Generation
Syntacti
Syntacti c Syntacti
c Transfe c
Structur r Structur
Syntactic e e Syntactic
Analysis Generation
Direc
Word t Word
Structu Structu
re re
Morphological Morphological
Analysis Generation
Source text Target text

Department of Computer Science, SC, DDIT, DDU Applications of NLP 26/59

 Statistical Machine Translation (SMT) finds the most probable target sentence given a
source text sentence.
 Parameters of probabilistic models are derived from the analysis of bilingual text
corpora.

Bilingual Corpus of English and Amharic (Example)

Abebe ate besso. አበበ በሶ በላ።
Abebe bought አበበ በሶ ገዛ።
besso. Abebe threw አበበ ድንጋዩን ወረወረው።
the stone. Abebe አበበ ወደ ትምህርት ቤት ሄደ።
went to school.
ከበደ መኪና ገዛ።
Kebede bought a
ከበደ መኪናውን ገዛው።
car. Kebede bought
ከበደ መኪናዋን ገዛት።
the car. Kebede
bought the car. አልማዝ ሻይ አፈላች።
Almaz made tea. አልማዝ ጠላ ጠነሰሰች።
Almaz made tella.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 27/59

General Architecture of SMTs

Source text

Language Decoding Translation

Model Model

Target text

Department of Computer Science, SC, DDIT, DDU Applications of NLP 28/59

 Language Model tries to ensure that words come in the right order.
 Some notion of grammaticality
 Given an English string e, language model assigns p(e) by formula.
 Good English string  high p(e); and bad English string  low p(e).
 Calculated with:
 A statistical grammar such as a probabilistic context free grammar; or
 An n-gram language model.

Probabilistic context free grammar [example]

Grammar Probability Lexicon

Department of Computer Science, SC, DDIT, DDU Applications of NLP 29/59

 N-gram models can be computed from monolingual corpus.

 Unigram probabilities
አበበ በሶ በላ።
count (w1) 4 አበበ በሶ ገዛ።
p(w1) = total words observed p(አበበ) = 29 =0.138 አበበ ድንጋዩን ወረወረው።
አበበ ወደ ትምህርት ቤት ሄደ።
 Bigram probabilities ከበደ መኪና ገዛ።
count (w1w2) p(በሶ |አበበ) = count(አበበ ከበደ መኪናውን ገዛው።
p(w2|w1) = count (w1) count(አበ ከበደ መኪናዋን ገዛት።
በሶ) በ) አልማዝ ሻይ አፈላች።
= 24 = 0.500 አልማዝ ጠላ ጠነሰሰች።

 Trigram probabilities

count (w1w2w3) p(በላ|አበበ በሶ) count(አበበ በሶ

p(w3|w1w2) = =count(አበበ
count (w1w2)
በላ) በሶ)
= 12 = 0.500

Department of Computer Science, SC, DDIT, DDU Applications of NLP 30/59

 Similarly, higher order n-gram models can be computed.

 Problems:
 How can we deal with n-gram models if sentences are too long?
 What is p(አበበ ወደ ትምህርት ቤት ሄደ እና ድንጋዩን ወረወረው)?
 How can we deal with n-gram models if sentences are not seen in the corpus?
 What is p(ወረወረው | አበበ ወደ ትምህርት ቤት ሄደ እና ድንጋዩን)?
 Solutions:
 Smoothing: avoid zero probability
 For example, p(አበበ ወደ ትምህርት ቤት ሄደ እና ድንጋዩን ወረወረው)=0.00001
 Compute higher order n-gram models using lower order models such as bigram
and trigram.
 p(አበበ ወደ ትምህርት ቤት ሄደ እና ድንጋዩን ወረወረው) = p(አበበ|<s>) * p(ወደ|አበበ)
* p(ትምህርት|ወደ) * p(ቤት|ትምህርት) * p(ሄደ|ቤት) * p(እና|ሄደ) * p(ድንጋዩን|እና)
*p(ወረወረው|ድንጋዩን) * p(</s>|ወረወረው)

Department of Computer Science, SC, DDIT, DDU Applications of NLP 31/59

 The job of the translation model is to assign a probability that a given source language
sentence generates target language sentence.
 We can model the translation from a source language sentence S to a target language
sentence Tˆ as:
best-translation Tˆ = argmax faithfulness(T,S) * fluency(T)
T
 Suppose that we want to build a foreign-to-English machine
translation system.
 Thus, in a probabilistic model, the best English sentence e is the
one whose probability ê = argmax p(e|
f) e
p(e|f) is the highest.
 Bayes’ rule:
p(e|f ) = p(f|e) * p(e) /
p(f )
argmax p(e|f ) = argmax p(f|e) * p(e) /
e
p(f )
e ê = argmax p(f|e) * p(e) [for a given
f] e
 Noisy channel equation: Translation model Language model

Department of Computer Science, SC, DDIT, DDU Applications of NLP 32/59

 We are looking for the e that maximizes p(f|e) * p(e).

 We need to tell a generative “story” about how English strings become
foreign strings, i.e. we want to compute p(f|e) .
 When we see an actual foreign string f, we want to reason backwards ...
what English string e is:
 likely to be uttered; and
 likely to subsequently translate to f ?
 p(f|e) will be a module in overall foreign-to-English machine translation system.
 How do we assign values to p(f|e)?

count (f,e)
p(f |e) = count (e)

 Impossible because sentences are novel, so we would never have enough

data to find values for all sentences.
 For example:
p(Aበበ ወደ ትምህርት ቤት ሄደ እና ድንጋዩን ወረወረው|Abebe went to school and threw
the stone)=?

Department of Computer Science, SC, DDIT, DDU Applications of NLP 33/59

 Decompose the sentences into smaller chunks, like in language modeling.

p(f|e) = p(a, f|e)

 The variable a represents alignments between the individual chunks in the

sentence pair where the chunks in the sentence pair can be words or phrases.
 In word-based translation, the fundamental unit of translation is a word.
 Phrase-based translations translates whole sequences of words (called blocks or
phrases), where the lengths may differ.
 Blocks are not linguistic phrases but phrases found using statistical
methods from corpora.
 Most commonly used form of translation.
 The alignment probability p(a, f|e) is defined as follows:
m

p(a, f| e) = j=1 t(fj|e i) where t(fj|e i) is translation

 probability.
The translation probability t(fj|ei) is calculated by counting as follows:
count (fj , ei )
t(fj | ei )
= count (ei )

Department of Computer Science, SC, DDIT, DDU Applications of NLP 34/59

 Word translation table [example]

አበበ በሶ በላ ።

Abebe

ate

besso

 Unfortunately, it is difficult to get word aligned data to compute word translation

probability, so we can't do this directly.
 Use Expectation-Maximization (EM) algorithm.
 EM algorithm estimates model parameters by
 initiatilizing probabilities (e.g. uniformly);
 iteratively find the maximim likelihood of estimates of parameters.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 35/59

Source text
f

Language Model Decoding Translation Model

p(e) argmax p(f|e) * p(f|e)
e
p(e)

Target text
e

Department of Computer Science, SC, DDIT, DDU Applications of NLP 36/59

 A decoder searches for the best sequence of transformations that translates a source
sentence.
 Look up all translations of every source word or phrase, using word or phrase
translation table.
 Recombine the target language phrases that maximizes the translation model
probability * the language model probability.
 This search over all possible combinations can get very large so we need to find
ways of limiting the search space.
 Decoding is, therefore, a searching problem that can be reformulated as a classic
Artificial Intelligence problem, i.e. searching for the shortest path in an implicit graph.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 37/59

Pros and Cons

 Pros:
 Has a way of dealing with lexical ambiguity
 Can deal with idioms that occur in the training data
 Can be built for any language pair that has enough training data (language
independent)
 No need of language experts (requires minimal human effort)

 Cons:
 Does not explicitly deal with syntax

Department of Computer Science, SC, DDIT, DDU Applications of NLP 38/59

Choosing S M T
 Economic reasons:
 Low cost
 Rapid prototyping
 Practical reasons:
 Many language pairs don't have NLP resources, but do have parallel corpora
 Quality reasons:
 Uses chunks of human translated text as its building blocks
 Produces state of the art results (when very large data sets are available)

Department of Computer Science, SC, DDIT, DDU Applications of NLP 39/59

Materials Needed to Build S M T

 Parallel corpus
 For example, Negarit Gazette (ነጋሪት ጋዜጣ) is a useful resource for English-to-
Amharic machine translation or vice versa.

 Word alignment software

 For example, Giza++ is useful for testing the quality of automatically generated
word alignments.

 Language modeling toolkit

 For example, SRILM toolkit estimates n-gram probabilities.

 Decoder
 For example, Pharaoh (phrase-based decoder that builds phrase tables from
Giza++ word alignments and produces best translation for new input using the
phrase table plus SRILM language model)

Department of Computer Science, SC, DDIT, DDU Applications of NLP 40/59

 Fundamental idea:
 People do not translate by doing deep linguistics analysis of a sentence.
 They translate by decomposing sentence into fragments, translating each of
those, and then composing those properly.
 Uses the principle of analogy in translation
 Example:
Given the following translations:

የመጽሃፉ ዋጋ ከ500 ብር በላይ ነው  The price of the is more than 5 0 0

book Birr
የቤቱ ዋጋ ርካሽ ነው  The price of the is
house cheap
Based on the above examples, the following translation can be made:

የቤቱ ዋጋ ከ500 ብር በላይ ነው  The price of the is more than 5 0 0

house Birr

Department of Computer Science, SC, DDIT, DDU Applications of NLP 41/59

Information Applications and Approaches
Retrieval Rule-Based Machine Translation
Statistical Machine Translation
Information Example-Based Machine Translation
Extraction Machine Hybrid Approaches to Machine Translation
Translation Deep Learning approach to Machine Translation
Question-Answering and Dialogue
Example-Based Machine S ystem s
Text Summarization
Translation
Challenges
 Locating similar sentences
 Aligning sub-sentential fragments
 Combining multiple fragments of example translations into a single sentence
 Determining when it is appropriate to substitute one fragment for another
 Selecting the best translation out of many candidates

Pros and Cons

 Pros:
 Uses fragments of human translations which can result in higher quality

 Cons:
 May have limited coverage depending on the size of the example database, and
flexibility of matching heuristics

Department of Computer Science, SC, DDIT, DDU Applications of NLP 42/59

Information Applications and Approaches
Retrieval Rule-Based Machine Translation
Statistical Machine Translation
Information Example-Based Machine Translation
Extraction Machine Hybrid Approaches to Machine Translation
Translation Deep Learning approach to Machine Translation
Question-Answering and Dialogue
Hybrid Approaches to Machine S ystem s
Text Summarization
Translation
 Machine translation systems discussed so far have their own pros and cons.
 Hybrid systems take the synergy effect of rule-based, statistical and example-based
machine translations.
 Rules can be post-processed by statistics and/or examples
 Statistics guided by rules and/or examples
 Example:

If you prefer another please let me know

hotel Segment1 Segment2

Example-based Statistical Rule-based

translation translation translation
Alternative translations with confidence values
Selection module

Segment1 Segment2
Translated by rule-based Translated by example-based

Department of Computer Science, SC, DDIT, DDU Applications of NLP 43/59

 The application of deep learning approaches for machine translation is called Neural
Machine translation.
 Neural machine translation (NMT) is an approach to machine translation that uses
an artificial neural network to predict the likelihood of a sequence of words, typically
modeling entire sentences in a single integrated model.
 NMT is not a drastic step beyond what has been traditionally done in statistical machine
translation.
 Its main departure is the use of vector representations ("embeddings", "continuous
space representations") for words and internal states. The structure of the models is
simpler than phrase-based models.
 There is no separate language model, translation model, and reordering model, but just
a single sequence model that predicts one word at a time. However, this sequence
prediction is conditioned on the entire source sentence and the entire already produced
target sequence.
 NMT: it is simple architecture and ability in capturing long dependency in the sentence,
which indicates a huge potential in becoming a new trend of the mainstream.
 NMT needs less linguistic knowledge than other approaches but can produce a
competitive performance.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 44/59

 Motivation of NMT: The inspiration for neural machine translation comes from two
aspects: the success of Deep Learning in other NLP tasks as we mentioned, and the
unresolved problems in the development of MT itself.
 NMT task is originally designed as an end-to-end learning task. It directly processes a
source sequence to a target sequence. The learning objective is to find the correct
target sequence given the source sequence, which can be seen as a high dimensional
classification problem that tries to map the two sentences in the semantic space.

End-to-End structure in modern NMT model.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 45/59

 End-to-End means the model processes source data to target data directly, without
explicable intermediate result.
 End-to-End architecture of NMT contains four components.
 Source Sentence: is an input sentence for the models from the source natural
language.
 Encoder - is used to represent the source sentence to semantic vector.
 Decoder - makes prediction from this semantic vector to a target sentence.
 Target Sentence: is an output sentence from the model for the target natural
language.
 The common deep learning algorithm used for encoding and decoding in NMT are:
 Recurrent Neural networks such as Conventional RNN, LSTM, GRU, BRNN, BLSTM,
BGRU, these models with attention mechanism and Full attention mechanism
models.
 All the above models are used to model the word sequence of the source and target
natural languages.
 The encoder model is the first model that is used by the neural network to encode a
source sentence for a second model, known as a decoder.
 The decoder model is used to predict words in the target natural language.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 46/59

 Recurrent neural networks face difficulties in encoding long inputs into a single vector.

 This can be compensated by an attention mechanism which allows the decoder to focus
on different parts of the input while generating each word of the output.
 The common approaches of NMT are:
 NMT with Deep Learning
 NMT with Deep Learning and Attention Mechanism
 Full Attention based NMT e.g. Transformer

 The benefits of NMT compared with SMT:

 It is the newest method of MT and is said to create much more accurate
translations than SMT.
 It is based on the model of neural networks in the human brain, with information
being sent to different “layers” to be processed before output.
 It uses deep learning techniques to teach itself to translate text based on existing
statistical models. It makes for faster translations than the statistical method and
has the ability to create higher quality output.
 It is able to use algorithms to learn linguistic rules on its own from statistical
models. The biggest benefit to NMT is its speed and quality.
 It is said by many to be the way of the future, and the process will no doubt
continue to advance in its capabilities.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 47/59

Department of Computer Science, SC, DDIT, DDU Applications of NLP 48/59

Information
Retrieval NL Interfaces for Human-Machine
Information Interaction Question Answering
Extraction Machine S ystem s
Translation Dialogue S ystem s
Question-Answering and Dialogue
NL Interfaces for Human-Machine S ystem s
Text Summarization
Interaction

 Natural Languages (NLs) are increasingly becoming important interfaces styles in

Human-Computer Interaction (HCI).
 The growing popularity of natural language interfaces is due to the rise of human needs
to interact/communicate with computer systems to:
 get answers for real world questions; or
 make conversation in a coherent way about various topics.
 Two of the most important applications of NLP that deal with such issues are Question
Answering and Dialogue Systems.
 Question Answering (QA) System:
 A system that provides an answer or answer containing text for a given question
formulated using natural language.
 Dialogue System (DS):
 A system that converses with human beings in a coherent way.
 An extension of QA system, i.e., a two-way QA system.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 49/59

Question
Answer
Question Question
Answer Answer

HCI in Question Answering HCI in Dialogue Systems

Department of Computer Science, SC, DDIT, DDU Applications of NLP 50/59

Information
Retrieval NL Interfaces for Human-Machine
Information Interaction Question Answering
Extraction Machine S ystem s
Translation Dialogue S ystem s
Question-Answering and Dialogue
Question Answering Systems: S ystem s Questions and
Text Summarization
Answers

 QA Systems deal with a wide range of question types such as fact, list, “wh”-questions,
definition, hypothetical, and semantically-constrained questions.
 Search engines do not speak natural language.
 Human beings need to speak the language of search engines.
 QA Systems attempt to let human beings ask their questions in the normal way
using natural languages.
 QA Systems are important NLP applications especially for inexperienced users.
 QA Systems are closer to human beings than search engines are.
 QA Systems are viewed as natural language search engines.
 QA Systems are considered as next step to current search engines.
 Question answering can be approached from one of two existing NLP research areas:
 Information Retrieval: QA can be viewed as short passage retrieval.
 Information Extraction: QA can be viewed as open-domain information
extraction.
 The performance of QA Systems is heavily dependent on good search corpus.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 51/59

Information
Retrieval NL Interfaces for Human-Machine
Information Interaction Question Answering
Extraction Machine S ystem s
Translation Dialogue S ystem s
Question-Answering and Dialogue
Question Answering Systems: S ystem s Questions and
Text Summarization
Answers

 Answers are searched in collections from

 Database (e.g. documents in an organization):
 relies on small local document collections
 limited type of questions are allowed (e.g.
factoid questions)
 key words are used to represent the
question
 only shallow analysis is made to compile
answers
 closed-domain question answering
 Corpus data (e.g. web):
 relies on world knowledge
 entertains any type of question
 deep linguistic analysis is required (e.g. named entity recognition, relation
detection, co-reference resolution, word sense disambiguation, etc.
 open-domain question answering

Department of Computer Science, SC, DDIT, DDU Applications of NLP 52/59

Information
Retrieval NL Interfaces for Human-Machine
Information Interaction Question Answering
Extraction Machine S ystem s
Translation Dialogue S ystem s
Question-Answering and Dialogue
Question Answering Systems: S ystem s General Architecture
Text Summarization

Major Components of Q A S ystems

 Question Analysis: The natural language question input by the user needs to be
analyzed into whatever form or forms are needed by subsequent parts of the system.
 The user could be asked to clarify his or her question before proceeding.
 Candidate Document Selection: A subset of documents from the total document
collection (typically several orders of magnitude smaller) is selected, comprising those
documents deemed most likely to contain an answer to the question.
 This collection may need to be processed before querying, in order to transform
it into a form which is appropriate for real-time question answering.
 Candidate Document Analysis: If the preprocessing stage has only superficially
analyzed the documents in the document collection, then additional detailed analysis of
the candidates selected at the preceding stage may be carried out.
 Answer Extraction: Using the appropriate representation of the question and of each
candidate document, candidate answers are extracted from the documents and ranked
in terms of probable correctness.
 Response Generation: A response is returned to the user.
 This may be affected by the clarification request, and may in turn lead to the
response being updated.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 53/59

Information
Retrieval NL Interfaces for Human-Machine
Information Interaction Question Answering
Extraction Machine S ystem s
Translation Dialogue S ystem s
Question-Answering and Dialogue
Question Answering Systems: S ystem s General
Text Summarization
Architecture

User
Question Response

Clarification Request

Question Analysis Response

Generation

Question Representation Ranked Answers

Candidate Candidate Candidate Analyzed

Documents Answer
Document Documents Document
Extraction
Selection Analysis

Docume
nt
Collectio
n

Department of Computer Science, SC, DDIT, DDU Applications of NLP 54/59

Information
Retrieval NL Interfaces for Human-Machine
Information Interaction Question Answering
Extraction Machine S ystem s
Translation Dialogue S ystem s
Question-Answering and Dialogue
Question Answering Systems: S ystem s Deep Learning Approach
Text Summarization

Deep Learning approaches to Question and Answer system

 QA system developed using traditional approach used conventional linguistically-based NLP
techniques, such as parsing, part-of-speech tagging and coreference resolution.
 However, with recent developments in deep learning, neural network models have shown
promise for QA. Although these systems generally involve a smaller learning pipeline, they
require a significant amount of training.
 Deep learning based question and Answer systems uses :
 GRU and LSTM units allow recurrent neural networks to handle the longer texts
required for QA.
 Further improvements – such as attention mechanisms and memory networks –
allow the network to focus on the most relevant facts. Such networks provide the
current state-of-the-art performance for deep-learning-based QA.
 Common models developed for question and answer using deep learning are:
 Sequence-to-sequence model:
 Dynamic memory networks
 End-to-end memory networks

Department of Computer Science, SC, DDIT, DDU Applications of NLP 55/59

Information
Retrieval NL Interfaces for Human-Machine
Information Interaction Question Answering
Extraction Machine S ystem s
Translation Dialogue S ystem s
Question-Answering and Dialogue
Dialogue Systems: Modality of
S ystem s
Text Summarization
Conversation

 The modality of Dialogue Systems can be text-based, spoken-dialogue, graphical user

interface, or multi-modal.
 Text-Based:
 The conversation is made by making use of natural language texts.
 For example, ELIZA.
 Spoken Dialogue:
 The conversation is made by making use of voice.
 For example, HMIHY (how may I help you) developed at AT&T for call routing.
 Graphical User Interface:
 The conversation is made by making use of images.
 For example, Dialogue Boxes in Windows applications.
 Multimodal:
 The conversation is made by any combination of the above three modalities.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 56/59

Information
Retrieval NL Interfaces for Human-Machine
Information Interaction Question Answering
Extraction Machine S ystem s
Translation Dialogue S ystem s
Question-Answering and Dialogue
Dialogue Systems: Dialogue Interation
S ystem s
Text Summarization
M od es

 Dialogue Systems differ in the degree with which human or computer takes the
initiative.

Question
Answer
Question
Answer

Computer-Initiative Human-Initiative
 Computer maintains tight control  Human maintains tight control
 Human is highly restricted  Computer is highly restricted
 E.g., Dialogue Boxes  E.g., ELIZA
Mixed-Initiative
 Human and computer have flexibility to specify constraints
 Mainly research prototypes

Department of Computer Science, SC, DDIT, DDU Applications of NLP 57/59

Information
Retrieval NL Interfaces for Human-Machine
Information Interaction Question Answering
Extraction Machine S ystem s
Translation Dialogue S ystem s
Question-Answering and Dialogue
Dialogue Systems: Application
S ystem s
Text Summarization
Areas

 Currently, Dialogue Systems are used in specific domains such as:

 Customer service: Responding to customers' general questions about products
and services, e.g., answering questions about applying for a bank loan.
 Help desk: Responding to internal employee questions, e.g., responding to
human resource questions.
 Website navigation: Guiding customers to relevant portions of complex
websites, e.g., helping people determine where information or services reside
on a company's website.
 Guided selling: Providing answers and guidance in the sales process,
particularly for complex products being sold to novice customers.
 Technical support: Responding to technical problems, such as diagnosing a
problem with a device.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 58/59

Information
Retrieval NL Interfaces for Human-Machine
Information Interaction Question Answering
Extraction Machine S ystem s
Translation Dialogue S ystem s
Question-Answering and Dialogue
Dialogue Systems: General
S ystem s
Text Summarization
Architecture

General Architecture of Spoken Dialogue System

Human Computer System

Input Speech Natural

Interfac Recognitio Language
e n Understandin
g
I/O
Serve Dialogu
Knowled
r e
ge
Manage
Base
r
Output Text-to- Natural
Speech Language
Interfac Synthesis Generation
e
These components do not exist in Text-Based Dialogue Systems.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 59/59

Information
Retrieval NL Interfaces for Human-Machine
Information Interaction Question Answering
Extraction Machine S ystem s
Translation Dialogue S ystem s
Question-Answering and Dialogue
Dialogue Systems: General
S ystem s
Text Summarization
Architecture

 In the process of Natural Language Understanding, there are many ways to represent
the meaning of sentences.
 For dialogue systems, the most common is “frame and slot semantics”
representation.

Show me morning flights from Addis Ababa to London on Tuesday [Example]

SHOW:
FLIGHTS:
ORIGIN
CITY: Addis
Ababa DATE:
Tuesday TIME:
morning
DESTINATION
CITY:
London
DATE:
TIME:

Department of Computer Science, SC, DDIT, DDU Applications of NLP 60/59

Information
Retrieval NL Interfaces for Human-Machine
Information Interaction Question Answering
Extraction Machine S ystem s
Translation Dialogue S ystem s
Question-Answering and Dialogue
Dialogue Systems: General
S ystem s
Text Summarization
Architecture

 “Frame and slot semantics” can be generated by a semantic grammar.

Semantic grammar for the aforementioned request

LIST  show me | I want | can I see

Department of Computer Science, SC, DDIT, DDU Applications of NLP 61/59

Information
Retrieval NL Interfaces for Human-Machine
Information Interaction Question Answering
Extraction Machine S ystem s
Translation Dialogue S ystem s
Question-Answering and Dialogue
Dialogue Systems: Using Deep Learning
S ystem s
Text Summarization

 The General Architecture of Spoken Dialogue System contain different components

including:
 Speech Recognition
 Natural Language Understanding

 Domain Identification
 User Intent Detection
 Slot Filling
 Dialogue Manager

 Dialogue State Tracking (DST)

 Dialogue Policy Optimization
 Natural Language Generation
 Text-to-Speech Synthesis

 Deep Learning algorithms are applied to all these components of the Dialogue System,
and achieved a state of the art result.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 62/59

Information
Retrieval
Genres and Types of Text Summarization
Information
Computational Approach
Extraction Machine
Translation Deep Learning Approach to Text Summarization
Question-Answering and Dialogue
Genres and Types of Text S ystem s
Text Summarization
Summarization
 Automatic Text Summarization refers to the generation of a shortened version of a text
that still contains the most important points of the original text.
 The content of the summary depends on the purpose and/or type of summarization.
 Types of summaries can be the following:
 Indicative vs. informative
 Used for quick categorization vs. content processing.
 Extract vs. abstract
 Lists fragments of text vs. re-phrases content coherently.
 Generic vs. query-oriented
 Provides author’s view vs. reflects user’s interest.
 Background vs. just-the-news
 Assumes reader’s prior knowledge is poor vs. up-to-date.
 Single-document vs. multi-document source
 Based on one text vs. fuses together many texts.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 63/59

 The following general procedure used to build Text Summarization system:

 Given a corpus of documents and their summaries

 Label each sentence in the document as
summary-worthy or not compute document
keywords
score document sentences with respect to these
keywords
 Learn which sentences are likely to be included in a
summary
 Given an unseen (test document) classify sentences as
summary-worthy or not
Cohesion and coherence check: Spot anaphoric references and modify
text accordingly Balance and coverage: modify summary to have an
appropriate text structure

 There are two computational approaches to Text Summarization:

Department
 of Computer Science, SC, DDIT, DDU Applications of NLP 64/59
top-down; and
Information
Retrieval
Genres and Types of Text Summarization
Information
Computational Approach
Extraction Machine
Translation Deep Learning Approach to Text Summarization
Question-Answering and Dialogue
Computational Approach: Top-
S ystem s
Text Summarization
Down

 Top-down approach is considered as query-driven summarization.

 User needs only certain types of information.
 System needs particular criteria of interest, used to focus search.
 Criteria of interest can be modeled using:
 Templates with slots having semantic characteristics; or
 A set of important terms.
 Top-down approach can be implemented using Information Extraction (IE).
 IE task: Given a template and a text, find all the information relevant to each
slot of the template and fill it in.
 IE-for-summarization task: Given a query, select the best template, fill it in,
and generate the contents.
 Problems:
 IE works only for very particular templates; can it scale up?
 What about information that doesn’t fit into any template?
 Pros: higher quality, and also supports abstracting.
 Cons: low speed, and still needs to scale up to robust open-domain summarization.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 65/59

 Bottom-up approach is considered as text-driven summarization.

 User needs any information that is important.
 System uses strategies (importance metrics) over representation of whole text.
 Generic importance metrics can be modeled using:
 Degree of connectedness in semantic graphs; or
 Frequency of occurrence of tokens.
 Bottom-up approach can be implemented using Information Retrieval (IR).
 IR task: Given a query, find the relevant document(s) from a large set of
documents.
 IR-for-summarization task: Given a query, find the relevant passage(s)
from a
set of passages (i.e., from one or more
documents).
 Problems:
 IR techniques work on large volumes of data; can they scale down?
 IR works on words; do abstracts require abstract representations?
Cons: lower quality, and inability to manipulate information at abstract levels.
 Pros: robust, and good for query-oriented summaries.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 66/59

 Given the original text T and summary S, two measures are commonly used to
evaluate Text Summarization systems:

 Compression Ratio (CR)

Length (S)
CR =
Length (T)
 Retention Ratio (RR)

Information (S)
RR =
Information (T)
 Measuring length:
 Number of letters
 Number of words
 Number of sentences
 Measuring information:
 Question
 Shannon Game: test reader’s
quantify understanding.
information content.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 67/59

 Text Summarization is a sequence to sequence problem.

 Sequence-to-sequence learning is a training model that can convert sequences of one
input domain into the sequences of another output domain.
 It is generally used when the input and output of a model can be of variable lengths.

 Variants of Recurrent Neural Networks (RNNs), i.e. Gated Recurrent Neural Network (GRU)
or Long Short Term Memory (LSTM), are preferred as the encoder and decoder
components. This is because they are capable of capturing long term dependencies by
overcoming the problem of vanishing gradient.

 The general architecture of the end-to-end for text summarization is as follow.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 68/59

Department of Computer Science, SC, DDIT, DDU Applications of NLP 69/59

 End-to-end or Sequence-to-sequence model with an attention mechanism consists of

encoder, decoder and an attention layer.

 Word embeddings are a type of word representation that allows words with similar
meaning to have a similar representation.

 Attention mechanism is used to secure individual parts of the input which are more
important at that particular time.
 It can be implemented by taking inputs from each time steps and giving
weightage to time steps.
 The weightage depends on the contextual importance of that particular time
step.
 It helps pay attention to the most relevant parts of the input data sequence so
that the decoder can optimally generate the next word in the output sequence.

Department of Computer Science, SC, DDIT, DDU Applications of NLP 70/59

 Precision = Number of important sentences/Total number of sentences summarized.

 Recall = Total number of important sentences Retrieved / Total number of important

sentences present.

 BLEU - measures precision - how much the words (and/or n-grams) in the machine
generated summaries appeared in the human reference summaries.

 ROUGE - measures recall - how much the words (and/or n-grams) in the human
reference summaries appeared in the machine generated summaries.


F1-score - F1 = 2 * (Bleu * Rouge) / (Bleu + Rouge)

Department of Computer Science, SC, DDIT, DDU Applications of NLP 71/59

TOC Course Syllabus
:

Previous Approaches to NLP

:
Applications of NLP-Part-I
Current
Applications of NLP-Part- II
: Next:

Natural Language Processing Notes
No ratings yet
Natural Language Processing Notes
61 pages
Unit - 3 NLP
No ratings yet
Unit - 3 NLP
15 pages
RAG Beyond Text Enhancing Image Retrieval in RAG Systems
100% (1)
RAG Beyond Text Enhancing Image Retrieval in RAG Systems
6 pages
Dr. TV. Geetha
No ratings yet
Dr. TV. Geetha
176 pages
Ccs369-Unit 3
No ratings yet
Ccs369-Unit 3
28 pages
Ir - Chapter 1
No ratings yet
Ir - Chapter 1
7 pages
Unit Iii - Information Retrieval Design Features of Information Retrieval Systems
No ratings yet
Unit Iii - Information Retrieval Design Features of Information Retrieval Systems
57 pages
Natural Language Processing
100% (6)
Natural Language Processing
49 pages
NLP Syllabus R21
100% (1)
NLP Syllabus R21
2 pages
Thesis
No ratings yet
Thesis
154 pages
Applications of NLP
No ratings yet
Applications of NLP
85 pages
Tsion Adisu Final Thesis1
No ratings yet
Tsion Adisu Final Thesis1
88 pages
NLP LectureNotes UNIT 1
No ratings yet
NLP LectureNotes UNIT 1
55 pages
Unit-5 Ai
No ratings yet
Unit-5 Ai
74 pages
Advances On P2p Parallel Grid Cloud And Internet Computing Proceedings Of The 17th International Conference On P2p Parallel Grid Cloud And Internet Computing 3pgcic2022 1st Ed 2023 Leonard Barolli download
No ratings yet
Advances On P2p Parallel Grid Cloud And Internet Computing Proceedings Of The 17th International Conference On P2p Parallel Grid Cloud And Internet Computing 3pgcic2022 1st Ed 2023 Leonard Barolli download
69 pages
Chapter 1
No ratings yet
Chapter 1
52 pages
Monday - IR Fundamentals - Grace Yang - AFIRM19-IR
No ratings yet
Monday - IR Fundamentals - Grace Yang - AFIRM19-IR
77 pages
Unit-1-Natural Language Processing Applications
No ratings yet
Unit-1-Natural Language Processing Applications
63 pages
Chapter 8 - Applications of NLP
No ratings yet
Chapter 8 - Applications of NLP
72 pages
Lecture 07
No ratings yet
Lecture 07
59 pages
Lecture 07
No ratings yet
Lecture 07
59 pages
NLP M5 Part-1 SPP
No ratings yet
NLP M5 Part-1 SPP
55 pages
Applications of NLP: Introduction To Natural Language Processing (CSE 5321)
No ratings yet
Applications of NLP: Introduction To Natural Language Processing (CSE 5321)
59 pages
Applications of NLP
No ratings yet
Applications of NLP
48 pages
Information Retrieval - 1
No ratings yet
Information Retrieval - 1
47 pages
Ch2 - IR and LT
No ratings yet
Ch2 - IR and LT
45 pages
Chapter #7 Applicatios of NLP (Reading Ass)
No ratings yet
Chapter #7 Applicatios of NLP (Reading Ass)
58 pages
Large Language Models Encode Clinical Knowledge: Google Research, Deepmind
No ratings yet
Large Language Models Encode Clinical Knowledge: Google Research, Deepmind
44 pages
IR-Module 1 and 2
No ratings yet
IR-Module 1 and 2
48 pages
Unit1 Introduction
No ratings yet
Unit1 Introduction
31 pages
1 Overview
No ratings yet
1 Overview
44 pages
1stunit GN
No ratings yet
1stunit GN
36 pages
1 IR Introductionn
No ratings yet
1 IR Introductionn
30 pages
11 - Question Answering Systems
No ratings yet
11 - Question Answering Systems
34 pages
AI-Unit 5
No ratings yet
AI-Unit 5
32 pages
Chap 1
No ratings yet
Chap 1
23 pages
Lecture1 Chap1
No ratings yet
Lecture1 Chap1
22 pages
1.introduction Information Retrival
No ratings yet
1.introduction Information Retrival
31 pages
1 IR Introduction
No ratings yet
1 IR Introduction
23 pages
Module 6 Updated Final
No ratings yet
Module 6 Updated Final
48 pages
UNIT I IR Final
No ratings yet
UNIT I IR Final
26 pages
Introduction To Information Retrieval
No ratings yet
Introduction To Information Retrieval
50 pages
Application NLP
No ratings yet
Application NLP
23 pages
Natural Language Processing (NLP) (A Complete Guide)
No ratings yet
Natural Language Processing (NLP) (A Complete Guide)
26 pages
Chapter One IR
No ratings yet
Chapter One IR
18 pages
Recognizing Text Entailment - Tutorial
No ratings yet
Recognizing Text Entailment - Tutorial
74 pages
RAFT: Adapting Language Model To Domain Specific RAG: Vu Et Al. 2023 Lazaridou Et Al. 2022
No ratings yet
RAFT: Adapting Language Model To Domain Specific RAG: Vu Et Al. 2023 Lazaridou Et Al. 2022
11 pages
Information Retreival Methods
No ratings yet
Information Retreival Methods
19 pages
Information Retrival List of Experiment - Odd Sem 2024-25
No ratings yet
Information Retrival List of Experiment - Odd Sem 2024-25
23 pages
Unit Ii Part B 1. Write About Basic IR Model
No ratings yet
Unit Ii Part B 1. Write About Basic IR Model
17 pages
Question Answering, Information Retrieval, and Retrieval Augmented Generation
No ratings yet
Question Answering, Information Retrieval, and Retrieval Augmented Generation
22 pages
1 IR Chapter-One
No ratings yet
1 IR Chapter-One
47 pages
01rag For LLM A Survey
No ratings yet
01rag For LLM A Survey
21 pages
Retrieving and Reading - A Comprehensive Survey On Open-Domain Question Answering
No ratings yet
Retrieving and Reading - A Comprehensive Survey On Open-Domain Question Answering
21 pages
Unit 1: Introduction and Data Pre-Processing
No ratings yet
Unit 1: Introduction and Data Pre-Processing
71 pages
Downloaded From: Https://ray - Yorksj.ac - Uk/id/eprint/9863/: Institutional Repository Policy Statement
No ratings yet
Downloaded From: Https://ray - Yorksj.ac - Uk/id/eprint/9863/: Institutional Repository Policy Statement
18 pages
How Does NLP Benefit Legal System - A Summary of Legal Artificial Intelligence
No ratings yet
How Does NLP Benefit Legal System - A Summary of Legal Artificial Intelligence
13 pages
Opportunities in Natural Language Processing
No ratings yet
Opportunities in Natural Language Processing
60 pages
NLP Module 6
No ratings yet
NLP Module 6
30 pages
Unit-4 NLP
No ratings yet
Unit-4 NLP
21 pages
Introduction To NLP 2021
No ratings yet
Introduction To NLP 2021
13 pages
SLA Management in Reconfigurable Multi-Agent RAG: A Systems Approach To Question Answering
No ratings yet
SLA Management in Reconfigurable Multi-Agent RAG: A Systems Approach To Question Answering
10 pages
20200728204914D5872 - COMP6639 - Session 28 - Natural Language Processing
No ratings yet
20200728204914D5872 - COMP6639 - Session 28 - Natural Language Processing
29 pages
1 introIR
No ratings yet
1 introIR
15 pages
NLP Unleashed Transforming Communication & Insights
No ratings yet
NLP Unleashed Transforming Communication & Insights
12 pages
S41467-022-30761-2-Towards Artificial General Intelligence Via A Multimodal Foundation Model
No ratings yet
S41467-022-30761-2-Towards Artificial General Intelligence Via A Multimodal Foundation Model
13 pages
What Is Information Retrieval (IR)
No ratings yet
What Is Information Retrieval (IR)
15 pages
Intro Notes
No ratings yet
Intro Notes
11 pages
Information Retrieval: DR Sharifullah Khan Nust Seecs
No ratings yet
Information Retrieval: DR Sharifullah Khan Nust Seecs
32 pages
Nazia Chilimattur (ME204) IR
No ratings yet
Nazia Chilimattur (ME204) IR
10 pages
Haramaya University: College of Computing and Informatics Department of Computer Science
No ratings yet
Haramaya University: College of Computing and Informatics Department of Computer Science
22 pages
Chapter 6 NLP
No ratings yet
Chapter 6 NLP
16 pages
NLP-Based Query-Answering System For Information Extraction From Building Information Models
No ratings yet
NLP-Based Query-Answering System For Information Extraction From Building Information Models
11 pages
Unit 5 AI
No ratings yet
Unit 5 AI
9 pages
Artificial Intelligence in Information Retrieval
No ratings yet
Artificial Intelligence in Information Retrieval
5 pages
Module 3-2
No ratings yet
Module 3-2
17 pages
Information Retrieval and Web Search
No ratings yet
Information Retrieval and Web Search
29 pages
1 s2.0 S2666651021000176 Main
No ratings yet
1 s2.0 S2666651021000176 Main
6 pages
NLTK 3
No ratings yet
NLTK 3
5 pages
Unit V Notes Adbt Adbt
No ratings yet
Unit V Notes Adbt Adbt
7 pages
Unit 5 6 Pages Notes
No ratings yet
Unit 5 6 Pages Notes
3 pages
Retrieval-Augmented Generation (RAG)
No ratings yet
Retrieval-Augmented Generation (RAG)
2 pages
Detailed IR and NLP Answers
No ratings yet
Detailed IR and NLP Answers
3 pages
40 Common Interview Questions
100% (1)
40 Common Interview Questions
34 pages
Lang Models: 04 December 2024 23:03
No ratings yet
Lang Models: 04 December 2024 23:03
4 pages
How To Use ChatGPT To Write Better, Faster and More Effectively
92% (12)
How To Use ChatGPT To Write Better, Faster and More Effectively
20 pages
Automatic Image Annotation: Fundamentals and Applications
From Everand
Automatic Image Annotation: Fundamentals and Applications
Fouad Sabry
No ratings yet
Automatic Image Annotation: Enhancing Visual Understanding through Automated Tagging
From Everand
Automatic Image Annotation: Enhancing Visual Understanding through Automated Tagging
Fouad Sabry
No ratings yet
Seid Muhie
No ratings yet
Seid Muhie
128 pages