NLP Assignment

Uploaded by

birenmalik99

Paper Code: PEC-CSE702A

Paper Name: Speech & Natural Language Processing

Assignment List

To be solved for better understanding of the subject.

Very Short Questions
What is the purpose of a grammar-based language model?
What do you understand by statistical language modelling?
Define regular grammar.
Define finite state automaton.
State the difference between deterministic finite automata and non-deterministic finite
automata.
What can be done in morphological parsing?
What are stems?
What are affixes?
What is a lexicon?
What do you understand by morphotactics?
What is the minimum edit distance?
What do you understand by similarity key techniques?
What do you understand by n-gram based techniques?
What do you understand by rule-based techniques?
What is the part-of-speech tagging process?
Write a regular expression to represent the set of all strings over {a, b} of even length.
Write a regular expression to represent the set of all strings over {a, b} of length 4 starting
with an a.
Write a regular expression to represent the set of all strings over {a, b} containing at least one
a.
Write a regular expression to represent the set of strings over {a, b} having exactly 3 b's.
Write a regular expression to represent the set of all strings over {a, b} having abab as a
substring.
Write a regular expression to represent the set of all strings containing exactly 2 a's.
Write a regular expression to represent the set of all strings containing at least 2 a's.
Write a regular expression to represent the set of all strings containing at most 2 a's.
Write a regular expression to represent the set of all strings containing the substring aa.
Write a regular expression to represent all strings over {a, b} starting with any number of
a's, followed by one or more b's, followed by one or more a's, followed by a single b, followed
by any number of a's, followed by a b, and ending in any string of a's and b's.
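One way to sanity-check answers to the regular-expression questions above is to compare a candidate pattern with a plain Python predicate on every string over {a, b} up to a small length. The sketch below does this for three of the questions. The candidate patterns and the helper names `all_strings` and `agrees` are illustrative assumptions, not the expected answers, and agreement up to a bounded length is evidence rather than proof. Python writes union as `|` where the questions would use `+`.

```python
import re
from itertools import product


def all_strings(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)


def agrees(pattern, predicate, alphabet="ab", max_len=8):
    """True if `pattern` fully matches exactly the strings satisfying `predicate`."""
    regex = re.compile(pattern)
    return all(
        (regex.fullmatch(s) is not None) == predicate(s)
        for s in all_strings(alphabet, max_len)
    )


# Candidate answers (assumptions for illustration only).
candidates = {
    "even length": ("((a|b)(a|b))*", lambda s: len(s) % 2 == 0),
    "exactly 3 b's": ("a*ba*ba*ba*", lambda s: s.count("b") == 3),
    "contains abab": ("(a|b)*abab(a|b)*", lambda s: "abab" in s),
}

for name, (pattern, predicate) in candidates.items():
    print(name, agrees(pattern, predicate))
```

A pattern that fails the check, such as `(a|b)*a` for "at least one a", is rejected because it misses strings like `ab`.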
What is pragmatic ambiguity?
What do you understand by the term “word sense disambiguation”?
What is a corpus?
What do you understand by supervised learning?
What do you understand by unsupervised learning?
How does KNN work?
What is the speciality of the naïve Bayes classifier?
What is the purpose of rule-based taggers?
What is the purpose of stochastic taggers?
What is the purpose of hybrid taggers?
Define context-free grammar.
What is the purpose of a parse tree?
When is a grammar called ambiguous?
State the difference between top-down parsing and bottom-up parsing.
What is lexical ambiguity?
What is syntactic ambiguity?
What is semantic ambiguity?
Find the regular expression representing the set of all strings of the form a^m b^n c^p where m, n,
p >= 1.
Find the regular expression representing the set of all strings of the form a^m b^(2n) c^(3p) where m,
n, p >= 1.
Find the regular expression representing the set of all strings of the form a^n b a^(2m) b^2 where
m >= 0, n >= 1.

What are the main challenges in NLP?

Mention some areas of NLP application.

What is machine translation?
What is morphological segmentation?
What is tokenization?
“I saw bats” – contains which type of ambiguity?
“Rina loves her mother and Tina does too” - contains which type of ambiguity?
How many ambiguities exist in the sentence “I know little Italian”?
At which phase of language processing is a parse tree required?
At which phase is the meaning of a word checked?
What type of words are called “stop words”?
What is the use of TF-IDF?
What is the difference between formal and natural languages?
What do you understand by information extraction?

What are the various models of information extraction?

How can you differentiate Artificial Intelligence, Machine Learning, and Natural Language
Processing?
What is noise removal in NLP?
What are the stages in the lifecycle of a natural language processing (NLP) project?

What is meant by data augmentation?

How can data be obtained for NLP projects?
What do you mean by Text Extraction and Clean-up?

What is the meaning of Text Normalization in NLP?

What are the steps to follow when building a text classification system?
What do you mean by a Bag of Words?

Which step is required as a pre-processing step in language processing?

What is a unigram? Explain with an example.
What is a bigram? Explain with an example.
How do unigrams and bigrams differ?
How does an n-gram differ from a unigram and a bigram?
Describe the F1 score.
How do regular grammars and context-free grammars differ?
What is term frequency?
What is inverse document frequency?
Why is term weighting important?
What is the purpose of Jaccard’s coefficient?
Short Questions:
Briefly discuss the meaning components of a language.
Differentiate between the rationalist and empiricist approaches to natural language
processing.
List the motivation behind the development of computational models of languages.
Give the representation of a sentence in d-structure and s-structure in GB.
Compare GB and PG. Why is PG called syntactico-semantic theory?
What are the problems associated with n-gram model? How are these problems handled?
What are lexical rules? Give the complete entry for a verb in the lexicon, to be used in LFG.
Define a finite automaton that accepts the following language: (aa)*(bb)*.
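A finite automaton for a question like the one above can be written down as a transition table and simulated directly. The sketch below is one possible deterministic automaton for (aa)*(bb)*; the state names q0 to q3 are hypothetical, and missing table entries stand for an implicit dead state. The helper `matches_reference` cross-checks the automaton against Python's `re` module on all short strings, which is a bounded test rather than a proof.

```python
import re
from itertools import product

# Hypothetical state names:
#   q0 = even number of a's read, no b yet (start, accepting)
#   q1 = odd number of a's read
#   q2 = odd number of b's read
#   q3 = even, non-zero number of b's read (accepting)
# Any (state, symbol) pair missing from the table goes to a dead state.
TRANSITIONS = {
    ("q0", "a"): "q1", ("q0", "b"): "q2",
    ("q1", "a"): "q0",
    ("q2", "b"): "q3",
    ("q3", "b"): "q2",
}
START = "q0"
ACCEPTING = {"q0", "q3"}


def accepts(string):
    """Run the automaton on `string` and report whether it ends in an accepting state."""
    state = START
    for symbol in string:
        state = TRANSITIONS.get((state, symbol))
        if state is None:  # fell into the dead state
            return False
    return state in ACCEPTING


def matches_reference(max_len=8):
    """Compare the automaton with the regular expression on all strings up to max_len."""
    reference = re.compile("(aa)*(bb)*")
    for n in range(max_len + 1):
        for chars in product("ab", repeat=n):
            s = "".join(chars)
            if accepts(s) != (reference.fullmatch(s) is not None):
                return False
    return True


print(accepts("aabb"), accepts("aab"), matches_reference())
```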
Compute the minimum edit distance between paecful and peaceful.
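The minimum edit distance in the question above is usually computed with a dynamic-programming table. The sketch below is a standard Levenshtein-style implementation, assuming a cost of 1 for insertion and deletion; the `sub_cost` parameter is an assumption added here because some textbooks charge 2 for a substitution instead of 1.

```python
def min_edit_distance(source, target, sub_cost=1):
    """Minimum cost of turning `source` into `target`.

    Insertions and deletions cost 1; substitutions cost `sub_cost`.
    """
    rows, cols = len(source) + 1, len(target) + 1
    # dist[i][j] = cost of converting source[:i] into target[:j]
    dist = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = i  # delete all i characters
    for j in range(1, cols):
        dist[0][j] = j  # insert all j characters
    for i in range(1, rows):
        for j in range(1, cols):
            same = source[i - 1] == target[j - 1]
            dist[i][j] = min(
                dist[i - 1][j] + 1,                              # deletion
                dist[i][j - 1] + 1,                              # insertion
                dist[i - 1][j - 1] + (0 if same else sub_cost),  # match or substitution
            )
    return dist[-1][-1]


print(min_edit_distance("paecful", "peaceful"))
print(min_edit_distance("paecful", "peaceful", sub_cost=2))
```

Filling the table by hand for the same pair of words is a useful check that each cell takes the minimum of its three neighbours.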
Construct a finite automaton for the regular expression (a+b)*abb.
Construct a nondeterministic finite automaton accepting {ab, ba}, and use it to find a
deterministic automaton accepting the same set.
Construct a nondeterministic finite automaton accepting the set of all strings over {a, b}
ending in aba. Use it to construct DFA accepting the same set of strings.
Construct a transition system which can accept strings over the alphabet {a, b, …} containing
either cat or rat.
Find all strings of length 5 or less in the regular set represented by the following regular
expressions:
(a) (ab + a)*(aa + b)
(b) (a*b + b*a)*a
Describe, in the English language, the sets represented by the following regular expressions:
(a) a(a+b)*ab
(b) (aa+b)*(bb+a)*
Construct a finite automaton accepting all strings over {0, 1} ending in 010 or 0010.
Find a derivation tree of a*b + a*b in L(G), where G is given by S → S + S | S*S,
S → a | b.
A context-free grammar G has the following productions:
S → 0S0 | 1S1 | A, A → 2B3, B → 2B3 | 3. Describe the language generated by these productions.
Consider the following productions:
S → aB | bA
A → aS | bAA | a
B → bS | aBB | b
For the string aaabbabbba, find
(a) the leftmost derivation,
(b) the rightmost derivation,
and (c) the parse tree.
Show that the grammar S → a | abSb | aAb, A → bS | aAAb is ambiguous.
Show that the grammar S → aBb, A → aAB | a, B → ABb | b is ambiguous.
Give two possible parse trees for the sentence, Stolen painting found by tree.
What are the advantages and disadvantages of using semantic grammar?
Discuss lexical ambiguity.
Discuss syntactic ambiguity.
Discuss semantic ambiguity.
Discuss pragmatic ambiguity.
Discuss knowledge-based word sense disambiguation approaches.
Discuss corpus-based word sense disambiguation approaches.
Discuss Lesk’s algorithm.
Discuss Walker’s algorithm.
Discuss the concept of Bayesian classification.
Discuss the concept of Naïve Bayesian classification.
Discuss k-nearest neighbour algorithm with example.
Use systemic grammar to analyse the following sentences:
(a) Savitha will sing a song.
(b) Savitha sings a song.
Use the FUF grammar to build a fully unified FD for the following sentences:
(a) Savitha will sing a song.
(b) Savitha sings a song.
Discuss the concept of precision and recall and their use.
Discuss the basic information retrieval process.
Discuss Zipf’s law.
Discuss term weighting.
Long Answer type questions:
What is the role of transformational rules in transformational grammar? Explain with the
help of examples.
Write regular expressions for the following languages.
1. the set of all alphabetic strings;
2. the set of all lower-case alphabetic strings ending in a b;
3. the set of all strings from the alphabet a,b such that each a is immediately preceded by and
immediately followed by a b;
Write regular expressions for the following languages. By “word”, we mean an alphabetic
string separated from other words by whitespace, any relevant punctuation, line breaks, and
so forth.
1. the set of all strings with two consecutive repeated words (e.g., “Humbert Humbert” and
“the the” but not “the bug” or “the big bug”);
2. all strings that start at the beginning of the line with an integer and that end at the end of
the line with a word;
3. all strings that have both the word grotto and the word raven in them (but not, e.g., words
like grottos that merely contain the word grotto);
4. write a pattern that places the first word of an English sentence in a register. Deal with
punctuation
What are f-structure and C-structure in LFG? Consider the following sentence and explain
both the structure, “She saw stars in the sky”
What are Karaka relations? Explain the difference between Karta and Agent.
Calculate the bigram probability for each of the words of the sentence “I want Chinese
food.”
We are given the following corpus:
<s> I am Sam </s>
<s> Sam I am </s>
<s> I am Sam </s>
<s> I do not like green eggs and Sam </s>
Using a bigram language model with add-one smoothing, what is P(Sam | am)? Include <s>
& </s> in your counts just like any other token.
We are given the following corpus:
<s> I am Sam </s>
<s> Sam I am </s>
<s> I am Sam </s>
<s> I do not like green eggs and Sam </s>
If we use linear interpolation smoothing between a maximum-likelihood bigram model and
a maximum-likelihood unigram model with λ1 = 1/2 and λ2 = 1/2, what is P(Sam | am)?
Include <s> and </s> in your counts just like any other token.
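The two estimates asked for in the corpus questions above can be checked by counting unigrams and bigrams directly. The sketch below is a minimal illustration, assuming that every sentence uses the capitalised token Sam, that <s> and </s> are counted like ordinary tokens, and that the vocabulary size for add-one smoothing is the number of distinct tokens including those two markers. The function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction

# Assumption: "Sam" is capitalised in every sentence of the corpus.
corpus = [
    "<s> I am Sam </s>",
    "<s> Sam I am </s>",
    "<s> I am Sam </s>",
    "<s> I do not like green eggs and Sam </s>",
]
sentences = [line.split() for line in corpus]

unigrams = Counter(tok for sent in sentences for tok in sent)
bigrams = Counter(pair for sent in sentences for pair in zip(sent, sent[1:]))
vocab_size = len(unigrams)            # distinct tokens, including <s> and </s>
total_tokens = sum(unigrams.values())


def p_mle_bigram(word, prev):
    """Maximum-likelihood bigram estimate: count(prev word) / count(prev)."""
    return Fraction(bigrams[(prev, word)], unigrams[prev])


def p_mle_unigram(word):
    """Maximum-likelihood unigram estimate: count(word) / total tokens."""
    return Fraction(unigrams[word], total_tokens)


def p_add_one(word, prev):
    """Add-one (Laplace) smoothed bigram estimate."""
    return Fraction(bigrams[(prev, word)] + 1, unigrams[prev] + vocab_size)


def p_interpolated(word, prev, lam_bigram=Fraction(1, 2), lam_unigram=Fraction(1, 2)):
    """Linear interpolation of the bigram and unigram maximum-likelihood estimates."""
    return lam_bigram * p_mle_bigram(word, prev) + lam_unigram * p_mle_unigram(word)


print(p_add_one("Sam", "am"))
print(p_interpolated("Sam", "am"))
```

Using `Fraction` keeps the results exact, which makes them easier to compare with a hand calculation than floating-point values.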
How can words be handled in the tagging process?
Comment on the validity of the following statements:
a) Rule-based taggers are non-deterministic
b) Stochastic taggers are language independent
c) Brill’s tagger is a rule-based tagger
Construct a transition system corresponding to the regular expressions (i) (ab + c*)*b and
(ii) a + bb + bab*a.
Construct a finite automaton recognizing L(G), where G is the grammar S → aS | bA | b and
A → aA | bS | a.
Construct a deterministic finite automaton equivalent to the grammar S → aS | bS | aA, A →
bB, B → aC, C → Λ.
Find a reduced grammar equivalent to the grammar G whose productions are S → AB | CA,
B → BC | AB, A → a, C → aB | b.
Construct a reduced grammar equivalent to the grammar
S → aAa, A → Sb | bCC | DaA, C → abb | DD, E → aC, D → aDA.
Let G be S → AB, A → a, B → C | b, C → D, D → E and E → a. Eliminate unit productions and get
an equivalent grammar.
Reduce the following grammar G to CNF. G is S → aAD, A → aB | bAB, B → b and D → d.
Construct a grammar in Greibach normal form equivalent to the grammar S → AA | a, A → SS | b.
Convert the grammar S → AB, A → BS | b, B → SA | a into GNF.
Find a grammar in GNF equivalent to the grammar
E → E + T | T, T → T*F | F, F → (E) | a.
Discuss the disadvantages of the basic top-down parser with the help of an appropriate
example.
Tabulate the sequence of states created by the CYK algorithm while parsing the sentence
“The sun rises in the east.”
Use the following grammar:
S → NP VP        S → VP          NP → Det Noun
NP → Noun        NP → NP PP      VP → VP NP
VP → Verb        VP → VP PP      PP → Preposition NP
Give two possible parse trees of the sentence: “Pluck the flower with stick.”
Introduce lexicon rules for words appearing in the sentence. Using these parse trees obtain
maximum likelihood estimates for the grammar rules used in the tree. Calculate probability
of any one parse tree using these estimates.
For each of the following verbs give all the selectional restrictions possible: think, eat, bark,
fly.
Find out the number of senses of ‘still’, ‘bat’ and ‘cricket’ in WordNet.
Given the sentence, The shopkeeper showed a Dell laptop to Suha and she liked it very
much, find possible referents for the pronouns ‘she’ and ‘it’, and give the score of these
referents for the following antecedent indicators: definiteness, referential distance, indicating
verbs, term preference, section heading, non-prepositional noun phrase, collocation.
Identify the coherence relation between the following sentences:
(a) There is a train on Platform 6.
(b) Its destination is New Delhi.
(c) There is another train on Platform 7.
(d) Its destination is Varanasi.
Given the following document:
The oldest Chinese language we know about is on oracle bones. Priests scratched questions
on animal bones and then held the bones in a fire so that they cracked. The places where the
cracks crossed the pictograms were thought to give the answers from the god.
Assume that raw term frequency is used and the stop words are ‘the’, ‘we’, ‘is’, ‘on’, ‘and’,
‘then’ , ‘in’ , ‘a’, ‘so’ ‘that’, ‘they’, ‘were’, ‘to’, ‘where’, ‘but’, ‘only’, ‘out’.
Find the vector representation of the above document. Use porter stemmer for stemming.
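The first steps of the vector construction in the question above (tokenising, dropping the listed stop words, and counting raw term frequency) can be sketched as follows. This is a partial illustration only: it deliberately omits the Porter stemming step, which would be applied to each remaining token before counting and which would merge forms that stem to the same string. The function name `term_frequencies` is hypothetical, and the tokeniser simply lower-cases the text and keeps runs of letters.

```python
import re
from collections import Counter

document = (
    "The oldest Chinese language we know about is on oracle bones. "
    "Priests scratched questions on animal bones and then held the bones "
    "in a fire so that they cracked. The places where the cracks crossed "
    "the pictograms were thought to give the answers from the god."
)

STOP_WORDS = {
    "the", "we", "is", "on", "and", "then", "in", "a", "so", "that",
    "they", "were", "to", "where", "but", "only", "out",
}


def term_frequencies(text, stop_words):
    """Raw term frequency of each non-stop-word token (no stemming applied)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tok for tok in tokens if tok not in stop_words)


tf = term_frequencies(document, STOP_WORDS)
for term in sorted(tf):
    print(term, tf[term])
```

The sorted list of terms gives the dimensions of the vector and the counts give its components, before stemming.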
Give evidence where the use of WordNet relations for query expansion might have an
adverse effect on the performance of an IR system.
Construct the conceptual graph for the following tagged sentence.
A/DT methodology/NNP for/IN calculating/VBG square/JJ root/NNP.
