Module 1
2. Explain, with examples, the different types of NER attributes. CO1 BL2 10 Marks
3. What do you understand by Natural Language Processing (NLP)? CO1 BL1 2 Marks
5. List any two real-life applications of NLP. CO1 BL1 2 Marks
6. Explain the difference between precision and recall in information retrieval. CO1 BL2 5 Marks
12. Why is multiword tokenization preferred over single-word tokenization? CO1 BL1 2 Marks
16. List the different types of morphology available. CO1 BL1 2 Marks
17. What is the difference between NLP and NLU? CO1 BL1 2 Marks
19. State the difference between word and sentence tokenization. CO1 BL1 2 Marks
20. What are the phases of problem-solving in NLP? CO1 BL1 5 Marks
21. Explain the process of word tokenization with an example. CO1 BL1 5 Marks
22. How does a Named Entity Recognizer work? CO1 BL1 5 Marks
23. What are the benefits of eliminating stop words? Give some examples where stop word
elimination may be harmful. CO1 BL3 5 Marks
24. What do you mean by RegEx? Explain with an example. CO1 BL1 5 Marks
26. Write a regular expression to represent the set of all strings over {a, b} of even length.
CO1 BL3 5 Marks
27. Write a regular expression to represent the set of all strings over {a, b} of length 4 starting with
an a. CO1 BL3 5 Marks
28. Write a regular expression to represent the set of all strings over {a, b} containing at least one
a. CO1 BL3 5 Marks
29. Compare and contrast NLTK and spaCy, highlighting their differences. CO1 BL2 5 Marks
30. What is a Bag of Words? Explain with examples. CO1 BL2 5 Marks
31. Differentiate regular grammar and regular expression. CO1 BL3 5 Marks
32. Describe the word and sentence tokenization steps with the help of an example.
CO1 BL2 10 Marks
33. How can the common challenges faced in morphological analysis in natural language
processing be overcome? CO1 BL3 10 Marks
34. Derive the Minimum Edit Distance algorithm and compute the minimum edit distance between
the words “MAM” and “MADAM”. CO1 BL4 10 Marks
35. Discuss the problem-solving approaches of any two real-life applications of Information
Extraction and NER in Natural Language Processing. CO1 BL1 10 Marks
36. How would you approach solving an NLP application? Justify with an example. CO4 BL5 10 Marks
37. What are corpora? Describe the steps of creating a corpus for a specific task. CO1 BL2 10 Marks
39. State the different applications of Sentiment analysis and Opinion mining with examples.
Write down the variations as well. CO1 BL3 10 Marks
42. Do you think there are any differences between tokenization and normalization? Justify your
answer with examples. CO4 BL5 10 Marks
43. What makes part-of-speech (POS) tagging crucial in NLP, in your opinion? Give an example
to back up your response. CO4 BL4 5 Marks
44. Criticize the shortcomings of the fundamental Top-Down Parser. CO1 BL3 5 Marks
45. Do you believe there are any distinctions between prediction and classification? Illustrate
with an example. CO1 BL1 5 Marks
46. Explain the connection between word tokenization and phrase tokenization using examples.
How do both tokenization methods contribute to the development of NLP applications?
CO1 BL3 10 Marks
47. “Natural Language Processing (NLP) has many real-life applications across various
industries.” List any two real-life applications of Natural Language Processing. CO1 BL1 5 Marks
48. Find all strings of length 5 or less in the regular set represented by the following regular
expressions:
(a) (ab + a)*(aa + b)
3. The set of all strings over the alphabet {a, b} such that each a is immediately preceded by and
immediately followed by a b. CO1 BL4 10 Marks
51. Differentiate regular grammar and regular expression. CO1 BL3 5 Marks
58. Find the minimum edit distance between the two strings ELEPHANT and RELEVANT.
CO3 BL5 10 Marks
59. Given str1 = "SUNDAY" and str2 = "SATURDAY", calculate the minimum edit distance
between the two strings. CO1 BL5 10 Marks
60. List the different types of morphology available. CO4 BL2 5 Marks
63. State, with an example, the difference between stemming and lemmatization. CO4 BL4 5 Marks
64. Write down the different stages of the NLP pipeline. CO1 BL4 10 Marks
65. What is your understanding of chatbots in the context of NLP? CO3 BL3 10 Marks
66. Write a short note on text pre-processing in the context of NLP. Discuss outliers and how to
handle them. CO3 BL2 10 Marks
67. Explain, with an example, the challenges of sentence tokenization. CO3 BL3 5 Marks
68. Explain some of the common NLP tasks. CO1 BL2 5 Marks
69. What do you mean by text extraction and cleanup? Discuss with examples.
CO3 BL2 10 Marks
70. What is word sense ambiguity in NLP? Explain with examples. CO3 BL1 5 Marks
71. Write a short note on Bag of Words (BOW). CO1 BL3 10 Marks
74. Consider a document containing 100 words wherein the word apple appears 5 times and
assume we have 10 million documents and the word apple appears in one thousandth of these.
Then, calculate the term frequency and inverse document frequency. CO4 BL5 10 Marks
75. Explain the relationship between Singular Value Decomposition, Matrix Completion and
Matrix Factorization. CO1 BL3 5 Marks
76. Give two examples that illustrate the significance of regular expressions in NLP.
CO1 BL1 5 Marks
77. Why is multiword tokenization preferable over single word tokenization in NLP? Give
examples. CO1 BL1 5 Marks
78. Differentiate between formal language and natural language. CO3 BL1 10 Marks
79. Explain lexicon, lexeme and the different types of relations that hold between lexemes.
CO1 BL1 10 Marks
80. State the advantages of a bottom-up chart parser compared to top-down parsing.
CO1 BL1 10 Marks
82. Describe the Skip-gram model and its intuition in word embeddings. CO1 BL2 10 Marks
83. Explain the concept of Term Frequency-Inverse Document Frequency (TF-IDF) based ranking
in information retrieval. CO1 BL2 10 Marks
84. Tokenize and tag the following sentence: CO1 BL1 2 Marks
85. What different pronunciations and parts-of-speech are involved? CO1 BL1 2 Marks
86. Compute the edit distance (using insertion cost 1, deletion cost 1, substitution cost 1) of
“intention” and “execution”. Show your work using the edit distance grid. CO1 BL4 10 Marks
87. What is the purpose of constructing corpora in Natural Language Processing (NLP) research?
CO1 BL2 5 Marks
88. What role do regular expressions play in searching and manipulating text data?
CO1 BL3 5 Marks
89. Explain the purpose of WordNet in Natural Language Processing (NLP). CO1 BL4 10 Marks
91. Describe the class of strings matched by the following regular expressions: a. [a-zA-Z]+
b. [A-Z][a-z]* CO1 BL4 10 Marks
92. Extract all email addresses from the following: “Contact us at [email protected] or
[email protected].” CO1 BL4 10 Marks
93. This regex is intended to match one or more uppercase letters followed by zero or more
digits. [A-Z] + [0-9]* However, it has a problem. What is it, and how can it be fixed?
CO1 BL4 10 Marks
94. Write a regex to find all dates in a text. The date formats should include:
DD-MM-YYYY
MM-DD-YYYY
95. Compute the minimum edit distance between the words MAMA and MADAAM.
CO1 BL5 10 Marks
96. Evaluate the minimum edit distance in transforming the word ‘kitten’ to ‘sitting’ using
insertion, deletion, and substitution cost as 1. CO1 BL5 10 Marks
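
Several questions above (34, 58, 59, 86, 95 and 96) ask for a minimum edit distance worked out on a grid. The following is a minimal sketch of the standard dynamic-programming (Levenshtein) recurrence that may help when checking a hand-computed grid. The function name `min_edit_distance` and its parameter names are illustrative assumptions, not part of the question bank, and all three costs default to 1 as stated in questions 86 and 96.

```python
def min_edit_distance(source, target, ins_cost=1, del_cost=1, sub_cost=1):
    """Return the minimum edit distance from source to target.

    Hypothetical helper for checking hand-worked grids; costs default to 1.
    """
    rows, cols = len(source) + 1, len(target) + 1
    # dist[i][j] = cost of turning source[:i] into target[:j]
    dist = [[0] * cols for _ in range(rows)]

    # First column: delete every character of the source prefix.
    for i in range(1, rows):
        dist[i][0] = dist[i - 1][0] + del_cost
    # First row: insert every character of the target prefix.
    for j in range(1, cols):
        dist[0][j] = dist[0][j - 1] + ins_cost

    for i in range(1, rows):
        for j in range(1, cols):
            same = source[i - 1] == target[j - 1]
            dist[i][j] = min(
                dist[i - 1][j] + del_cost,                       # deletion
                dist[i][j - 1] + ins_cost,                       # insertion
                dist[i - 1][j - 1] + (0 if same else sub_cost),  # match or substitution
            )
    return dist[-1][-1]


if __name__ == "__main__":
    # Word pairs taken from the questions above.
    for a, b in [("MAM", "MADAM"), ("SUNDAY", "SATURDAY"), ("kitten", "sitting")]:
        print(a, "->", b, ":", min_edit_distance(a, b))
```

Passing `sub_cost=2` gives the variant in which a substitution is charged as one deletion plus one insertion; the questions above that state costs use 1 for all three operations.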