1. What is NLP?
Answer: NLP (Natural Language Processing) is a subfield of artificial intelligence that focuses on the
interaction between computers and humans using natural language. It involves tasks like text
analysis, language generation, and understanding.
2. What are some common NLP tasks?
Answer: Common NLP tasks include:
• Tokenization
• Sentiment Analysis
• Machine Translation
3. What is tokenization?
Answer: Tokenization is the process of splitting text into smaller units called tokens, which can be
words, phrases, or sentences.
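As an illustration, a minimal word-level tokenizer can be sketched with Python's `re` module (a simplification: real tokenizers handle contractions, Unicode, and multi-word units far more carefully):

```python
import re

def tokenize(text):
    """Split text into word tokens, treating punctuation marks as separate tokens."""
    # \w+ matches runs of letters/digits; [^\w\s] matches a single punctuation mark
    return re.findall(r"\w+|[^\w\s]", text)

print(tokenize("NLP isn't hard, right?"))
# ['NLP', 'isn', "'", 't', 'hard', ',', 'right', '?']
```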
4. What is stemming?
Answer: Stemming is the process of reducing words to their base or root form (e.g., "running" →
"run").
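A toy suffix-stripping stemmer conveys the idea (real stemmers such as the Porter stemmer apply many more rules and special cases):

```python
def stem(word):
    """Naive suffix-stripping stemmer (toy version, for illustration only)."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            # Undouble a trailing consonant, e.g. "running" -> "runn" -> "run"
            if len(word) >= 2 and word[-1] == word[-2] and word[-1] not in "aeiou":
                word = word[:-1]
            return word
    return word

print(stem("running"))  # run
print(stem("cats"))     # cat
```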
5. What is lemmatization?
Answer: Lemmatization reduces words to their base or dictionary form (lemma), considering the
context (e.g., "better" → "good").
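Unlike stemming, lemmatization needs dictionary knowledge; a tiny lookup-table version sketches this (real lemmatizers, e.g. WordNet-based ones, use large lexicons plus part-of-speech information):

```python
# Tiny illustrative lemma dictionary; a real lemmatizer has tens of thousands of entries.
LEMMAS = {"better": "good", "ran": "run", "mice": "mouse", "is": "be"}

def lemmatize(word):
    """Return the dictionary form of a word, falling back to the word itself."""
    return LEMMAS.get(word, word)

print(lemmatize("better"))  # good
```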
6. What are stop words?
Answer: Stop words are common words (e.g., "the", "is", "and") that are often removed to focus on the meaningful words in text analysis.
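Stop-word removal is a one-line filter once a stop-word list is chosen (the list below is a tiny illustrative one; libraries ship much larger, language-specific lists):

```python
STOP_WORDS = {"the", "is", "and", "a", "an", "of", "to", "in"}  # tiny illustrative list

def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list (case-insensitively)."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]

print(remove_stop_words(["The", "cat", "is", "on", "the", "mat"]))
# ['cat', 'on', 'mat']
```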
7. What is POS (part-of-speech) tagging?
Answer: POS tagging assigns grammatical labels (e.g., noun, verb, adjective) to each word in a sentence.
8. What is Named Entity Recognition (NER)?
Answer: NER identifies and classifies entities in text into categories such as names, dates, and organizations.
9. What is a corpus?
Answer: A corpus is a large and structured collection of texts used for linguistic analysis and training
NLP models.
10. What is the difference between syntax and semantics?
Answer: Syntax refers to the structure of sentences, while semantics deals with the meaning of
words and sentences.
11. What is a word embedding?
Answer: A word embedding is a dense vector representation of a word that captures semantic relationships (e.g., Word2Vec, GloVe).
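Because embeddings are vectors, semantic similarity can be measured geometrically; a common choice is cosine similarity. The 3-dimensional vectors below are made up for illustration (real embeddings have hundreds of dimensions):

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1 = same direction, 0 = orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Hypothetical toy embeddings, chosen so related words point in similar directions
embeddings = {
    "king":  [0.80, 0.65, 0.10],
    "queen": [0.75, 0.70, 0.15],
    "apple": [0.10, 0.05, 0.90],
}
print(cosine_similarity(embeddings["king"], embeddings["queen"]))  # close to 1
print(cosine_similarity(embeddings["king"], embeddings["apple"]))  # much lower
```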
12. What is Word2Vec?
Answer: Word2Vec is a neural network-based model that learns word embeddings by predicting a word from its surrounding context (CBOW) or the context from a word (Skip-gram).
13. What is GloVe?
Answer: GloVe (Global Vectors for Word Representation) is an unsupervised learning algorithm that obtains word embeddings by factorizing a word co-occurrence matrix.
14. What is a language model?
Answer: A language model predicts the probability of a sequence of words; it is often used in text generation and speech recognition.
15. What is an n-gram?
Answer: An n-gram is a contiguous sequence of n items (words or characters) from a given text.
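Extracting n-grams from a token list is a short sliding-window operation:

```python
def ngrams(tokens, n):
    """Return the list of contiguous n-grams from a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

print(ngrams(["the", "cat", "sat", "on", "the", "mat"], 2))
# [('the', 'cat'), ('cat', 'sat'), ('sat', 'on'), ('on', 'the'), ('the', 'mat')]
```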
16. What is sentiment analysis?
Answer: Sentiment analysis determines the emotional tone or opinion expressed in text (e.g., positive, negative, neutral).
17. What is text normalization?
Answer: Text normalization is the process of transforming text into a consistent format (e.g., lowercasing, removing punctuation).
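A basic normalization pipeline can be written with the standard library alone (which normalization steps to apply is task-dependent; this sketch shows lowercasing, punctuation removal, and whitespace collapsing):

```python
import string

def normalize(text):
    """Lowercase, strip ASCII punctuation, and collapse whitespace."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())

print(normalize("  Hello, World!!  "))  # hello world
```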
18. What is the difference between rule-based and statistical NLP?
Answer: Rule-based NLP uses handcrafted linguistic rules, while statistical NLP relies on machine learning and data-driven approaches.
19. What is a confusion matrix?
Answer: A confusion matrix evaluates a classification model by tabulating true positives, false positives, true negatives, and false negatives.
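For the binary case, the four cells of the confusion matrix are simple counts over the label/prediction pairs:

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN for a binary classifier."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1   # predicted positive, actually positive
            else:
                fp += 1   # predicted positive, actually negative
        else:
            if t == positive:
                fn += 1   # predicted negative, actually positive
            else:
                tn += 1   # predicted negative, actually negative
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}

print(confusion_counts([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]))
# {'TP': 2, 'FP': 1, 'TN': 1, 'FN': 1}
```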
22. What is precision and recall?
Answer: Precision measures the accuracy of positive predictions, while recall measures the fraction
of true positives correctly identified.
23. What is the F1-score?
Answer: The F1-score is the harmonic mean of precision and recall, providing a balance between the two.
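All three metrics follow directly from the confusion-matrix counts:

```python
def precision_recall_f1(tp, fp, fn):
    """Precision = TP/(TP+FP); recall = TP/(TP+FN); F1 = their harmonic mean."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

p, r, f = precision_recall_f1(tp=8, fp=2, fn=4)
print(p, r, f)  # 0.8, ~0.667, ~0.727
```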
24. What is overfitting?
Answer: Overfitting occurs when a model performs well on training data but poorly on unseen data due to excessive complexity.
25. What is underfitting?
Answer: Underfitting occurs when a model is too simple to capture the patterns in the data, resulting in poor performance on both training and test data.
27. What is the difference between supervised and unsupervised learning in NLP?
Answer: Supervised learning uses labeled data, while unsupervised learning works with unlabeled
data to find patterns.
28. What is a chatbot?
Answer: A chatbot is an NLP application that simulates human conversation using text or voice.
29. What is machine translation?
Answer: Machine translation automatically translates text from one language to another (e.g., Google Translate).
30. What is text summarization?
Answer: Text summarization generates a concise summary of a longer text while retaining the key information.
31. What is Seq2Seq?
Answer: Seq2Seq (sequence-to-sequence) is a framework for tasks like machine translation in which an input sequence is mapped to an output sequence using encoder-decoder architectures.
32. What is the attention mechanism?
Answer: The attention mechanism allows models to focus on the most relevant parts of the input sequence, improving performance in tasks like translation.
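The core computation is scaled dot-product attention: score each key against the query, softmax the scores into weights, and take a weighted sum of the values. A pure-Python sketch for a single query (real implementations batch this over matrices):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    # Weighted sum of the value vectors
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(len(values[0]))]

out = attention([1.0, 0.0], keys=[[1.0, 0.0], [0.0, 1.0]], values=[[10.0, 0.0], [0.0, 10.0]])
print(out)  # weighted toward the first value, since the query matches the first key
```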
33. What is the Transformer architecture?
Answer: The Transformer is a neural network architecture built entirely on self-attention, with no recurrence or convolution; its stacked encoder and decoder layers underlie models such as BERT and GPT.
34. What is GPT?
Answer: GPT (Generative Pre-trained Transformer) is a language model that uses autoregressive transformers for text generation.
35. What is the difference between BERT and GPT?
Answer: BERT is bidirectional and focuses on understanding context, while GPT is unidirectional (left-to-right) and focuses on text generation.
36. What is transfer learning?
Answer: Transfer learning involves taking pre-trained models (e.g., BERT, GPT) and fine-tuning them for specific tasks.
37. What is fine-tuning?
Answer: Fine-tuning is the process of adapting a pre-trained model to a specific task by training it on a smaller, task-specific dataset.
38. What is zero-shot learning?
Answer: Zero-shot learning is when a model performs tasks it was never explicitly trained on, relying on generalizable knowledge.
39. What is few-shot learning?
Answer: Few-shot learning adapts a model to a new task using only a handful of labeled examples.
40. What is a pre-trained language model?
Answer: A pre-trained language model is trained on a large corpus and can then be fine-tuned for specific NLP tasks.
41. What is perplexity?
Answer: Perplexity measures how well a language model predicts a sample; lower values indicate better performance.
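Concretely, perplexity is the exponential of the average negative log-probability the model assigns to each token:

```python
import math

def perplexity(token_probs):
    """Perplexity of a sequence, given the model's probability for each token."""
    n = len(token_probs)
    log_sum = sum(math.log(p) for p in token_probs)
    return math.exp(-log_sum / n)

print(perplexity([0.25, 0.25, 0.25, 0.25]))  # 4.0: as uncertain as a uniform 4-way choice
print(perplexity([0.9, 0.8, 0.95]))          # close to 1: the model is confident
```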
42. What is beam search?
Answer: Beam search is a decoding algorithm used in text generation that keeps the k most likely partial sequences at each step to approximate the most likely overall sequence.
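A minimal sketch of the mechanism, with the simplifying assumption that each step's token probabilities are given by a fixed table (a real model conditions them on the prefix generated so far):

```python
import math

def beam_search(step_probs, beam_width=2):
    """Keep the beam_width highest-scoring partial sequences at each step."""
    beams = [([], 0.0)]  # (token sequence, cumulative log-probability)
    for probs in step_probs:
        candidates = []
        for seq, score in beams:
            for token, p in probs.items():
                candidates.append((seq + [token], score + math.log(p)))
        # Prune to the beam_width best partial sequences
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams[0][0]

steps = [{"the": 0.6, "a": 0.4}, {"cat": 0.7, "dog": 0.3}]
print(beam_search(steps))  # ['the', 'cat']
```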
43. What is greedy search?
Answer: Greedy search selects the most likely word at each step of text generation, without considering future steps.
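Under the same fixed-table simplification as above, greedy decoding is a one-liner, since it never looks past the current step:

```python
def greedy_decode(step_probs):
    """Pick the single most likely token at each step (no lookahead)."""
    return [max(probs, key=probs.get) for probs in step_probs]

steps = [{"the": 0.6, "a": 0.4}, {"mat": 0.3, "cat": 0.7}]
print(greedy_decode(steps))  # ['the', 'cat']
```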
45. What is a dependency tree?
Answer: A dependency tree represents the grammatical structure of a sentence as directed head-dependent relations between words (e.g., a verb linked to its subject and object).
46. What is coreference resolution?
Answer: Coreference resolution identifies expressions that refer to the same entity in a text.
47. What is semantic role labeling?
Answer: Semantic role labeling identifies the roles that words play in a sentence (e.g., agent, patient).
48. What is LDA?
Answer: LDA (Latent Dirichlet Allocation) is a probabilistic model used for topic modeling that represents documents as mixtures of topics.
49. What is word sense disambiguation?
Answer: Word sense disambiguation determines the correct meaning of a word based on its context.
50. What is self-supervised learning?
Answer: Self-supervised learning creates supervised tasks from unlabeled data, such as predicting masked words in BERT.
51. What is masked language modeling?
Answer: Masked language modeling involves predicting masked-out words in a sentence; it is used in models like BERT.
52. What is contrastive learning?
Answer: Contrastive learning trains models to distinguish between similar and dissimilar pairs of data points.
53. What is multi-task learning?
Answer: Multi-task learning trains a model on multiple related tasks simultaneously to improve generalization.