GRADE-10
ARTIFICIAL INTELLIGENCE - PREBOARD 1 - REVISION WORKSHEET
CHAPTERS: NLP AND EVALUATION
I. CHOOSE THE CORRECT ANSWER: (PART-B)
Natural Language Processing (NLP) MCQs
1. Which of the following is NOT a common NLP task?
a) Sentiment analysis
b) Part-of-speech tagging
c) Image recognition
d) Named entity recognition
Answer: c) Image recognition
2. Which type of ambiguity occurs when a word has multiple meanings?
a) Lexical ambiguity
b) Syntactic ambiguity
c) Pragmatic ambiguity
d) Morphological ambiguity
Answer: a) Lexical ambiguity
3. What does "tokenization" mean in NLP?
a) Removing stop words from text
b) Splitting text into smaller units, such as words or sentences
c) Stemming words to their base form
d) Identifying named entities in text
Answer: b) Splitting text into smaller units, such as words or sentences
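Note: as a quick illustration of tokenization, the sketch below splits a short text into sentence tokens and word tokens using only Python's standard library; library tokenizers (for example in NLTK) handle punctuation and edge cases more carefully, so this is only a rough illustration.

import re

text = "NLP is fun. It powers chatbots and translators."

# Sentence tokenization: split after sentence-ending punctuation (a rough heuristic).
sentences = re.split(r"(?<=[.!?])\s+", text)

# Word tokenization: pull out alphabetic words, dropping punctuation.
words = re.findall(r"[A-Za-z]+", text)

print(sentences)  # ['NLP is fun.', 'It powers chatbots and translators.']
print(words)      # ['NLP', 'is', 'fun', 'It', 'powers', 'chatbots', 'and', 'translators']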
4. Which NLP technique is used to remove suffixes from words to get their root form?
a) Lemmatization
b) Tokenization
c) Stemming
d) Parsing
Answer: c) Stemming
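Note: the sketch below is a naive, hand-written stemmer that simply strips a few common suffixes to reach an approximate root form; real stemmers such as the Porter stemmer apply a much larger rule set, so treat this only as an illustration of the idea of suffix removal.

def naive_stem(word):
    # Strip a few common English suffixes; a real stemmer has many more rules.
    for suffix in ("ing", "ly", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[:-len(suffix)]
    return word

print([naive_stem(w) for w in ["playing", "played", "plays", "quickly"]])
# ['play', 'play', 'play', 'quick']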
5. What is the purpose of a stop-word list in text preprocessing?
a) To include domain-specific words
b) To exclude commonly used words that add little meaning
c) To rank keywords in a document
d) To improve the syntactic structure of text
Answer: b) To exclude commonly used words that add little meaning
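Note: a minimal sketch of stop-word removal, assuming a small hand-written stop-word list (libraries such as NLTK ship much larger ready-made lists):

stop_words = {"the", "is", "a", "an", "and", "of", "to", "in"}

tokens = "the weather in the city is pleasant and sunny".split()
filtered = [t for t in tokens if t not in stop_words]

print(filtered)  # ['weather', 'city', 'pleasant', 'sunny']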
Bag of Words Algorithm MCQs
1. What is the main purpose of the Bag of Words (BoW) model?
a) Capture word order in a text
b) Represent text as a numerical feature vector
c) Identify the grammatical structure of a sentence
d) Generate word embeddings
Answer: b) Represent text as a numerical feature vector
2. In the BoW model, the rows of the matrix represent:
a) Unique words in the corpus
b) Documents in the corpus
c) Sentences in the corpus
d) Word embeddings
Answer: b) Documents in the corpus
3. What does the value of a cell in a BoW matrix indicate?
a) Frequency of a word in a document
b) Position of a word in the text
c) Total word count in the document
d) Similarity score between words
Answer: a) Frequency of a word in a document
4. Which of the following is a disadvantage of the Bag of Words model?
a) It captures semantic relationships between words
b) It is computationally inexpensive
c) It ignores the order of words in a sentence
d) It works well for small datasets
Answer: c) It ignores the order of words in a sentence
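Note: the sketch below builds a small Bag of Words matrix by hand. Rows are documents, columns are the unique words of the corpus, and each cell holds the frequency of that word in that document; notice that word order is lost, which is the limitation asked about above.

docs = ["the cat sat on the mat", "the dog sat"]

# Vocabulary: all unique words in the corpus (the columns of the matrix).
vocab = sorted(set(word for doc in docs for word in doc.split()))

# One row per document, one count per vocabulary word.
bow_matrix = [[doc.split().count(word) for word in vocab] for doc in docs]

print(vocab)       # ['cat', 'dog', 'mat', 'on', 'sat', 'the']
print(bow_matrix)  # [[1, 0, 1, 1, 1, 2], [0, 1, 0, 0, 1, 1]]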
5. Which of the following processes is NOT part of text normalization?
A. Tokenization
B. Stemming
C. Lowercasing
D. Data visualization
Answer: D. Data visualization
6. What is the main purpose of text normalization in NLP?
A. To increase the dataset size
B. To standardize text data for analysis
C. To create a graphical representation of text
D. To remove all punctuation from text
Answer: B. To standardize text data for analysis
7. Which of the following techniques is used to reduce words to their base or root form?
A. Tokenization
B. Lemmatization
C. Stopword removal
D. Sentence segmentation
Answer: B. Lemmatization
8. What is the difference between stemming and lemmatization?
A. Stemming is faster but less accurate than lemmatization
B. Lemmatization ignores grammar rules, while stemming follows them
C. Stemming produces only nouns, while lemmatization includes verbs
D. There is no difference between them
Answer: A. Stemming is faster but less accurate than lemmatization
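Note: to make the stemming-versus-lemmatization contrast concrete, the sketch below uses NLTK's Porter stemmer and WordNet lemmatizer. It assumes NLTK is installed and the WordNet data has been downloaded, and exact outputs can vary slightly across NLTK versions.

# pip install nltk, then: import nltk; nltk.download('wordnet')
from nltk.stem import PorterStemmer, WordNetLemmatizer

stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()

for word in ["studies", "running", "better"]:
    # Stemming chops suffixes quickly (e.g. "studies" -> "studi");
    # lemmatization consults a dictionary (e.g. "studies" -> "study").
    print(word, stemmer.stem(word), lemmatizer.lemmatize(word, pos="v"))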
9. Lowercasing in text normalization involves:
A. Removing lowercase letters from text
B. Converting all text to lowercase
C. Replacing lowercase words with their synonyms
D. Ignoring words that are already lowercase
Answer: B. Converting all text to lowercase
10. Which of the following can be used to remove unnecessary words such as "the," "is," or
"and" from text?
A. Stopword removal
B. Tokenization
C. Lemmatization
D. Stemming
Answer: A. Stopword removal
11. If you want to replace all occurrences of numbers in text with a specific token like <NUM>,
which process would you use?
A. Lemmatization
B. Noise removal
C. Numerical substitution
D. Regular expression substitution
Answer: D. Regular expression substitution
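Note: a minimal sketch of the substitution described above, replacing every run of digits with the placeholder token <NUM> using Python's re module:

import re

text = "Order 42 shipped on 15 March with 3 items."

# Replace every run of digits with the placeholder token <NUM>.
normalized = re.sub(r"\d+", "<NUM>", text)

print(normalized)  # Order <NUM> shipped on <NUM> March with <NUM> items.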
12. In what situation would removing punctuation during text normalization NOT be
recommended?
A. When punctuation helps in sentiment analysis
B. When performing topic modeling
C. When normalizing text for keyword search
D. When preparing text for machine translation
Answer: A. When punctuation helps in sentiment analysis
13. What does TF-IDF primarily measure in a text document?
A. The position of terms in a document
B. The importance of terms in a document relative to a corpus
C. The number of documents in a corpus
D. The similarity between two documents
Answer: B. The importance of terms in a document relative to a corpus
14. In TF-IDF, what does "TF" stand for?
A. Text Frequency
B. Term Frequency
C. Token Frequency
D. Total Frequency
Answer: B. Term Frequency
15. What does "IDF" (Inverse Document Frequency) penalize in the calculation of TF-IDF?
A. Words that are too frequent across the corpus
B. Words that are unique to a document
C. Words that appear in the first paragraph
D. Words with multiple meanings
Answer: A. Words that are too frequent across the corpus
16. What is the main disadvantage of using TF-IDF for document representation?
A. It captures semantic meaning but not word frequency
B. It cannot be used for large corpora
C. It ignores the order of words in text
D. It assigns the same weight to all terms in a document
Answer: C. It ignores the order of words in text
17. In a scenario where a word appears in almost every document of a corpus, what will its IDF
score be?
A. High
B. Low
C. Zero
D. Undefined
Answer: B. Low
18. Which of the following is a common use case of TF-IDF in NLP?
A. Named entity recognition
B. Summarization of text
C. Feature extraction for text classification
D. Parsing grammar rules
Answer: C. Feature extraction for text classification
19. TF-IDF values for words in a document are:
A. Always integers
B. Always in the range of 0 to 1
C. Weighted values based on frequency and document relevance
D. Normalized probabilities of word occurrence
Answer: C. Weighted values based on frequency and document relevance
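Note: a small worked sketch of TF-IDF, assuming the common definitions TF = (count of term in document) / (total terms in document) and IDF = log(N / number of documents containing the term). Real libraries such as scikit-learn use smoothed variants, so exact values differ, but the pattern is the same: a term appearing in almost every document gets an IDF near log(1) = 0 and therefore a low weight.

import math

docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are pets",
]
N = len(docs)

def tf_idf(term, doc):
    words = doc.split()
    tf = words.count(term) / len(words)
    df = sum(1 for d in docs if term in d.split())  # documents containing the term
    idf = math.log(N / df)                          # common term -> low IDF
    return tf * idf

print(round(tf_idf("cat", docs[0]), 3))  # rarer term: noticeable weight
print(round(tf_idf("the", docs[0]), 3))  # appears in 2 of 3 docs: smaller weight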
20. What does syntax in NLP deal with?
A. The structure and grammar of sentences
B. The meaning of sentences
C. Identification of entities in text
D. Analysis of emotions in sentences
Answer: A. The structure and grammar of sentences
21. Which of the following is an example of a syntactically incorrect sentence?
A. "The dog barks loudly."
B. "Dog the loudly barks."
C. "The sun is bright."
D. "She sings beautifully."
Answer: B. "Dog the loudly barks."
22. Semantics in NLP primarily focuses on:
A. Correct arrangement of words
B. Assigning meaning to words and sentences
C. Removing punctuation from text
D. Tokenizing sentences into words
Answer: B. Assigning meaning to words and sentences
23. Which of the following is an example of a sentence that is syntactically correct but
semantically incorrect?
A. "The cat sleeps on the mat."
B. "The quick brown fox jumps over the lazy dog."
C. "The green idea sleeps furiously."
D. "Runs dog the park in."
Answer: C. "The green idea sleeps furiously."
24. What is a syntax tree used for in NLP?
A. Representing the grammatical structure of a sentence
B. Determining the emotional tone of a text
C. Identifying named entities in a sentence
D. Translating text into another language
Answer: A. Representing the grammatical structure of a sentence
25. In NLP, semantic analysis is useful for:
A. Resolving grammatical errors
B. Understanding the context and meaning of words
C. Splitting a sentence into tokens
D. Removing stopwords from text
Answer: B. Understanding the context and meaning of words
26. Which of the following tasks is most related to syntax in NLP?
A. Part-of-Speech (POS) tagging
B. Word sense disambiguation
C. Sentiment analysis
D. Named entity recognition
Answer: A. Part-of-Speech (POS) tagging
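Note: as a small illustration of the syntax side, the sketch below tags parts of speech with NLTK. It assumes NLTK is installed and the 'punkt' and 'averaged_perceptron_tagger' resources have been downloaded; the tags come from the Penn Treebank set (DT determiner, NN noun, RB adverb, and so on).

# pip install nltk, then download the 'punkt' and 'averaged_perceptron_tagger' resources
import nltk

tokens = nltk.word_tokenize("The dog barks loudly.")
print(nltk.pos_tag(tokens))
# Roughly: [('The', 'DT'), ('dog', 'NN'), ('barks', 'VBZ'), ('loudly', 'RB'), ('.', '.')]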
27. Which of the following best describes the relationship between syntax and semantics?
A. Syntax focuses on sentence structure, while semantics focuses on meaning.
B. Syntax focuses on the meaning of sentences, while semantics focuses on grammar.
C. Syntax and semantics are unrelated in NLP.
D. Syntax focuses on tokens, while semantics focuses on punctuation.
Answer: A. Syntax focuses on sentence structure, while semantics focuses on meaning.
28. Which of these NLP applications primarily involves semantic analysis?
A. Grammar checking
B. Word sense disambiguation
C. Sentence segmentation
D. Parsing
Answer: B. Word sense disambiguation
29. In NLP, a sentence that is syntactically correct but semantically meaningless is known
as:
A. A grammatically incorrect sentence
B. A semantically ambiguous sentence
C. A syntactically valid but nonsensical sentence
D. A semantically rich sentence
Answer: C. A syntactically valid but nonsensical sentence
EVALUATION MCQs
30. What is the primary purpose of evaluation in the AI project cycle?
A. To collect data for training models
B. To measure the performance of the AI system
C. To deploy the AI model into production
D. To visualize the dataset
Answer: B. To measure the performance of the AI system
31. Which of the following metrics is commonly used to evaluate classification models?
A. Mean Absolute Error (MAE)
B. Confusion Matrix
C. Root Mean Square Error (RMSE)
D. BLEU Score
Answer: B. Confusion Matrix
32. Which of the following is NOT a key step in evaluation during the AI project cycle?
A. Defining evaluation metrics
B. Comparing the model with baseline methods
C. Gathering raw data from sensors
D. Conducting error analysis
Answer: C. Gathering raw data from sensors
33. Which of these evaluation metrics is specifically used for regression models?
A. Precision
B. F1-Score
C. Mean Squared Error (MSE)
D. ROC-AUC
Answer: C. Mean Squared Error (MSE)
34. What is the role of a validation set in the evaluation process?
A. To train the AI model
B. To fine-tune hyperparameters
C. To compare the model with other algorithms
D. To test the model in real-world scenarios
Answer: B. To fine-tune hyperparameters
35. Which of the following is a qualitative method of evaluation in AI projects?
A. Analyzing the confusion matrix
B. Conducting user feedback sessions
C. Calculating accuracy
D. Computing recall
Answer: B. Conducting user feedback sessions
36. Why is it important to evaluate an AI system against a baseline model?
A. To measure if the model is overfitting
B. To identify whether the new model outperforms simple approaches
C. To calculate the precision-recall tradeoff
D. To test the model's deployment readiness
Answer: B. To identify whether the new model outperforms simple approaches
37. When an AI model performs well on the training data but poorly on evaluation data, it is
likely:
A. Underfitting
B. Overfitting
C. Well-generalized
D. Regularized
Answer: B. Overfitting
38. Which metric is most appropriate to evaluate the balance between precision and recall in
a binary classification problem?
A. Accuracy
B. F1-Score
C. Mean Absolute Error
D. Log Loss
Answer: B. F1-Score
39. What is the purpose of a test set in the evaluation process?
A. To train the AI model
B. To evaluate the final performance of the AI model on unseen data
C. To validate hyperparameter tuning
D. To optimize the training process
Answer: B. To evaluate the final performance of the AI model on unseen data
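Note: a sketch of how a labelled dataset is typically split so that the validation set can guide hyperparameter tuning and the test set is kept aside for the final evaluation on unseen data; the 70/15/15 proportions used here are only an illustrative assumption.

import random

data = list(range(100))      # stand-in for 100 labelled samples
random.seed(42)
random.shuffle(data)

train = data[:70]            # used to fit the model
validation = data[70:85]     # used to tune hyperparameters
test = data[85:]             # used once, for the final performance estimate

print(len(train), len(validation), len(test))  # 70 15 15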
40. What is the primary purpose of a confusion matrix?
A. To visualize model architecture
B. To measure classification performance
C. To optimize hyperparameters
D. To reduce overfitting
Answer: B. To measure classification performance
41. Which value in a confusion matrix represents the number of correctly classified positive
samples?
A. True Positive (TP)
B. False Positive (FP)
C. True Negative (TN)
D. False Negative (FN)
Answer: A. True Positive (TP)
42. In a confusion matrix, False Negatives (FN) refer to:
A. Negative samples incorrectly predicted as positive
B. Positive samples incorrectly predicted as negative
C. Negative samples correctly predicted as negative
D. Positive samples correctly predicted as positive
Answer: B. Positive samples incorrectly predicted as negative
43. What does the sum of all the values in a confusion matrix represent?
A. Total number of correctly classified samples
B. Total number of misclassified samples
C. Total number of samples in the dataset
D. Total number of features in the dataset
Answer: C. Total number of samples in the dataset
44. Precision measures the ratio of:
A. Correct positive predictions to total actual positives
B. Correct negative predictions to total negatives
C. Correct positive predictions to total positive predictions
D. Correct positive predictions to total predictions
Answer: C. Correct positive predictions to total positive predictions
45. Which of the following metrics balances Precision and Recall?
A. Accuracy
B. F1-Score
C. Specificity
D. Log Loss
Answer: B. F1-Score
46. In a confusion matrix, what does False Positive (FP) indicate?
A. A negative sample classified as positive
B. A positive sample classified as negative
C. A correctly classified negative sample
D. A correctly classified positive sample
Answer: A. A negative sample classified as positive
47. If a model predicts all samples as positive, which metric will be high regardless of
performance?
A. Precision
B. Recall
C. Accuracy
D. Specificity
Answer: B. Recall
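Note: to tie the confusion-matrix questions together, here is a worked sketch with assumed counts (TP = 40, FP = 10, FN = 5, TN = 45). It also shows why a model that predicts every sample as positive gets perfect recall even though its other metrics suffer.

TP, FP, FN, TN = 40, 10, 5, 45
total = TP + FP + FN + TN                  # all samples in the dataset

accuracy  = (TP + TN) / total              # (40 + 45) / 100 = 0.85
precision = TP / (TP + FP)                 # 40 / 50 = 0.80
recall    = TP / (TP + FN)                 # 40 / 45 ≈ 0.89
f1        = 2 * precision * recall / (precision + recall)  # ≈ 0.84

print(accuracy, precision, round(recall, 2), round(f1, 2))

# "Predict everything positive": FN becomes 0, so recall = TP / (TP + 0) = 1.0,
# even though precision falls to the share of actual positives in the data.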