Test DS


1. What is the purpose of a scatter plot in data visualization?
a) Displaying the distribution of categorical variables
b) Showing the relationship between two numerical variables
c) Highlighting the correlation between features
d) Visualizing time series data
2. In machine learning, what is the main goal of dimensionality reduction?
a) Increasing the number of features
b) Improving model complexity
c) Reducing the size of the dataset
d) Capturing relevant information while reducing noise
3. Which technique is commonly used to address the issue of overfitting in machine learning?
a) Regularization
b) Data augmentation
c) Feature engineering
d) Ensemble methods
4. What does the AUC-ROC curve measure in binary classification?
a) Model accuracy
b) Precision
c) Recall
d) True positive rate vs. false positive rate
5. Which algorithm is suitable for clustering data when the number of clusters is known in advance?
a) K-means
b) Hierarchical clustering
c) DBSCAN
d) Random Forest
6. Which statistical test is used to determine if there is a significant difference between the means of two groups?
a) ANOVA
b) Chi-squared test
c) T-test
d) Pearson correlation
7. What is the purpose of the Levenshtein distance metric in natural language processing?
a) Measuring document similarity
b) Evaluating sentiment analysis
c) Calculating word embeddings
d) Quantifying the difference between two strings
8. Which machine learning technique can handle both classification and regression tasks?
a) Linear regression
b) Decision trees
c) Naive Bayes
d) Support Vector Machines
9. What is the "bias-variance trade-off" in machine learning?
a) Balancing the trade-off between bias and fairness in models
b) Balancing the trade-off between underfitting and overfitting
c) Balancing the trade-off between feature selection and feature extraction
d) Balancing the trade-off between model accuracy and interpretability
10. Which method is used for handling class imbalance in a binary classification problem?
a) Data augmentation
b) Feature scaling
c) Regularization
d) Principal Component Analysis (PCA)
11. Which technique is used to assess the significance of variables in a linear regression model?
a) p-value
b) R-squared
c) F-statistic
d) Mean squared error
12. What does the term "bagging" refer to in ensemble learning?
a) Training multiple models sequentially
b) Training multiple models in parallel and averaging their predictions
c) Reducing the number of features in a dataset
d) Combining models using a weighted average
13. What is the purpose of the sigmoid activation function in a neural network?
a) Introducing non-linearity
b) Regularizing model parameters
c) Calculating the mean squared error
d) Scaling input features
14. In a decision tree, what is the "Gini impurity" used for?
a) Measuring the variance of the target variable
b) Calculating the entropy of the target variable
c) Quantifying the purity of a node's class distribution
d) Assessing the correlation between features
15. Which algorithm is used for optimizing hyperparameters in machine learning models?
a) Gradient descent
b) K-means
c) Grid search
d) Hierarchical clustering
16. What is the purpose of the L1 regularization term in linear regression?
a) Reducing bias in the model
b) Penalizing large coefficients
c) Increasing model complexity
d) Improving convergence of optimization algorithms
17. Which technique is used to prevent the "curse of dimensionality" in machine learning?
a) Regularization
b) Feature scaling
c) Dimensionality reduction
d) Ensemble learning
18. What is the Kullback-Leibler (KL) divergence used for in probability theory?
a) Measuring the similarity between two probability distributions
b) Calculating the variance of a dataset
c) Evaluating the goodness of fit of a model
d) Assessing the linearity of a regression model
19. Which method is commonly used for imputing missing values in a dataset?
a) Removing rows with missing values
b) Filling missing values with the mean of the feature
c) Ignoring missing values during analysis
d) Replacing missing values with the mode of the feature
20. What is the purpose of the Viterbi algorithm in Hidden Markov Models (HMM)?
a) Calculating the likelihood of an observation sequence
b) Estimating the parameters of the model
c) Decoding the most likely sequence of hidden states
d) Smoothing noisy observations
21. Which technique is used to prevent overfitting in decision trees?
a) Pruning
b) Bagging
c) Boosting
d) Feature scaling
22. What is the goal of natural language processing (NLP)?
a) Simulating human intelligence
b) Generating random text
c) Reducing the dimensionality of text data
d) Extracting and understanding information from text
23. Which evaluation metric is appropriate for imbalanced multi-class classification problems?
a) Accuracy
b) Precision-recall curve
c) F1-score
d) Mean squared error
24. What does the term "one-hot encoding" refer to in data preprocessing?
a) Converting categorical variables into numerical values
b) Combining multiple features into a single feature
c) Reducing the dimensionality of data
d) Transforming continuous variables into binary vectors
25. Which algorithm is used for reducing the dimensionality of high-dimensional data?
a) Naive Bayes
b) K-means clustering
c) Principal Component Analysis (PCA)
d) Random Forest
26. What is the purpose of the Jensen-Shannon divergence in probability theory?
a) Measuring the similarity between two probability distributions
b) Calculating the mean of a dataset
c) Estimating the variance of a distribution
d) Assessing the linearity of a regression model
27. Which method is used for text data preprocessing to remove unnecessary words and reduce dimensionality?
a) One-hot encoding
b) Word embedding
c) Stopword removal
d) Lemmatization
28. What is the primary purpose of cross-validation in machine learning?
a) Training a model on all available data
b) Evaluating a model's performance on a separate dataset
c) Dividing data into training and testing sets
d) Visualizing the distribution of data
29. Which technique is used for reducing variance and improving the generalization of an ensemble model?
a) Bagging
b) Boosting
c) Pruning
d) Regularization
30. In a support vector machine (SVM), what is the "kernel trick" used for?
a) Reducing model complexity
b) Adding new features to the dataset
c) Transforming data into a higher-dimensional space
d) Improving convergence of the optimization algorithm
31. What does the term "precision" refer to in binary classification?
a) The ratio of true positives to true negatives
b) The ratio of true positives to the sum of true positives and false positives
c) The ratio of true positives to the sum of true positives and false negatives
d) The ratio of true negatives to the sum of true negatives and false negatives
32. Which technique is used for generating new data samples using a trained model?
a) Clustering
b) Dimensionality reduction
c) Data augmentation
d) Regularization
33. What is the goal of feature scaling in machine learning?
a) Converting categorical features into numerical values
b) Balancing class distribution
c) Scaling numerical features to a similar range
d) Increasing the complexity of the model
34. What is the primary purpose of a confusion matrix in binary classification?
a) Evaluating the model's performance
b) Calculating the mean squared error
c) Identifying the number of features
d) Visualizing the data distribution
35. Which algorithm is used for extracting important features from text data?
a) Principal Component Analysis (PCA)
b) Linear Discriminant Analysis (LDA)
c) K-means clustering
d) Gradient Boosting
36. What does the term "bag of words" represent in natural language processing?
a) A technique for analyzing sentence structure
b) A method for encoding categorical variables
c) A model for sequence generation
d) A representation of text as a collection of word occurrences
37. Which method is used to mitigate the issue of multicollinearity in linear regression?
a) Feature scaling
b) L1 regularization
c) L2 regularization
d) Removing one of the correlated features
38. What is the primary purpose of a learning rate in gradient descent optimization?
a) Balancing the trade-off between bias and variance
b) Adjusting the number of iterations in training
c) Controlling the step size during parameter updates
d) Calculating the regularization term
39. Which technique is used for evaluating the importance of features in a random forest model?
a) Gini impurity
b) Area Under the Curve (AUC)
c) Recursive Feature Elimination (RFE)
d) Mean squared error
40. What is the purpose of the log loss (binary cross-entropy) loss function in classification?
a) Calculating the mean squared error
b) Minimizing the difference between predicted and actual values
c) Penalizing large model coefficients
d) Encouraging confident predictions and penalizing uncertainty
41. In time series forecasting, what is the role of the "lag" parameter?
a) Balancing class distribution
b) Specifying the number of clusters
c) Defining the number of previous time steps to consider
d) Determining the learning rate
42. Which technique is used for representing text data in a continuous vector space?
a) One-hot encoding
b) Word embedding
c) TF-IDF
d) Bag of words
43. What is the purpose of the Hessian matrix in optimization algorithms?
a) Calculating the gradient of the loss function
b) Regularizing model parameters
c) Determining the step size during optimization
d) Improving the convergence of gradient descent
44. Which algorithm is commonly used for sentiment analysis in text data?
a) Linear regression
b) Support Vector Machines (SVM)
c) Naive Bayes
d) Decision trees
45. What is the goal of the Expectation-Maximization (EM) algorithm?
a) Calculating the mean squared error
b) Training deep neural networks
c) Clustering data into groups
d) Optimizing hyperparameters
46. Which method is used for reducing variance in a model by averaging multiple instances of it?
a) Regularization
b) Ensemble learning
c) Feature scaling
d) Dimensionality reduction
47. What is the purpose of the inverted dropout technique in neural networks?
a) Preventing overfitting by dropping out neurons during training
b) Scaling the input features to a similar range
c) Increasing model complexity by adding more layers
d) Introducing non-linearity
48. Which technique is used for finding the optimal number of clusters in K-means clustering?
a) The Elbow method
b) Principal Component Analysis (PCA)
c) The Silhouette score
d) Regularization
49. What is the goal of gradient boosting in ensemble learning?
a) Increasing the variance of individual models
b) Training multiple models in parallel
c) Combining weak learners to create a strong model
d) Reducing the bias of the model
50. Which method is used for reducing the dimensionality of high-dimensional data while preserving its variance?
a) Principal Component Analysis (PCA)
b) K-means clustering
c) Support Vector Machines (SVM)
d) Bagging

Answers:

1. b) Showing the relationship between two numerical variables
2. d) Capturing relevant information while reducing noise
3. a) Regularization
4. d) True positive rate vs. false positive rate
5. a) K-means
6. c) T-test
7. d) Quantifying the difference between two strings
8. d) Support Vector Machines
9. b) Balancing the trade-off between underfitting and overfitting
10. a) Data augmentation
11. a) p-value
12. b) Training multiple models in parallel and averaging their predictions
13. a) Introducing non-linearity
14. c) Quantifying the purity of a node's class distribution
15. c) Grid search
16. b) Penalizing large coefficients
17. c) Dimensionality reduction
18. a) Measuring the similarity between two probability distributions
19. b) Filling missing values with the mean of the feature
20. c) Decoding the most likely sequence of hidden states
21. a) Pruning
22. d) Extracting and understanding information from text
23. c) F1-score
24. a) Converting categorical variables into numerical values
25. c) Principal Component Analysis (PCA)
26. a) Measuring the similarity between two probability distributions
27. c) Stopword removal
28. b) Evaluating a model's performance on a separate dataset
29. a) Bagging
30. c) Transforming data into a higher-dimensional space
31. b) The ratio of true positives to the sum of true positives and false positives
32. c) Data augmentation
33. c) Scaling numerical features to a similar range
34. a) Evaluating the model's performance
35. b) Linear Discriminant Analysis (LDA)
36. d) A representation of text as a collection of word occurrences
37. b) L1 regularization
38. c) Controlling the step size during parameter updates
39. a) Gini impurity
40. d) Encouraging confident predictions and penalizing uncertainty
41. c) Defining the number of previous time steps to consider
42. b) Word embedding
43. c) Determining the step size during optimization
44. c) Naive Bayes
45. c) Clustering data into groups
46. b) Ensemble learning
47. a) Preventing overfitting by dropping out neurons during training
48. a) The Elbow method
49. c) Combining weak learners to create a strong model
50. a) Principal Component Analysis (PCA)
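
Worked example for question 7: the Levenshtein distance counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. The sketch below is one common dynamic-programming way to compute it; the function name `levenshtein` and the sample strings are illustrative choices, not part of the quiz.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits that turn a into b."""
    # prev[j] = distance between the processed prefix of a and the first j chars of b.
    # Before any character of a is read, that is just j insertions.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        # Turning the first i chars of a into the empty string takes i deletions.
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb (free if equal)
            ))
        prev = curr
    return prev[-1]


print(levenshtein("kitten", "sitting"))  # prints 3
```

The three edits here are k -> s, e -> i, and appending g. Keeping only two rows of the table at a time holds memory proportional to the length of the second string.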
