Quiz on Machine Learning, AI, and Statistical Modeling
Statistical Modeling in Python
1. Which Python library is primarily used for statistical modeling?
A) NumPy
B) pandas
C) scikit-learn
D) statsmodels
Answer: D) statsmodels
2. What is the purpose of the fit() method in scikit-learn?
A) Data visualization
B) Model evaluation
C) Model training
D) Data preprocessing
Answer: C) Model training
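As a rough illustration of the fit-then-predict convention, here is a minimal pure-Python sketch. MeanRegressor is a hypothetical toy class written for this quiz, not a scikit-learn estimator; it only imitates the pattern in which fit() learns parameters from the training data and predict() applies them.

```python
class MeanRegressor:
    """Hypothetical toy estimator that always predicts the training-set mean."""

    def fit(self, X, y):
        # "Training" here means learning a single parameter from the targets.
        self.mean_ = sum(y) / len(y)
        return self  # returning self allows chained calls such as fit(...).predict(...)

    def predict(self, X):
        # One prediction per input row, all equal to the learned mean.
        return [self.mean_ for _ in X]


model = MeanRegressor().fit([[1], [2], [3]], [10.0, 20.0, 30.0])
predictions = model.predict([[4], [5]])
print(predictions)
```

Calling predict() before fit() would fail because mean_ does not exist yet, which mirrors the idea that a model must be trained before it is used.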
3. Which statistical model is used for predicting continuous outcomes?
A) Linear Regression
B) Logistic Regression
C) Decision Trees
D) Clustering
Answer: A) Linear Regression
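A minimal sketch of simple linear regression with one feature, using the closed-form least-squares estimates. The function name fit_line and the sample data are assumptions made for illustration; in practice a library such as statsmodels or scikit-learn would normally be used.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations of x and y, and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1 with no noise
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

Because the outcome is a continuous number, the fitted line can return any real value, unlike a classifier that returns a label.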
4. How do you interpret the coefficient of determination (R-squared) in linear regression?
A) Measure of model accuracy
B) Measure of model complexity
C) Measure of goodness of fit
D) Measure of prediction error
Answer: C) Measure of goodness of fit
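To make the goodness-of-fit reading concrete, the sketch below computes R-squared as 1 minus the ratio of residual to total sum of squares. The helper name r_squared and the data are illustrative assumptions.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # unexplained variation
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total variation
    return 1.0 - ss_res / ss_tot


y_true = [3.0, 5.0, 7.0, 9.0]
perfect = r_squared(y_true, y_true)      # predictions match exactly
baseline = r_squared(y_true, [6.0] * 4)  # always predicting the mean of y_true
print(perfect, baseline)
```

A value near 1 means the model explains most of the variation in the outcome, while a model that only predicts the mean scores 0.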
Supervised Learning
5. Which supervised learning algorithm is suitable for binary classification problems?
A) Support Vector Machines (SVM)
B) Random Forest
C) Gradient Boosting
D) All of the above
Answer: D) All of the above
6. What is the primary goal of supervised learning?
A) Predicting continuous outcomes
B) Classifying categorical outcomes
C) Identifying patterns in data
D) Both A and B
Answer: D) Both A and B
7. How do you handle imbalanced datasets in supervised learning?
A) Oversampling the minority class
B) Undersampling the majority class
C) Using class weights
D) All of the above
Answer: D) All of the above
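Two of these options can be sketched in a few lines of plain Python. The function names and toy data are assumptions for illustration. The weighting formula n_samples / (n_classes * class_count) is the common "balanced" heuristic, and the oversampler simply duplicates random minority rows.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * k) for cls, k in counts.items()}


def random_oversample(rows, labels, seed=0):
    """Duplicate random rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [r for r, l in zip(rows, labels) if l == cls]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


labels = [0] * 8 + [1] * 2          # 8 majority samples, 2 minority samples
rows = [[i] for i in range(10)]
weights = balanced_class_weights(labels)
new_rows, new_labels = random_oversample(rows, labels)
print(weights, Counter(new_labels))
```

Oversampling should be applied to the training split only, so that duplicated rows do not leak into the evaluation data.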
8. Which evaluation metric reports the overall fraction of correct predictions in a classification problem?
A) Mean Squared Error (MSE)
B) Accuracy
C) Precision
D) Recall
Answer: B) Accuracy
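The sketch below computes accuracy alongside precision and recall from scratch, on made-up labels. The function name and data are illustrative assumptions. It also hints at why accuracy alone can mislead when classes are imbalanced.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    # Report 0.0 when a ratio is undefined (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


y_true = [0] * 9 + [1]   # one positive among ten samples
y_pred = [0] * 10        # a model that always predicts the negative class
accuracy, precision, recall = classification_metrics(y_true, y_pred)
print(accuracy, precision, recall)
```

Here the always-negative model reaches 90% accuracy while finding none of the positives, which is why precision and recall are usually reported too.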
Unsupervised Machine Learning
9. Which unsupervised learning algorithm is used for clustering?
A) K-Means
B) Hierarchical Clustering
C) DBSCAN
D) All of the above
Answer: D) All of the above
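As one example of how a clustering algorithm works, here is a minimal K-Means sketch on one-dimensional points. The function name, the fixed starting centers, and the data are assumptions chosen to keep the example short. Real implementations handle many dimensions and pick initial centers more carefully.

```python
def kmeans_1d(points, centers, n_iter=10):
    """Alternate between assigning points to the nearest center and updating centers."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(n_iter):
        # Assignment step: each point joins the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster (keep it if empty).
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
print(centers, clusters)
```

No labels are supplied at any point; the two groups emerge only from the distances between points.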
10. What is the primary goal of unsupervised learning?
A) Predicting continuous outcomes
B) Classifying categorical outcomes
C) Identifying patterns in data
D) Reducing dimensionality
Answer: C) Identifying patterns in data
11. Which technique is used for dimensionality reduction?
A) Principal Component Analysis (PCA)
B) t-SNE
C) Autoencoders
D) All of the above
Answer: D) All of the above
12. How do you evaluate the quality of clusters in unsupervised learning?
A) Using silhouette score
B) Using Calinski-Harabasz index
C) Using Davies-Bouldin index
D) All of the above
Answer: D) All of the above
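To show what one of these indices measures, the sketch below computes a mean silhouette score for one-dimensional points. The function name and data are illustrative assumptions. For each point, a is the mean distance to its own cluster, b is the mean distance to the nearest other cluster, and the score is (b - a) / max(a, b).

```python
def silhouette_score_1d(points, labels):
    """Mean silhouette coefficient over all points (absolute difference as distance)."""
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        same = [abs(p - q) for j, (q, l) in enumerate(zip(points, labels))
                if l == lab and j != i]
        if not same:
            scores.append(0.0)  # common convention for single-point clusters
            continue
        a = sum(same) / len(same)
        # Mean distance to each other cluster; keep the closest one.
        b = min(
            sum(abs(p - q) for q, l in zip(points, labels) if l == other)
            / sum(1 for l in labels if l == other)
            for other in set(labels) if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
good = silhouette_score_1d(points, [0, 0, 0, 1, 1, 1])  # well-separated grouping
bad = silhouette_score_1d(points, [0, 1, 0, 1, 0, 1])   # arbitrary alternating grouping
print(good, bad)
```

Scores close to 1 indicate compact, well-separated clusters, while scores near or below 0 suggest that points sit in the wrong cluster.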
Data Collection
13. What is primary data collection?
A) Collecting data from existing sources
B) Collecting data through surveys and experiments
C) Collecting data from social media
D) Collecting data from sensors
Answer: B) Collecting data through surveys and experiments
14. Which of the following is a secondary data source compiled and published by public agencies?
A) Government reports
B) Academic journals
C) Social media
D) Online forums
Answer: A) Government reports
15. What is the advantage of using secondary data?
A) Cost-effectiveness
B) Time-saving
C) Access to large or historical datasets
D) All of the above
Answer: D) All of the above
16. How do you ensure data quality during data collection?
A) Data validation
B) Data cleaning
C) Data transformation
D) All of the above
Answer: D) All of the above
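A minimal sketch of the validation step, assuming hypothetical survey records with "age" and "email" fields. The field names, the rules, and the sample records are invented for illustration; real rules depend on the study design.

```python
def validate_record(record):
    """Return a list of problems found in one record (empty list if it is clean)."""
    errors = []
    age = record.get("age")
    if not isinstance(age, int) or not 0 <= age <= 120:
        errors.append("age must be an integer between 0 and 120")
    email = record.get("email")
    if not isinstance(email, str) or "@" not in email:
        errors.append("email must contain '@'")
    return errors


records = [
    {"age": 34, "email": "a@example.com"},
    {"age": -5, "email": "b@example.com"},   # out-of-range age
    {"age": 51, "email": "not-an-email"},    # malformed email
    {"age": None, "email": None},            # missing values
]
clean = [r for r in records if not validate_record(r)]
rejected = {i: validate_record(r) for i, r in enumerate(records) if validate_record(r)}
print(len(clean), rejected)
```

Keeping the reasons for rejection, rather than silently dropping rows, makes later cleaning and transformation easier to audit.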
Machine Learning and AI
17. Which machine learning algorithm is best suited for image classification?
A) Convolutional Neural Networks (CNN)
B) Recurrent Neural Networks (RNN)
C) Long Short-Term Memory (LSTM)
D) Support Vector Machines (SVM)
Answer: A) Convolutional Neural Networks (CNN)
18. What is the purpose of the activation function in neural networks?
A) To introduce non-linearity
B) To reduce dimensionality
C) To increase complexity
D) To improve accuracy
Answer: A) To introduce non-linearity
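The sketch below illustrates why non-linearity matters, using a hypothetical two-layer "network" with one scalar weight per layer. Without an activation the two layers collapse into a single multiplication, while with ReLU the output is no longer a straight line through the inputs. The function names and weights are assumptions for illustration.

```python
def relu(x):
    """Rectified linear unit: passes positive values, clamps negatives to zero."""
    return max(0.0, x)


def two_layer(x, w1, w2, activation=None):
    """Scalar two-layer network: w2 * activation(w1 * x)."""
    hidden = w1 * x
    if activation is not None:
        hidden = activation(hidden)
    return w2 * hidden


w1, w2 = 2.0, 3.0
inputs = (-1.0, 0.0, 1.0)
linear_outputs = [two_layer(x, w1, w2) for x in inputs]      # same as (w1 * w2) * x
relu_outputs = [two_layer(x, w1, w2, relu) for x in inputs]  # negative inputs are clamped
print(linear_outputs, relu_outputs)
```

However many purely linear layers are stacked, the result stays a single linear map, so the activation is what lets deeper networks represent more complex functions.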
19. Which AI technique is used for natural language processing?
A) Deep learning
B) Machine learning
C) Rule-based systems
D) All of the above
Answer: D) All of the above
20. What is the primary goal of machine learning?
A) Predicting continuous outcomes
B) Classifying categorical outcomes
C) Identifying patterns in data
D) Improving decision-making
Answer: D) Improving decision-making