Exam - HND

Section A: Multiple Choice Questions (20 marks)

1. What is the primary goal of supervised learning?
○ a) Find patterns in unlabeled data
○ b) Predict output from input data
○ c) Reduce the size of the dataset
○ d) Automate decision-making

2. What is the main characteristic of structured data?
○ a) Lacks a predefined format
○ b) Organized into rows and columns
○ c) Composed of multimedia content
○ d) Cannot be stored in databases

3. Which of the following is NOT a method of data collection?
○ a) Surveys
○ b) Web scraping
○ c) Cloud storage
○ d) Interviews

4. What is the primary goal of exploratory data analysis?
○ a) Model evaluation
○ b) Data visualization and summarization
○ c) Predictive modeling
○ d) Cloud deployment

5. In supervised learning, the model is trained using:
○ a) Only unlabeled data
○ b) Both labeled and unlabeled data
○ c) Only labeled data
○ d) Neither labeled nor unlabeled data

6. Which algorithm is best suited for classification tasks?
○ a) Linear Regression
○ b) K-Nearest Neighbors
○ c) K-Means Clustering
○ d) PCA

7. Which tool is commonly used for querying relational databases?
○ a) Excel
○ b) Python
○ c) SQL
○ d) Tableau

8. In reinforcement learning, what does an agent learn?
○ a) Supervised rules
○ b) Optimal actions through trial and error
○ c) Clustering algorithms
○ d) Training data labels

9. What type of chart is most appropriate for visualizing the relationship between two
continuous variables?
○ a) Bar chart
○ b) Pie chart
○ c) Scatter plot
○ d) Line chart

10. Which cloud platform is widely used for storing large-scale data?
○ a) AWS
○ b) WordPress
○ c) GitHub
○ d) Canva

Section B: Short Answer Questions (30 marks)

1. Define machine learning and briefly explain its types. (3 marks)
2. Explain the difference between primary and secondary data with examples. (3 marks)
3. Describe the steps involved in the data analytics lifecycle. (3 marks)
4. State the differences between structured and unstructured data. (3 marks)
5. What are the steps involved in data preprocessing? Provide examples. (3 marks)
6. What are the advantages of cloud storage for big data? (3 marks)
7. Briefly explain the concept of supervised learning and provide one real-world
example. (3 marks)
8. Describe the difference between regression and classification problems. (3 marks)
9. How do you handle missing data in a dataset? Provide at least two approaches. (3 marks)
10. Why do we use linear regression? Give examples. (3 marks)

Section C: Problem Solving/Case Studies (50 marks)

Case Study 1: Predictive Analytics (25 marks)

You are given a dataset containing features like Study Hours, Attendance, and Past
Performance, and a target column Result (1 for pass, 0 for fail).

1. Explain how you would preprocess this dataset before applying a machine learning
model. (5 marks)
2. Select a suitable algorithm for this task and justify your choice. (5 marks)
3. Write down the steps for splitting the data into training and testing sets and training
the model. (5 marks)
4. Propose metrics to evaluate the model's performance and explain their importance.
(5 marks)
5. Discuss how you could improve the model's accuracy. (5 marks)
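
For orientation only, the split-and-evaluate workflow that questions 3 and 4 refer to can be sketched in plain Python. This is a minimal sketch, not a model answer: the sample rows, the function names `train_test_split` and `classification_metrics`, the 25% test fraction, and the placeholder prediction rule are all assumptions made for illustration. No trained model is involved.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and return (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, and recall for binary labels (1 = pass, 0 = fail)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical rows: (study_hours, attendance_pct, past_score, result)
rows = [
    (2, 60, 45, 0), (8, 95, 80, 1), (5, 75, 60, 1), (1, 40, 30, 0),
    (7, 90, 70, 1), (3, 55, 50, 0), (6, 85, 65, 1), (4, 70, 55, 0),
]
train, test = train_test_split(rows)
y_true = [r[-1] for r in test]
# Placeholder rule standing in for a trained model's predictions.
y_pred = [1 if r[0] >= 5 else 0 for r in test]
metrics = classification_metrics(y_true, y_pred)
```

Fixing the seed keeps the split reproducible, so any change in the metrics reflects the model rather than a different shuffle.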

Case Study 2: Data Clustering (25 marks)

A company wants to segment its customers based on their purchasing behavior. You have
features like Purchase Amount, Frequency, and Age.

1. Describe the purpose of clustering in this scenario. (5 marks)
2. Which algorithm would you use for clustering, and why? (5 marks)
3. Outline the steps to preprocess the data for clustering. (5 marks)
4. How would you determine the optimal number of clusters? (5 marks)
5. Explain how the results of clustering can help the company. (5 marks)
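
For orientation only, the scaling and cluster-count steps that questions 3 and 4 refer to can be sketched in plain Python. This is a minimal sketch under stated assumptions, not a model answer: k-means with min-max scaling is one possible choice among several, and the customer values and the function names `min_max_scale`, `squared_distance`, and `k_means` are hypothetical.

```python
import random


def min_max_scale(points):
    """Rescale every feature column to the [0, 1] range."""
    columns = list(zip(*points))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(p, lows, highs))
        for p in points
    ]


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, iterations=50, seed=0):
    """Basic k-means (Lloyd's algorithm); returns (centroids, inertia)."""
    centroids = random.Random(seed).sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep it if empty.
        new_centroids = [
            tuple(sum(col) / len(c) for col in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    inertia = sum(min(squared_distance(p, c) for c in centroids)
                  for p in points)
    return centroids, inertia


# Hypothetical customers: (purchase_amount, frequency, age)
customers = [
    (120, 2, 23), (150, 3, 25), (900, 12, 41), (950, 14, 45),
    (130, 2, 22), (880, 11, 39), (500, 6, 33), (520, 7, 35),
]
scaled = min_max_scale(customers)
# Inertia for k = 1..4; the "elbow" is where the decrease flattens out.
inertias = {k: k_means(scaled, k)[1] for k in range(1, 5)}
```

Scaling first matters here because Purchase Amount would otherwise dominate the distance calculation purely through its larger numeric range.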
