
ML June 2024

This document is a question paper for a Machine Learning course with a duration of 3 hours and a maximum mark of 75. It includes compulsory and optional sections covering various topics such as confusion matrices, regression analysis, clustering algorithms, and classification methods. The paper requires students to demonstrate their understanding of machine learning concepts through problem-solving and theoretical questions.
[This question paper contains 12 printed pages.]

Your Roll No. ...............

Sr. No. of Question Paper : 3143 H
Unique Paper Code         : 32347607
Name of the Paper         : Machine Learning
Name of the Course        : B.Sc. (H) Computer Science
                            ADMISSIONS OF 2019, 2020 & 2021
Semester                  : VI
Duration : 3 Hours                          Maximum Marks : 75

Instructions for Candidates

1. Write your Roll No. on the top immediately on receipt of this question paper.
2. Section A is compulsory.
3. Attempt any 4 questions from Section B.
4. Use of scientific calculator is allowed.

Section A (Compulsory)

1. (a) Suppose a scenario where 6000 patients are tested for Covid. Of these, 5000 are actually Covid negative and 1000 are actually Covid positive. For the Covid positive patients, the test gave a positive indication for only 700, and for the Covid negative patients, the test gave a positive indication for 200. Construct a confusion matrix for the above scenario and find the values of True Positive Rate (TPR), False Positive Rate (FPR), Specificity and Sensitivity. (5)

(b) Answer the following:

(i) What is the impact of a small dataset with respect to a large number of features?

(ii) For the given values theta_0 = 0.2, theta_1 = 0.1 and theta_2 = 0.1, predict the values of the dependent variable y for all 3 instances of the independent variables x1 and x2 given in the following data table using linear regression. Also predict the mean squared error.

[Data table missing in source]

(c) Cluster the following set of data objects into two clusters by applying one iteration of the k-means algorithm. Treat objects 2 and 5 as the initial cluster centres. Use Euclidean distance as the distance metric. Determine the updated cluster centre coordinates. (5)

[Table with columns Object Number, X-coordinate, Y-coordinate; values illegible in source]

(d) Differentiate between linear regression and polynomial regression. Derive the gradient descent algorithm to find the unknown parameters in multivariate linear regression.
(5)

(e) How does the PCA (Principal Component Analysis) algorithm help in dimension reduction in machine learning? Write the steps of the PCA algorithm.

(f) What is regularization? Write the equations of the cost function for regularized linear regression and regularized logistic regression. What will be the effect on the model when the regularization parameter is set to zero? (5)

(g) Consider the following dataset with 8 training instances. Use the k-NN algorithm (for k = 3) to determine the 'Result' status for a new test instance with values CGPA = 7.6, Assessment = 60 and Project Points = 7.

[Dataset table missing in source]

Section B

2. (a) Consider two features in a dataset and their possible values as shown below: (4)

+ Income: values (medium, low, high, very high)
+ Status: values (SO, AO, Clerk)

Answer the following questions:

(i) Using the Cartesian product on the above feature set, construct a new feature and generate its list of possible values.

(ii) State one advantage and one disadvantage of the above approach for feature construction.

(b) For the given set of points, identify clusters using complete linkage in agglomerative clustering. Use Euclidean distance to calculate the distance between two points. (6)

[Table with columns Points, X coordinate, ...; values missing in source]

3. (a) Consider the following two-dimensional space with some data points, such that circular points represent positive class points and triangular points represent negative class points, separated by a decision boundary as shown.

[Figure missing in source]

Answer the following questions:

(i) Identify the support vectors (with respect to an SVM classifier applied on the above data).

(ii) Draw the marginal planes (with respect to an SVM classifier applied on the above data).

(iii) Define Marginal Distance in the SVM algorithm.

(b) Construct a neural network for a two-input NOR gate using its truth table. Show the diagram for your generated neural network model with weights. (5)

4. (a) Apply the Naive Bayesian Classifier to predict whether a car is stolen or not with features {Color: RED, Origin: Domestic, Type: SUV} based on the given dataset.
(5)

Color         Type      Origin      Stolen
[illegible]   SPORTS    DOMESTIC    YES
RED           SPORTS    DOMESTIC    NO
RED           SPORTS    DOMESTIC    YES
YELLOW        SPORTS    DOMESTIC    NO
YELLOW        SPORTS    IMPORTED    YES
YELLOW        SUV       IMPORTED    NO
YELLOW        SUV       IMPORTED    YES
YELLOW        SUV       DOMESTIC    NO
RED           SUV       IMPORTED    NO
RED           SPORTS    IMPORTED    YES

(b) Differentiate between the hold-out method, the leave-one-out method and the k-fold method for cross-validation. Which of the above methods has low bias and high variance? Justify.

5. (a) Using the data given below, build a logistic regression model to predict whether a student passes or fails based on exam score, using the gradient descent algorithm. Assume initial values of 0 for the model parameters (thetas) and a learning rate of 0.3. Use one iteration of the gradient descent algorithm to update the model parameters. (6)

[Table with columns Exam Score (x), Pass/Fail (y); values illegible in source]

(b) Using the least squares method, learn the regression coefficients for the data given below. Also predict the value of Y for x = 12 using your learned coefficients. (4)

x     Y
2     21
4     27
6     29
8     4
10    86

6. (a) For given input values of x1 and x2 as 0.3 and 0.5 respectively, determine the values of output nodes y1 and y2. Use bias b1 = 0.5 and b2 = 0.5. Use sigmoid as the activation function for the hidden as well as the output layer.

[Network diagram missing in source]

(b) Explain the effect of the following factors in achieving model convergence with respect to the gradient descent algorithm. (3)

+ Learning rate is too small.
+ Learning rate is too large.

7. (a) Consider the following training data for [number illegible] persons. For binary classification of a person as sick or not sick, create a decision tree model. Show all the steps. (8)

[Training data table missing in source]

(b) Consider the expected and predicted outcomes of a machine learning classifier on a data set containing 7 observations. Calculate the performance of the classifier using the Jaccard Index metric.

Y_expected  : 0, 0, [remaining values illegible]
Y_predicted : 1, 0, 0, 1, 0, [remaining values illegible]
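
As a quick check on the arithmetic asked for in Question 1(a) of Section A, the following Python sketch derives the four confusion-matrix cells from the counts stated in that question and computes the requested rates. The variable names and the `rates` helper are illustrative assumptions and not part of the paper; treating "Covid positive" as the positive class is likewise an assumption.

```python
# Counts taken from the wording of Question 1(a).
actual_positive = 1000   # patients who are actually Covid positive
actual_negative = 5000   # patients who are actually Covid negative
tp = 700                 # actual positives the test flagged as positive
fp = 200                 # actual negatives the test flagged as positive

# The remaining two cells follow by subtraction.
fn = actual_positive - tp   # actual positives the test missed
tn = actual_negative - fp   # actual negatives the test cleared


def rates(tp, fp, fn, tn):
    """Return TPR, FPR, specificity and sensitivity (hypothetical helper)."""
    tpr = tp / (tp + fn)           # true positive rate
    fpr = fp / (fp + tn)           # false positive rate
    specificity = tn / (tn + fp)   # true negative rate, equal to 1 - FPR
    return {
        "TPR": tpr,
        "FPR": fpr,
        "Specificity": specificity,
        "Sensitivity": tpr,        # sensitivity is another name for TPR
    }


metrics = rates(tp, fp, fn, tn)

# Confusion matrix laid out as rows = actual class, columns = predicted class.
print("                 Pred. positive  Pred. negative")
print(f"Actual positive  {tp:>14}  {fn:>14}")
print(f"Actual negative  {fp:>14}  {tn:>14}")
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the cells are TP = 700, FN = 300, FP = 200 and TN = 4800, giving TPR = Sensitivity = 0.70, FPR = 0.04 and Specificity = 0.96.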
