Mid Term Solutions
Enrolment No:
SECTION A
1. Each Question will carry 8 Marks
S. No. Marks CO
Q1 A machine learning professor wants to use the number of hours a student studies (X) to predict the machine learning
final exam score (Y). A regression model is fit based on data collected from a class during the previous
semester, with the following results: Yi = 35.0 + 3Xi. What is the interpretation of the Y-intercept b0 and
slope b1?
Ans. The Y-intercept b0 = 35 indicates that when the student does not study for the final exam (X = 0), the predicted
score is 35. The slope b1 = 3 indicates that for each increase of one hour in studying time, the predicted
change in final exam score is +3.
In a nutshell, the final exam score is predicted to increase by a mean of 3 points for each one-hour
increase in studying time.
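The interpretation can be checked numerically. The following is a minimal sketch, assuming the fitted line Yi = 35.0 + 3Xi from the question; the function name `predicted_score` is only illustrative.

```python
def predicted_score(hours: float, b0: float = 35.0, b1: float = 3.0) -> float:
    """Predicted final exam score for a given number of study hours."""
    return b0 + b1 * hours


# The intercept is the prediction at zero hours of study.
print(predicted_score(0))

# The slope is the change in prediction per extra hour of study.
print(predicted_score(5) - predicted_score(4))
```

The first value is the intercept b0 and the second is the slope b1, whichever pair of consecutive hours is chosen.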
What is machine learning? What are abstraction and generalization in the context of Machine Learning?
2+6 CO1
Ans. Machine learning is a branch of AI and computer science which focuses on the use of
data and algorithms to imitate the way that humans learn, gradually improving its accuracy.
Abstraction: The choice of model is typically not left up to the machine. Instead, the learning
task and data on hand inform model selection. The process of fitting a model to a dataset is
known as training. When the model has been trained, the data is transformed into an abstract
form that summarizes the original information.
Generalization: The term generalization describes the process of turning abstracted knowledge
into a form that can be utilized for future action, on tasks that are similar, but not identical, to
those it has seen before.
Q2 What are the metrics to be considered to evaluate any machine learning algorithm? You may explain this
taking linear regression as the model.
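Ans. For classification models: accuracy, precision, recall, F-score.
For simple linear regression:
R-squared, adjusted R-squared, standard error of the estimate.
Measures of variation: 8 CO1
SST (total sum of squares)
SSR (regression sum of squares)
SSE (error sum of squares)
These are related by SST = SSR + SSE, and R-squared = SSR / SST.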
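As an illustration of these measures, here is a minimal sketch that reuses the four (X, Y) points and the fitted line y = 1 + 1.7x from the last question of this paper. The adjusted R-squared and standard error use the usual textbook definitions for one predictor (n - 2 degrees of freedom), which is an assumption about the convention intended here.

```python
xs = [1, 2, 3, 4]
ys = [3, 4, 6, 8]
a, b = 1.0, 1.7  # intercept and slope of the fitted line y = a + b*x

n = len(xs)
k = 1  # number of predictors in simple linear regression
y_mean = sum(ys) / n
y_hat = [a + b * x for x in xs]  # predicted values

sst = sum((y - y_mean) ** 2 for y in ys)              # total variation
ssr = sum((yh - y_mean) ** 2 for yh in y_hat)         # explained variation
sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))  # unexplained variation

r_squared = ssr / sst
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)
standard_error = (sse / (n - k - 1)) ** 0.5

print(sst, ssr, sse)
print(r_squared, adjusted_r_squared, standard_error)
```

For this data SST is 14.75, SSR is about 14.45 and SSE is about 0.30, so the line explains roughly 98% of the variation in Y.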
Ans. Any appropriate diagram having training, model and evaluation phases can be
considered. 8 CO1
8 CO2
Fig. 1
Refer to Fig. 1 (above). Which type of function is it? What is the intuition behind using this function? Which type
of regression problem can be solved using this function?
Ans. (Hints) It is the sigmoid activation function, used to solve classification problems (logistic
regression). Equation of logistic regression is
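The equation itself is not preserved in this copy. Assuming the standard single-feature form, logistic regression models the probability of the positive class as p = 1 / (1 + e^-(b0 + b1*x)). The intuition is that the sigmoid squashes any real-valued input into the range (0, 1), so the output can be read as a probability. A minimal sketch follows; the coefficient values b0 = -4 and b1 = 1 are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z: float) -> float:
    """Map any real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(x: float, b0: float, b1: float) -> float:
    """Logistic regression with one feature: sigmoid of a linear score."""
    return sigmoid(b0 + b1 * x)


# Hypothetical coefficients, for illustration only.
b0, b1 = -4.0, 1.0
for x in (0, 2, 4, 6, 8):
    p = predict_probability(x, b0, b1)
    label = 1 if p >= 0.5 else 0  # threshold the probability to get a class
    print(x, round(p, 3), label)
```

With these coefficients the decision boundary sits at x = 4, where the linear score is zero and the predicted probability is exactly 0.5.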
SECTION B
1. Each question will carry 15 marks.
Q6 i) Discuss in detail the market basket model. What are its applications? Suppose a database has five transactions
(refer to the table). Compute the support and confidence of the following association rules:
O=>N
O=>K
ii) Give the definition of support and confidence.
Transaction ID Items bought
1 {M,O,N,K,E,Y}
2 {D,O,N,K,E,Y}
3 {M,A,K,E}
4 {M,U,C,K,Y}
5 {C,O,O,K,I,E}
Ans:
One of the most important methods used by major retailers to identify associations between products is market
basket analysis. 15 CO3
It operates by looking for product combinations that regularly appear together in transactions.
To put it another way, it enables businesses to discover connections between the products that customers
purchase.
Association rules are frequently employed to analyse transaction data under the market basket model: they uncover
strong rules in the data using measures of interestingness such as support and confidence.
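For the numerical part, the usual definitions are assumed: the support of a rule A=>B is the fraction of all transactions that contain both A and B, and its confidence is the fraction of transactions containing A that also contain B. A minimal sketch over the five transactions in the table (items are treated as sets, so the repeated O in transaction 5 counts once):

```python
transactions = [
    {"M", "O", "N", "K", "E", "Y"},
    {"D", "O", "N", "K", "E", "Y"},
    {"M", "A", "K", "E"},
    {"M", "U", "C", "K", "Y"},
    {"C", "O", "K", "I", "E"},  # listed as {C,O,O,K,I,E} in the table
]


def support(itemset: set) -> float:
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent: set, consequent: set) -> float:
    """support(antecedent and consequent together) / support(antecedent)."""
    return support(antecedent | consequent) / support(antecedent)


for rhs in ("N", "K"):
    print(f"O=>{rhs}", support({"O", rhs}), confidence({"O"}, {rhs}))
```

O occurs in transactions 1, 2 and 5. O and N occur together in transactions 1 and 2, so O=>N has support 2/5 = 40% and confidence 2/3 (about 67%). O and K occur together in transactions 1, 2 and 5, so O=>K has support 3/5 = 60% and confidence 3/3 = 100%.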
Q7
Two regression equations are shown above. Discuss the types of regression and their significance in real-life
house price prediction problems. What are y, b0, b1, x1…..xn in the above equations, and how are they related to
your housing price prediction problem? 15 CO2
Ans. The first equation is simple linear regression and the second is multiple linear regression. y is the predicted
house price and b0 is the intercept. x1, x2, …, xn are the features used to predict the house price, and b1, b2, …, bn
are the weights (coefficients) of the respective features. The full answer can be developed from this idea.
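The two equations are not preserved in this copy. Assuming they are the standard forms, they would read as follows, using the symbols named in the question:

```latex
% Simple linear regression: one feature
y = b_0 + b_1 x_1

% Multiple linear regression: n features
y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n
```

In a housing problem, y would be the price, x_1 might be a single feature such as floor area in the simple case, and x_1, …, x_n a set of features such as area, number of rooms and age of the house in the multiple case. These feature names are examples only and are not taken from the question.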
Refer to the above three (regression) diagrams a, b, and c. What are the possible values of multiple r(coefficient of correlation) for all the
Ans. Multiple r (coefficient of correlation) is the relationship between dependent and independent variable. It is generally measured in te
In the above diagrams: a) positive correlation, r = +1; b) negative correlation, r = -1; c) no correlation, r = 0.
The table below shows the data (X = independent variable, Y = dependent variable). Use this data
i) to compute r (coefficient of correlation)
ii) to compute a (Y intercept) and b (slope) to fit a regression line Y = a + bX.
X 1 2 3 4
Y 3 4 6 8
Ans. To compute r (coefficient of correlation): The linear correlation coefficient measures the degree of linear relation
between two variables (quantities).
If x & y are the two variables of discussion, then the correlation coefficient can be calculated using the formula.
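The formula is not preserved in this copy. Assuming it is the standard Pearson computational form, it reads as below, followed by the substitution for the four data points in the table (n = 4, Σx = 10, Σy = 21, Σxy = 61, Σx² = 30, Σy² = 125):

```latex
r = \frac{n\sum xy - \sum x \sum y}
         {\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]
                \left[n\sum y^2 - \left(\sum y\right)^2\right]}}

r = \frac{4(61) - (10)(21)}{\sqrt{\left[4(30) - 10^2\right]\left[4(125) - 21^2\right]}}
  = \frac{34}{\sqrt{(20)(59)}}
  \approx 0.99
```

A value this close to +1 indicates a very strong positive linear relationship between X and Y.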
Using the least-squares formulae for a and b, we compute their values, and the final line equation is as follows:
y = 1 + 1.7x
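The result can be verified with a short script. This is a minimal sketch, assuming the standard least-squares formulae b = (nΣxy - ΣxΣy) / (nΣx² - (Σx)²) and a = (Σy - bΣx) / n; the variable names are illustrative.

```python
import math

xs = [1, 2, 3, 4]
ys = [3, 4, 6, 8]
n = len(xs)

sum_x = sum(xs)
sum_y = sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)
sum_y2 = sum(y * y for y in ys)

sxy = n * sum_xy - sum_x * sum_y  # numerator shared by b and r
sxx = n * sum_x2 - sum_x ** 2
syy = n * sum_y2 - sum_y ** 2

b = sxy / sxx                   # slope
a = (sum_y - b * sum_x) / n     # Y intercept
r = sxy / math.sqrt(sxx * syy)  # coefficient of correlation

print(f"a = {a:.2f}, b = {b:.2f}, r = {r:.4f}")
```

The script gives a = 1 and b = 1.7, matching the line above, with r close to 0.99.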