DAO2702 - Sample Exam
INSTRUCTIONS TO STUDENTS
1. Please write your student number only. Do not write your name.
6. Write your answers in the boxes provided after each question (or part of a question).
You should plan your answers to ensure that they fit within the spaces provided. Other than this
cover page and the spaces designated for providing your answers, you may do your “rough
work” anywhere. Whatever you write outside of the answer boxes may be ignored.
Student No :
PART I 63
PART II 37
Total 100
a = '2.5'
b = "1"
print(str(a*2) + str(b))
A. 2.52.51
B. 6
C. 51
D. An error message
a = 3
b = 4
A. True
B. False
C. 3
D. 4
letters = 'AaBbCcDdEeFfGg'
new_str = ''
for letter in letters:
    if letter in 'ABCD':
        continue
    new_str = new_str + letter
    if letter in 'efg':
        break
print(new_str)
A. abcde
B. abcDdEe
C. abcdEe
D. abcdE
while True:
    index = name.find(' ')
    if index == -1:
        break
    name = name[index+1:]
print(name)
A. Kennedy
B. Fitzgerald Kennedy
C. JFK
D. An error message
list_1 = [1, 2]
list_2 = list_1
list_1.insert(1, '0')
list_2 += list_1
print(list_2)
print(sentence[-8:-3])
A. analyt
B. nalyt
C. analyti
D. nalyti
list1 = [1, 2, 3]
list2, list3 = list1, list1[1:]
list2.pop()
print(list2+list3)
A. [1, 2, 2]
B. [1, 2, 2, 3]
C. [1, 2, 3, 2, 3]
D. [1, 2, 3, 1, 2, 3]
a = a + b
b = a - b
a = a - b
A. a, b = 2.5, 3.9
a, b = b, a
B. a, b = 2.5, 3.9
a, b = a + b, a - b
C. a, b = 2.5, 3.9
a, b = a - b, a - b
D. a, b = 2.5, 3.9
b, a = a, a - b
10. Given a list “var” of variance measures, which of the following code snippets correctly
transforms the variances to standard deviations?
D. import numpy as np
A. 34
B. 35
C. 45
D. An error message
12. The function “pie” from “matplotlib.pyplot” is used to draw pie graphs.
import matplotlib.pyplot as plt
help(plt.pie)
import numpy as np
A. 0
B. array([7, 0])
C. array([6, 0])
D. array([7, 3])
14. Consider a regression model $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x + \hat{\beta}_2 x^2 + \hat{\beta}_3 x^3 + \hat{\beta}_4 x^4$, where the model
import numpy as np
Given an array "x" containing five different values for the variable x,
x = np.arange(5)
which of the following code snippets correctly calculates the predicted value $\hat{y}$ for each
x value? The results should also be given in an array "y".
A. y = 0
   for n in range(5):
       y = y + b[n]*x**n
B. y = 0
   for n in range(5):
       y = y + b[n]*x[n]**n
C. y = b * x**np.arange(5)
D. y = sum(b * x**np.arange(5))
15. The table below shows, for credit card holders with one to three cards, the joint
probabilities for the number of cards owned (X) and the number of credit purchases
made in a week (Y).
import numpy as np
Which of the following code snippets correctly calculates the variance of "the number of
purchases in a week"? The variance is represented by the variable var.
A. Y = np.arange(5)
mean = (probs*Y).sum(axis=0)
var = (probs.sum(axis=0)*(Y - mean)**2).sum()
B. Y = np.arange(5)
mean = (probs.sum(axis=0)*Y).sum()
var = (probs.sum(axis=0)*(Y - mean)**2).sum()
C. Y = np.arange(5)
mean = (probs.sum(axis=0)*Y).sum()
var = (probs.sum()*(Y - mean)**2).sum(axis=0)
D. Y = np.arange(5)
mean = (probs.sum(axis=1)*Y).sum()
var = (probs.sum(axis=1)*(Y - mean)**2).sum(axis=0)
17. Suppose that the amount of a soft drink that goes into a typical 12-ounce can varies
from can to can. It follows a normal distribution with a mean of 12 ounces and a
fixed standard deviation of 0.05 ounce. How do you calculate the probability that
the amount of the soft drink is between 11.92 ounces and 12.12 ounces?
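As a minimal sketch of the calculation this question describes, the probability can be written as the difference of two normal CDF values. The sketch uses the standard-library `statistics.NormalDist` as a stand-in for `scipy.stats.norm` (where the corresponding call would be `norm.cdf(x, mu, sigma)`); the variable names are illustrative and are not taken from the exam.

```python
from statistics import NormalDist

mu, sigma = 12, 0.05          # mean and standard deviation stated in the question
fill = NormalDist(mu, sigma)  # hypothetical name for the fill-amount distribution

# P(11.92 <= X <= 12.12) = F(12.12) - F(11.92)
prob = fill.cdf(12.12) - fill.cdf(11.92)
print(round(prob, 4))
```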
The sample size is “n”, and the sample mean is "x_bar". The population standard
deviation is known to be "sigma". How do you calculate the lower bound of the
confidence interval for the population mean?
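When the population standard deviation is known, the interval is usually taken as x_bar ± z·sigma/√n, where z is the standard normal quantile at 1 − α/2. The sketch below is only an illustration: the input values and the significance level alpha = 0.05 are assumptions, since the surviving question text does not state them, and `NormalDist().inv_cdf` from the standard library plays the role of `norm.ppf`.

```python
from statistics import NormalDist

# Hypothetical inputs for illustration only; none of these values come from the exam.
n, x_bar, sigma = 36, 50.0, 6.0
alpha = 0.05  # assumed significance level (95% confidence)

z = NormalDist().inv_cdf(1 - alpha / 2)  # same role as norm.ppf(1 - alpha/2)
lower = x_bar - z * sigma / n**0.5       # lower bound of the interval
upper = x_bar + z * sigma / n**0.5       # upper bound of the interval
print(lower, upper)
```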
Which of the following code snippets gives the median of the attribute “wage”?
A. wage_summary = wage_data.describe()
print(wage_summary['wage']['median'])
B. wage_summary = wage_data.describe()
print(wage_summary['median']['wage'])
C. wage_summary = wage_data.describe()
print((wage_summary['wage']['min']+wage_summary['wage']['max'])/2)
D. wage_summary = wage_data.describe()
print((wage_summary['wage']['50%']))
20. Which of the following models can be viewed as a linear regression model?
(1) $\hat{y} = \beta_0 + \beta_1 x$
(2) $\sqrt{\hat{y}} = \frac{\beta_1}{x^2} + \beta_2 x$
(3) $\hat{y} = \beta_0 + \beta_1^{x}$
(4) $\log \hat{y} = \beta_0 + \beta_1 \log x$
A. (1), (2), and (3)
B. (1), (2), and (4)
C. (1) and (2)
D. Only (1)
We toss a die 65 times, and the results are given in the following table.
5 marks
$E[X] = \frac{12}{65}\times 1 + \frac{7}{65}\times 2 + \frac{11}{65}\times 3 + \frac{9}{65}\times 4 + \frac{15}{65}\times 5 + \frac{11}{65}\times 6 \approx 3.6308$
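The weighted average above can be cross-checked with a few lines of Python. This sketch assumes that 12, 7, 11, 9, 15, 11 are the observed frequencies of faces 1 to 6, as the numerators of the formula suggest; the variable names are illustrative.

```python
faces = [1, 2, 3, 4, 5, 6]
counts = [12, 7, 11, 9, 15, 11]  # assumed frequencies of faces 1..6
n = sum(counts)                  # total number of tosses

# Frequency-weighted average of the face values
mean = sum(f * c for f, c in zip(faces, counts)) / n
print(n, round(mean, 4))
```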
(b) What is the sample proportion of the event that the tossed number is 4?
5 marks
$\hat{p} = \frac{9}{65} \approx 0.138$
(c) Write down the Python code to calculate the confidence interval for the population
proportion of "4" as the tossed number.
8 marks
# Use ps to denote the sample proportion of 4
# Use lower to represent the lower bound
# Use upper to represent the upper bound
from scipy.stats import norm

n = 65
ps = 9/n    # sample proportion from part (b)
alpha = 0.05
lower = ps-norm.ppf(1-alpha/2)*(ps*(1-ps)/n)**0.5
upper = ps+norm.ppf(1-alpha/2)*(ps*(1-ps)/n)**0.5
(d) Based on the given data, can we conclude that the proportion of having "4" as the
tossed number is different from 1/6? Answer this question using hypothesis
testing (P-value) at the significance level α = 0.05.
5 marks
$H_0: p = 1/6$
$H_a: p \neq 1/6$
from scipy.stats import norm

n = 65
ps = 9/n    # sample proportion from part (b)
z_value = (ps - (1/6)) / ((1/6)*(1-1/6)/n)**0.5
p_value = 2*(1 - norm.cdf(abs(z_value)))
What is the interpretation of the P-value (how the P-value affects your
conclusion)?
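A common way to use the P-value is to compare it with α: H0 is rejected when the P-value is below α, and otherwise the data are taken as not providing enough evidence against H0. The sketch below repeats the calculation from part (d) in self-contained form. It assumes ps = 9/65 from part (b) and uses the standard-library `statistics.NormalDist` in place of `scipy.stats.norm`; the names `p0` and `reject_h0` are illustrative.

```python
from statistics import NormalDist

n, alpha = 65, 0.05
ps = 9 / n   # sample proportion of "4", from part (b)
p0 = 1 / 6   # proportion under the null hypothesis

# Test statistic uses the standard error under H0
z_value = (ps - p0) / (p0 * (1 - p0) / n) ** 0.5
p_value = 2 * (1 - NormalDist().cdf(abs(z_value)))  # two-sided P-value

reject_h0 = p_value < alpha  # decision rule at significance level alpha
print(round(z_value, 3), round(p_value, 3), reject_h0)
```

With these data the P-value is roughly 0.54, well above 0.05, so under this decision rule H0 is not rejected.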