DAO2702

NATIONAL UNIVERSITY OF SINGAPORE
DAO2702 PROGRAMMING FOR BUSINESS ANALYTICS

(Sample Exam)
Time Allowed: 2 Hours
INSTRUCTIONS TO STUDENTS
1. Please write your student number only. Do not write your name.
2. This assessment paper contains 21 multiple-choice questions and 1 quantitative question,
on 12 printed pages in total.
3. Students are required to answer ALL questions.
4. This is a CLOSED BOOK examination (with authorised materials).
5. The only permitted material is one (1) A4-sized help sheet.
6. Write your answers in the boxes provided after each question (or part of a question).
You should plan your answers to ensure that they fit within the spaces provided. Other than this
cover page and the spaces designated for providing your answers, you may do your “rough
work” anywhere. Whatever you write outside of the answer boxes may be ignored.
Student No :
Question    Max Mark
PART I      63
PART II     37
Total       100
PART I. MULTIPLE-CHOICE QUESTIONS (63 marks)

Each question has exactly one correct answer. Please shade your answer on the
bubble card provided. Each question is worth 3 marks.
1. What is the output message of the following code?
a = '2.5'
b = "1"
print(str(a*2) + str(b))
A. 2.52.51
B. 6
C. 51
D. An error message
2. What is the output message of the following code?
a = 3
b = 4
statement1 = a <= 3 and a > 2
statement2 = b > 2 and b <= 3
print(statement1 != statement2)
A. True
B. False
C. 3
D. 4
3. What is the printed message of the following code?
letters = 'AaBbCcDdEeFfGg'
new_str = ''
for letter in letters:
    if letter in 'ABCD':
        continue
    new_str = new_str + letter
    if letter in 'efg':
        break
print(new_str)
A. abcde
B. abcDdEe
C. abcdEe
D. abcdE
4. What is the printed message of the following code?
name = 'John Fitzgerald Kennedy'
while True:
    index = name.find(' ')
    if index == -1:
        break
    name = name[index+1:]
print(name)
A. Kennedy
B. Fitzgerald Kennedy
C. JFK
D. An error message
5. What is the output message of the following code?
list_1 = [1, 2]
list_2 = list_1
list_1.insert(1, '0')
list_2 += list_1
print(list_2)
A. ['0', 1, 2, '0', 1, 2, '0', 1, 2]
B. [1, '0', 2, 1, '0', 2, 1, '0', 2]
C. [1, '0', 2, 1, 2, 1, 2]
D. [1, '0', 2, 1, '0', 2]
6. What is the output message of the following code?
sentence = "Business analytics"
print(sentence[-8:-3])
A. analyt
B. nalyt
C. analyti
D. nalyti
7. What is the output message of the following code?
list1 = [1, 2, 3]
list2, list3 = list1, list1[1:]
list2.pop()
print(list2+list3)
A. [1, 2, 2]
B. [1, 2, 2, 3]
C. [1, 2, 3, 2, 3]
D. [1, 2, 3, 1, 2, 3]
8. What is the output message of the following code?
list_1 = [True, "2.0", 3.5]
list_2 = list_1 * 2
list_1.remove(3.5)
list_2.remove("2.0")
print(list_1 + list_2)
A. [True, '2.0', True, 3.5, True, '2.0', 3.5]
B. [True, True, True]
C. [True, '2.0', True, 3.5, True, 3.5]
D. [True, '2.0', True, True]
9. The code below
a, b = 2.5, 3.9
a = a + b
b = a - b
a = a - b
is equivalent to which of the following code snippets?
A. a, b = 2.5, 3.9
   a, b = b, a
B. a, b = 2.5, 3.9
   a, b = a + b, a - b
C. a, b = 2.5, 3.9
   a, b = a - b, a - b
D. a, b = 2.5, 3.9
   b, a = a, a - b
10. Given a list "var" of variance measures, which of the following code snippets correctly
transforms the variances into standard deviations?
A. var = [3.5, 6.2, 7.3, 8.5]
   std = var ** 0.5
B. var = [3.5, 6.2, 7.3, 8.5]
   std = []
   for item in var:
       std = std + item ** 0.5
C. var = [3.5, 6.2, 7.3, 8.5]
   std = []
   for item in var:
       std.append([item ** 0.5])
D. import numpy as np
   var = [3.5, 6.2, 7.3, 8.5]
   std = np.array(var) ** 0.5
   std = list(std)
11. What is the output message of the following code?
def sum_two(x, y, z):
    """This is a function"""
    return x+y, y+z, x+z
"""Outside the function"""
x = 1
y = 2
z = 3
result = sum_two(x, y, z)
a, b = result[::2]
print(str(a) + str(b))
A. 34
B. 35
C. 45
D. An error message
12. The function "pie" from "matplotlib.pyplot" is used to draw pie charts.
import matplotlib.pyplot as plt
help(plt.pie)
Help on function pie in module matplotlib.pyplot:
pie(x, explode=None, labels=None, colors=None, autopct=None,
    pctdistance=0.6, shadow=False, labeldistance=1.1, startangle=None,
    radius=None, counterclock=True, wedgeprops=None, textprops=None,
    center=(0, 0), frame=False, rotatelabels=False, *, data=None)
……
Parameters
----------
x : array-like
    The wedge sizes.
explode : array-like, optional, default: None
    If not *None*, is a ``len(x)`` array which specifies the fraction
    of the radius with which to offset each wedge.
labels : list, optional, default: None
    A sequence of strings providing the labels for each wedge
colors : array-like, optional, default: None
    A sequence of matplotlib color args through which the pie chart
    will cycle. If *None*, will use the colors in the currently
    active cycle.
……
Which of the following code snippets correctly draws a pie chart without shadows, where the
wedge sizes are specified by the list "grades", the labels are specified by the list
"labels", and the explode distances are [0.1]*7?
A. grades = [0.02, 0.08, 0.15, 0.25, 0.2, 0.15, 0.15]
   labels = ['A+', 'A', 'A-', 'B+', 'B', 'B-', 'C']
   plt.pie(x=grades, [0.1]*7, shadow=False, labels=labels)
   plt.show()
B. grades = [0.02, 0.08, 0.15, 0.25, 0.2, 0.15, 0.15]
   labels = ['A+', 'A', 'A-', 'B+', 'B', 'B-', 'C']
   plt.pie(labels=labels, grades, [0.1]*7, shadow=False)
   plt.show()
C. grades = [0.02, 0.08, 0.15, 0.25, 0.2, 0.15, 0.15]
   labels = ['A+', 'A', 'A-', 'B+', 'B', 'B-', 'C']
   plt.pie(labels=labels, x=grades, explode=[0.1]*7)
   plt.show()
D. grades = [0.02, 0.08, 0.15, 0.25, 0.2, 0.15, 0.15]
   labels = ['A+', 'A', 'A-', 'B+', 'B', 'B-', 'C']
   plt.pie(grades, [0.1]*7, labels, False)
   plt.show()
13. What is the output message of the following code?
import numpy as np
an_array = np.array([[1, 2],
                     [3, 4]])
result = an_array.sum() - an_array.sum(axis=1)
A. 0
B. array([7, 0])
C. array([6, 0])
D. array([7, 3])
14. Consider a regression model $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x + \hat{\beta}_2 x^2 + \hat{\beta}_3 x^3 + \hat{\beta}_4 x^4$, where the model
coefficients $\hat{\beta}_i$ are given in a numpy array "b".
import numpy as np
b = np.array([20.55, 3.10, 2.71, 0.50, 1.24])
Given an array "x" containing five different values for the variable $x$,
x = np.arange(5)
which of the following code snippets correctly calculates the predicted value $\hat{y}$ for each
x value? The results should also be given in an array "y".
A. y = 0
   for n in range(5):
       y = y + b[n]*x**n
B. y = 0
   for n in range(5):
       y = y + b[n]*x[n]**n
C. y = b * x**np.arange(5)
D. y = sum(b * x**np.arange(5))
15. The table below shows, for credit card holders with one to three cards, the joint
probabilities for the number of cards owned (X) and the number of credit purchases
made in a week (Y).
Such a table is given as an array below.
import numpy as np
probs = np.array([[0.08, 0.13, 0.09, 0.06, 0.03],
                  [0.03, 0.08, 0.08, 0.09, 0.07],
                  [0.01, 0.03, 0.06, 0.08, 0.08]])
Which of the following code snippets correctly calculates the variance of "the number of
purchases in a week"? The variance is represented by the variable var.
A. Y = np.arange(5)
   mean = (probs*Y).sum(axis=0)
   var = (probs.sum(axis=0)*(Y - mean)**2).sum()
B. Y = np.arange(5)
   mean = (probs.sum(axis=0)*Y).sum()
   var = (probs.sum(axis=0)*(Y - mean)**2).sum()
C. Y = np.arange(5)
   var = (probs.sum()*(Y - mean)**2).sum(axis=0)
D. Y = np.arange(5)
   var = (probs.sum(axis=1)*(Y - mean)**2).sum(axis=0)
16. In the case study of "Adult persistence of head-turning asymmetry", it is assumed
that the population proportion of adults turning to the right when they are kissing
is p=0.4. We use the binomial distribution "binom" from "scipy.stats" to
calculate the probability that at least 10 couples in a 20-case sample turn
to the right. Which of the following expressions can be used for the calculation?
A. 1 - binom.ppf(9, 20, 0.4)
B. binom.cdf(10, 20, 0.4)
C. 1 - binom.cdf(9, 20, 0.4)
D. binom.ppf(10, 20, 0.4)
17. Suppose that the amount of a soft drink that goes into a typical 12-ounce can varies
from can to can. It follows a normal distribution with a mean of 12
ounces and a fixed standard deviation of 0.05 ounce. Which of the following code snippets
calculates the probability that the amount of the soft drink is between 11.92 ounces and
12.12 ounces?
A. from scipy.stats import norm
   prob = norm.cdf(11.92, 12, 0.05) - (1-norm.cdf(12.12, 12, 0.05))
B. from scipy.stats import norm
   prob = norm.cdf(12.12, 12, 0.05) - norm.cdf(11.92, 12, 0.05)
C. from scipy.stats import norm
   prob = 2 - norm.cdf(12.12, 12, 0.05) - norm.cdf(11.92, 12, 0.05)
D. from scipy.stats import norm
   prob = 1 - norm.cdf(12.12, 12, 0.05) + norm.cdf(11.92, 12, 0.05)
18. Given the parameters below,
from scipy.stats import norm
alpha = 0.05        # Significance level (the confidence level is 1 - alpha)
sigma = 1           # Population standard deviation
n = 20              # Sample size n
x_bar = 20          # Sample mean
se = sigma / n**0.5
the sample size is "n", the sample mean is "x_bar", and the population standard
deviation is known to be "sigma". Which of the following code snippets calculates the lower
bound of the confidence interval for the population mean?
A. from scipy.stats import norm
   lower = x_bar - norm.ppf(1-alpha/2)*se
B. from scipy.stats import norm
   lower = x_bar - norm.ppf((1-alpha)/2)*se
C. from scipy.stats import norm
   lower = x_bar + norm.ppf(-alpha/2)*se
D. from scipy.stats import norm
   lower = x_bar + norm.ppf((1-alpha)/2)*se
19. Given a data table "wage_data" below,

Which of the following code snippets gives the median of the attribute "wage"?
A. wage_summary = wage_data.describe()
   print(wage_summary['wage']['median'])
B. wage_summary = wage_data.describe()
   print(wage_summary['median']['wage'])
C. wage_summary = wage_data.describe()
   print((wage_summary['wage']['min']+wage_summary['wage']['max'])/2)
D. wage_summary = wage_data.describe()
   print((wage_summary['wage']['50%']))
20. Which of the following models can be viewed as a linear regression model?
(1) $\hat{y} = \beta_0 + \beta_1 x$
(2) $\sqrt{\hat{y}} = \frac{\beta_1}{x^2} + \beta_2 x$
(3) $\hat{y} = \beta_0 + \beta_1^{x}$
(4) $\log \hat{y} = \beta_0 + \beta_1 \log x$
A. (1), (2), and (3)
B. (1), (2), and (4)
C. (1) and (2)
D. Only (1)
21. Which of the following statements about the multiple regression model is correct?
A. If the slope parameter $\beta_j$ is positive, then $x_j$ is positively correlated with the
dependent variable $y$.
B. If $x_j$ is positively correlated with the dependent variable $y$, then the slope
parameter $\beta_j$ is positive.
C. If $\beta_j$ is larger than $\beta_k$, then we can conclude that $x_j$ has a greater impact on the
dependent variable $y$, compared with $x_k$.
D. All of the above are incorrect.
PART II. QUANTITATIVE QUESTIONS (37 marks)
We toss a die 65 times, and the results are given in the following table.
Tossed Number    Number of Appearances
1                12
2                7
3                11
4                9
5                15
6                11
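(a) What is the sample mean of the tossed number? (5 marks)
$E[X] = \frac{12}{65} \times 1 + \frac{7}{65} \times 2 + \frac{11}{65} \times 3 + \frac{9}{65} \times 4 + \frac{15}{65} \times 5 + \frac{11}{65} \times 6 = \frac{236}{65} \approx 3.6308$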
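The weighted sum in part (a) can be cross-checked with a few lines of Python. This is only a minimal sketch: the dictionary name counts and the other variable names are introduced here for illustration and are not part of the exam's notation.

```python
# Number of appearances of each face, copied from the table (65 tosses in total).
counts = {1: 12, 2: 7, 3: 11, 4: 9, 5: 15, 6: 11}

n = sum(counts.values())                              # total number of tosses
total = sum(face * k for face, k in counts.items())   # sum of all tossed numbers
sample_mean = total / n                               # weighted average of the faces

print(n, total, round(sample_mean, 4))
```

Dividing the total of the tossed numbers by the number of tosses gives the same value as weighting each face by its relative frequency.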
(b) What is the sample proportion of the event that the tossed number is 4? (5 marks)
$\hat{p} = \frac{9}{65} \approx 0.138$
(c) Write down the Python code to calculate the confidence interval of the population
proportion of "4" as the tossed number. (8 marks)
# Use ps to denote the sample proportion of 4
# Use lower to represent the lower bound
# Use upper to represent the upper bound
from scipy.stats import norm
n = 65
ps = 9 / n          # Sample proportion from part (b)
alpha = 0.05
lower = ps - norm.ppf(1-alpha/2)*(ps*(1-ps)/n)**0.5
upper = ps + norm.ppf(1-alpha/2)*(ps*(1-ps)/n)**0.5
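One way to sanity-check the interval in part (c) without third-party packages is sketched below. It assumes the same normal-approximation formula, swaps scipy.stats.norm for the standard library's statistics.NormalDist, and wraps the calculation in a helper named proportion_ci, which is a hypothetical name chosen for this illustration.

```python
from statistics import NormalDist


def proportion_ci(successes, n, alpha=0.05):
    """Normal-approximation confidence interval for a population proportion."""
    ps = successes / n                          # sample proportion
    z = NormalDist().inv_cdf(1 - alpha / 2)     # plays the role of norm.ppf(1 - alpha/2)
    margin = z * (ps * (1 - ps) / n) ** 0.5     # critical value times standard error
    return ps - margin, ps + margin


# 9 appearances of "4" in 65 tosses, 95% confidence level (alpha = 0.05).
lower, upper = proportion_ci(9, 65)
print(round(lower, 4), round(upper, 4))
```

The interval is symmetric around the sample proportion 9/65, so the midpoint of lower and upper recovers the answer to part (b).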
(d) Based on the given data, can we conclude that the proportion of having "4" as the
tossed number is different from 1/6? Answer this question using hypothesis
testing (P-value) at the significance level $\alpha = 0.05$.
The null and alternative hypotheses: (5 marks)
$H_0: p = 1/6$
$H_a: p \neq 1/6$
Python code to calculate the P-value: (9 marks)
# Use ps to denote the sample proportion of 4
# Use p_value to represent the P-value
from scipy.stats import norm
n = 65
ps = 9 / n          # Sample proportion from part (b)
z_value = (ps - (1/6)) / ((1/6)*(1-1/6)/n)**0.5
p_value = 2*(1 - norm.cdf(abs(z_value)))
What is the interpretation of the P-value (how the P-value affects your
conclusion)? (5 marks)
If the P-value is larger than alpha, there is insufficient evidence to reject
the null hypothesis at the given significance level.
If the P-value is lower than alpha, we reject the null hypothesis in favor
of the alternative hypothesis, which implies that the proportion of
having "4" as the tossed number is different from 1/6.
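The decision rule described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than a model answer: it uses the standard library's statistics.NormalDist in place of scipy.stats.norm, and the names p0 and decision are introduced here only for the example.

```python
from statistics import NormalDist

n = 65
p0 = 1 / 6          # proportion under the null hypothesis
ps = 9 / n          # sample proportion of "4"
alpha = 0.05        # significance level

# Two-sided z-test for a proportion, using the null value in the standard error.
z_value = (ps - p0) / (p0 * (1 - p0) / n) ** 0.5
p_value = 2 * (1 - NormalDist().cdf(abs(z_value)))

# Compare the P-value with alpha to reach a conclusion.
if p_value < alpha:
    decision = "reject H0"
else:
    decision = "fail to reject H0"

print(round(z_value, 2), round(p_value, 2), decision)
```

With the tabulated data the P-value comes out well above 0.05 (roughly 0.54), so the first branch of the interpretation applies: there is insufficient evidence that the proportion differs from 1/6.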
Total Score for Part II
