Midterm 2 S24

The document is an exam for S24 Midterm 2 consisting of multiple choice questions and programming problems, totaling 65 points. It covers topics such as functions, DataFrames, regular expressions, statistical tests, binary trees, and clustering algorithms. The exam includes both theoretical questions and practical coding tasks related to class definitions and distance calculations.

Uploaded by

lujainm
S24 Midterm 2 - version 1 (65 pts total) Name:_______________________

Part I: Multiple Choice (40 pts)

Earlier material (12 pts)

1. What can this function possibly return, given different possible arguments?

def is_positive(a):
    if a > 0:
        return True
    elif a < 0:
        return False

A. The function may return True or False, depending on a.


B. The function always returns the tuple (True, False).
C. The function may return True, False, or None, depending on a.
D. The function returns either (True, None) or (False, None), depending on a.
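
For review, here is a minimal sketch that calls the function from question 1 on a positive, a negative, and a zero argument. The sample values 5, -5, and 0 are chosen only for illustration; the point to notice is what happens when neither branch runs.

```python
def is_positive(a):
    if a > 0:
        return True
    elif a < 0:
        return False
    # Neither branch handles a == 0, so execution falls off the end of
    # the function, and Python returns None implicitly in that case.


# Illustrative arguments (not part of the exam): one per possible path.
results = {a: is_positive(a) for a in (5, -5, 0)}
print(results)
```

A function with no `return` on some path still returns a value on that path; it is simply `None`.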

2. What does the following code print?


total = 0
for i in range(3):
    for j in range(3):
        total += j
print(total)

A. 3
B. 9
C. 18
D.
0
1
3
3
4
6
6
7
9
3. Which statement about a DataFrame is false?

A. The head() method displays the first few lines of a DataFrame.


B. A single column of a DataFrame has the type Series.
C. All cells in a DataFrame must share the same type.
D. Assuming 'val' is the name of a column of integers in the DataFrame df,
df['val'] > 0 evaluates to a Series of boolean values that are True wherever
column 'val' was positive.

Since the last midterm (28 pts)

4. Which regular expression operator means "at least one of the preceding symbol"?
A. +
B. *
C. ?
D. _
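
For review, the three quantifiers among these options can be compared with Python's standard `re` module. This is a small sketch; the pattern `ab…c` and the sample strings are made up for illustration.

```python
import re

# Each pattern differs only in the quantifier applied to 'b'.
pattern_plus = re.compile(r"ab+c")  # '+' : one or more 'b'
pattern_star = re.compile(r"ab*c")  # '*' : zero or more 'b'
pattern_opt = re.compile(r"ab?c")   # '?' : zero or one 'b'

# Illustrative inputs with zero, one, and three 'b' characters.
samples = ["ac", "abc", "abbbc"]

plus_matches = [s for s in samples if pattern_plus.fullmatch(s)]
star_matches = [s for s in samples if pattern_star.fullmatch(s)]
opt_matches = [s for s in samples if pattern_opt.fullmatch(s)]

print(plus_matches, star_matches, opt_matches)
```

The underscore has no special meaning in a regular expression; it matches a literal underscore character.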

5. We have a dataset comparing 100-meter-dash times for competitors who use
protein shakes (group A) and those who never drink them (group B). Using a t-test, we
get a p-value of 0.3. What can we conclude?

A. We reject the null hypothesis that the two groups have the same mean time.
B. We can't reject the null hypothesis that the two groups have the same mean time.
C. We reject the null hypothesis that the two groups are different.
D. Nothing, because we should have used a chi-square test to check whether these
groups are different.

6. Under what circumstances would we be most likely to use the "with" keyword?

A. Opening a file for reading.


B. Scanning a column of a DataFrame for a particular value.
C. Getting input from the user with input().
D. Beginning a while loop.

7. What does it mean to "override" a method?

A. It means removing an inherited method from the class, so it can't be called.


B. It means providing a different definition for an inherited method.
C. It means to call a method while ignoring any underscores that suggest it should not
be called.
D. It means providing a method of the same name as an existing method in the class,
but with a different number of arguments from the existing definition.

8. What is the shape of the binary tree created by calling
mystery(mystery(BinaryTree(1)))?

class BinaryTree:
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None

def mystery(tree):
    if tree.left:
        mystery(tree.left)
    else:
        tree.left = BinaryTree(1)
    if tree.right:
        mystery(tree.right)
    else:
        tree.right = BinaryTree(1)
    return tree

A. It has 3 nodes - the root and its two children.


B. It has 7 nodes - the root, its two children, and its four grandchildren.
C. It has 3 nodes - the root, its left child, and that node's left child.
D. This code runs forever and creates an infinite tree.
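
For review, the tree in question 8 can be traced by counting nodes after each call. This sketch reads the conditions in `mystery` as tests on `tree.left` and `tree.right`; `count_nodes` is a hypothetical helper added only for this check and is not part of the exam.

```python
class BinaryTree:
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None


def mystery(tree):
    # Recurse into a child if it exists; otherwise create it.
    if tree.left:
        mystery(tree.left)
    else:
        tree.left = BinaryTree(1)
    if tree.right:
        mystery(tree.right)
    else:
        tree.right = BinaryTree(1)
    return tree


def count_nodes(tree):
    # Hypothetical helper: total number of nodes in the tree.
    if tree is None:
        return 0
    return 1 + count_nodes(tree.left) + count_nodes(tree.right)


root = BinaryTree(1)
after_one_call = count_nodes(mystery(root))   # first call adds two children
after_two_calls = count_nodes(mystery(root))  # second call extends each child
print(after_one_call, after_two_calls)
```

Each call adds one new level of children below the current leaves, so the recursion always terminates.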

9. In a train/test split performed for supervised machine learning, how are the examples
typically divided between the train and test sets?

A. Training set gets the "harder" problems, test set gets the "easier" problems.
B. Training set gets the "easier" problems, test set gets the "harder" problems.
C. Training set gets the beginning and middle of the data, and test set gets the end of
the data.
D. The test set is randomly sampled from all the data.

10. In a typical embedding used in natural language processing, where vectors
represent word meanings, which of these is the most reliable signal that two words
have a similar meaning?

A. Sine of their angle is close to 1.
B. Cosine of their angle is close to 1.
C. Vectors have similar length.
D. Negative sine of their angle is close to 1.
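
For review, the cosine of the angle between two vectors can be computed from their dot product and lengths. This is a sketch using only the standard library; the two-dimensional vectors below are made-up toy values, not real word embeddings.

```python
import math


def cosine_similarity(u, v):
    # cos(angle) = (u . v) / (|u| * |v|)
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy vectors: same direction but different lengths,
# and a third vector perpendicular to the first.
vec_a = (1.0, 2.0)
vec_b = (2.0, 4.0)   # points the same way as vec_a, twice as long
vec_c = (2.0, -1.0)  # perpendicular to vec_a

same_direction = cosine_similarity(vec_a, vec_b)
perpendicular = cosine_similarity(vec_a, vec_c)
print(same_direction, perpendicular)
```

Because the lengths are divided out, the cosine depends only on direction: `vec_a` and `vec_b` score close to 1 even though their lengths differ.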
Part II: Programming (25 points)

Problem 1 (15 points)

Define a class for Cluster objects with the following methods:


[constructor] - Just needs to take a list of points with (x,y) tuple coordinates, and
store them in an attribute named points.

centroid(): Return, as a two-element tuple, the average of all stored x
coordinates and the average of all stored y coordinates.

merge(): Take as an argument another Cluster object, and update the current
cluster to include both sets of points. (Returns None.)

You should use what you know and the problem description to determine an
appropriate name for the constructor and what arguments these methods need.
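
For review, one possible way to satisfy the description above is sketched here. It is not an official solution. The parameter name `other` and the decision to copy the incoming list are assumptions, and an empty cluster is not handled (`centroid()` would divide by zero).

```python
class Cluster:
    def __init__(self, points):
        # Copy the list so later changes by the caller do not affect us
        # (an assumption; the problem only asks that the points be stored).
        self.points = list(points)

    def centroid(self):
        # Average the x coordinates and the y coordinates separately.
        n = len(self.points)
        mean_x = sum(x for x, _ in self.points) / n
        mean_y = sum(y for _, y in self.points) / n
        return (mean_x, mean_y)

    def merge(self, other):
        # Absorb the other cluster's points; implicitly returns None.
        self.points.extend(other.points)


# Illustrative usage with made-up points.
c1 = Cluster([(0, 0), (2, 4)])
c2 = Cluster([(4, 2)])
before = c1.centroid()
merge_result = c1.merge(c2)
after = c1.centroid()
print(before, merge_result, after)
```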
Problem 2 - (10 points)

Assume your Cluster implementation from the first problem works. Write a function
assign(point, c1, c2) that takes a point (an (x,y) tuple) and adds it to either the
points list of cluster c1 or the points list of cluster c2, whichever has a closer centroid.
(Return None.) Assume you have access to a function my_dist(p1, p2) that can
calculate distances between (x,y) tuples p1 and p2.
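
For review, one possible shape for this function is sketched below. It is not an official solution. The problem says `my_dist` is provided, so the Euclidean version here is only a stand-in, the `Cluster` class is a pared-down placeholder for the Problem 1 answer, and sending ties to `c1` is an assumption the problem does not specify.

```python
import math


def my_dist(p1, p2):
    # Stand-in for the provided function: Euclidean distance (assumed).
    return math.hypot(p1[0] - p2[0], p1[1] - p2[1])


class Cluster:
    # Minimal placeholder for the Problem 1 class.
    def __init__(self, points):
        self.points = list(points)

    def centroid(self):
        n = len(self.points)
        return (sum(x for x, _ in self.points) / n,
                sum(y for _, y in self.points) / n)


def assign(point, c1, c2):
    # Compare the point's distance to each centroid; ties go to c1 (assumed).
    if my_dist(point, c1.centroid()) <= my_dist(point, c2.centroid()):
        c1.points.append(point)
    else:
        c2.points.append(point)


# Illustrative usage with made-up clusters.
near_origin = Cluster([(0, 0), (1, 1)])
far_away = Cluster([(10, 10)])
result = assign((9, 8), near_origin, far_away)
print(result, near_origin.points, far_away.points)
```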

Additional scratch space
