UW CSE 473 Midterm (sp2014)
Instructions
Please answer clearly and succinctly. If an explanation is requested, think carefully before
writing. Points may be removed for rambling answers. If a question is unclear or ambiguous,
feel free to make the additional assumptions necessary to produce an answer. State these
assumptions clearly; you will be graded on the basis of the assumptions as well as the
subsequent reasoning. On multiple-choice questions, incorrect answers will incur negative
points proportional to the number of choices. For example, a 1-point true/false question will
receive 1 point if correct, -1 if incorrect, and zero if left blank. Only make informed guesses.
1. (1 point) Who are you? Write your name at the top of every page.
3. (1 point each, 14 total) True/False. Circle the correct answer.
(a) T F Iterative deepening search is guaranteed to expand more nodes than breadth-
first search (on any graph whose root is not the goal).
(b) T F A* search with a heuristic that is not completely admissible may still find the
shortest path to the goal state.
(c) T F Consider a finite, acyclic search space where depth-first search is guaranteed
to eventually find a solution and the root is not a goal. In this situation
iterative deepening search will always explore more nodes than depth-first.
(d) T F A pattern database helps an agent avoid wasting time in cycles by storing
previously-expanded states.
(e) T F Random restarts are often used in local search to diminish the problem of
local maxima.
(f) T F Doubling your computer's speed allows you to double the depth of a tree
search given the same amount of time.
(g) T F Every CSP with higher-order constraints can be rewritten as a binary CSP with
the same number of variables.
(h) T F If a binary CSP has a tree-structured constraint graph, we can find a satisfying
assignment (or prove no satisfying assignment exists) in time that is linear in
the number of variables.
(i) T F Backtracking search on CSPs, while generally much faster than general-purpose
search algorithms like A*, still requires exponential time in the worst
case.
(j) T F One reason to use forward checking in a CSP problem is to detect failures
quickly and backtrack earlier.
(k) T F An agent that uses Minimax search, which assumes an enemy behaves
optimally, may well achieve a better score when playing against a suboptimal
enemy than the agent would against an optimal enemy.
(l) T F All other things being equal, value iteration will converge in fewer iterations
when the discount factor, gamma, is smaller.
(m) T F Expectimax search can be used to solve an MDP in a finite horizon setting.
(n) T F The optimal policy for an MDP depends on the MDP’s start state.
e) (4 points) Solve the (reduced, arc-consistent) CSP using backtracking search (without
forward checking). Use the minimum-remaining-values (MRV) variable ordering (breaking
ties in numerical order) and the least-constraining-value (LCV) value ordering (breaking ties
in alphanumeric order). In what order are the first variables assigned, and what values are
they given?
The first variable assigned is _______ and it is given the value _________
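(The specific CSP comes from earlier parts of this question and is not reproduced here. For reference, the following is a minimal sketch of backtracking search with the MRV and LCV orderings described above, assuming the CSP is supplied as variable domains plus binary constraint predicates; every identifier in it is illustrative rather than taken from the exam.)

```python
# Generic backtracking search with MRV variable ordering and LCV value ordering,
# without forward checking. This is a sketch under assumed data structures:
#   domains:     {var: list of candidate values}
#   constraints: {(x, y): predicate(value_of_x, value_of_y)}

def legal_values(var, domains, constraints, assignment):
    """Values of var consistent with the current partial assignment."""
    vals = []
    for val in domains[var]:
        ok_all = True
        for (x, y), ok in constraints.items():
            if x == var and y in assignment and not ok(val, assignment[y]):
                ok_all = False
            if y == var and x in assignment and not ok(assignment[x], val):
                ok_all = False
        if ok_all:
            vals.append(val)
    return vals

def backtrack(domains, constraints, assignment=None):
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(domains):
        return assignment

    # MRV: pick the unassigned variable with the fewest remaining legal values,
    # breaking ties by the variables' own (e.g. numerical) order.
    unassigned = [v for v in domains if v not in assignment]
    var = min(unassigned,
              key=lambda v: (len(legal_values(v, domains, constraints, assignment)), v))

    # LCV: count how many options a value removes from the other unassigned
    # variables; try the least constraining values first, ties by value order.
    def ruled_out(val):
        count = 0
        for (x, y), ok in constraints.items():
            if x == var and y not in assignment:
                count += sum(1 for w in domains[y] if not ok(val, w))
            elif y == var and x not in assignment:
                count += sum(1 for w in domains[x] if not ok(w, val))
        return count

    for val in sorted(legal_values(var, domains, constraints, assignment),
                      key=lambda v: (ruled_out(v), v)):
        assignment[var] = val
        result = backtrack(domains, constraints, assignment)
        if result is not None:
            return result
        del assignment[var]
    return None
```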
6. (9 points) MDPs. Consider a setting where every 6 months Apple decides whether to
release a new version of the iPhone or not. Assume the problem can be represented as an
MDP with states (G = Good, M = Mediocre, B = Bad), each referring to the public sentiment
towards Apple. The actions are R = Release and D = Don't release. Taking an action that
lands in state G (from any other state, including itself) receives reward 2, landing in state M
receives reward 0, and landing in state B receives reward -1. The discount factor (γ) is 1.
The transitions are as shown in the table:

                   To G   To M   To B
From G, Take R      0.1    0.9    0.0
From G, Take D      0.2    0.8    0.0
From M, Take R      0.1    0.9    0.0
From M, Take D      0.0    0.3    0.7
From B, Take R      0.9    0.0    0.1
From B, Take D      0.0    0.5    0.5
a) For this MDP, fill in the blank spaces in the value iteration table (1 point for each Q entry).

                     G       M       B
V0(state)            0       0       0
Q1(state, R)      ____    ____    ____
Q1(state, D)      ____    ____    ____
V1(state)         ____    ____    ____
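(As a study aid rather than part of the exam, the following is a minimal sketch of one round of value iteration on the MDP above, assuming, as stated, that the reward depends only on the state landed in and that V0 is all zeros; the variable names are illustrative.)

```python
# One Bellman backup on the MDP from question 6:
#   Q1(s, a) = sum over s' of T(s, a, s') * (R(s') + gamma * V0(s'))
#   V1(s)    = max over a of Q1(s, a)

states = ["G", "M", "B"]
actions = ["R", "D"]
gamma = 1.0

# T[(s, a)] = {s': probability}, copied from the transition table above.
T = {
    ("G", "R"): {"G": 0.1, "M": 0.9, "B": 0.0},
    ("G", "D"): {"G": 0.2, "M": 0.8, "B": 0.0},
    ("M", "R"): {"G": 0.1, "M": 0.9, "B": 0.0},
    ("M", "D"): {"G": 0.0, "M": 0.3, "B": 0.7},
    ("B", "R"): {"G": 0.9, "M": 0.0, "B": 0.1},
    ("B", "D"): {"G": 0.0, "M": 0.5, "B": 0.5},
}

# Reward for landing in a state: +2 for G, 0 for M, -1 for B.
R = {"G": 2.0, "M": 0.0, "B": -1.0}

V0 = {s: 0.0 for s in states}  # V0(state) = 0, as given in the first row of the table

Q1 = {}
for s in states:
    for a in actions:
        Q1[(s, a)] = sum(p * (R[sp] + gamma * V0[sp]) for sp, p in T[(s, a)].items())

V1 = {s: max(Q1[(s, a)] for a in actions) for s in states}

print("Q1:", Q1)
print("V1:", V1)
```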