
CS 188

Fall 2022

Introduction to Artificial Intelligence

Midterm

Q1. [22 pts] Potpourri

(a) Evgeny has an arbitrary search problem. Assume all costs are positive.

(i) [1 pt] If he runs uniform cost graph search on the problem, he is guaranteed to get the optimal solution. O  True     O  False

(ii) [1 pt] Jason takes the problem and designs a heuristic function h such that for all states s, h(s) > 0. If he runs greedy graph search on the problem using the heuristic, he is guaranteed to get the optimal solution. O  True     O  False

(iii) [1 pt] Regardless of your answer for the previous parts, suppose both uniform cost graph search and greedy graph search return the same optimal path. Jason claims that using his heuristic, A* graph search is guaranteed to return the optimal path for Evgeny’s search problem. Is he correct?

O  Yes, because the priority value for A* search on a given state is the sum of priority values for UCS and greedy search.

O  Yes, but not for the reason above.

O  No, because if the heuristic is not consistent, then A* graph search is not guaranteed to be optimal.

O  No, but not for the reason above.
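As background (not part of the exam statement), the relationship referenced in the first “Yes” option can be written down directly: UCS orders its fringe by the backward cost g(n), greedy search by the heuristic h(n), and A* by f(n) = g(n) + h(n). A minimal sketch, with hypothetical cost values:

```python
# Fringe priorities for the three searches. g = backward cost so far,
# h = heuristic estimate of cost to the goal.
def ucs_priority(g, h):
    return g          # UCS ignores the heuristic

def greedy_priority(g, h):
    return h          # greedy ignores the backward cost

def astar_priority(g, h):
    return g + h      # f(n) = g(n) + h(n)

# Example with hypothetical costs g = 3, h = 4: A*'s priority is the sum
# of the UCS and greedy priorities.
assert astar_priority(3, 4) == ucs_priority(3, 4) + greedy_priority(3, 4)
```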

(b) [2 pts] In a CSP, what could happen to the domain of an arbitrary variable after enforcing arc consistency? Select all that apply.

□ The domain could remain unchanged.

□ The domain size could be smaller than before (but not empty).

□ The domain could be empty.

O  None of the above

(c) Consider a general CSP with n variables of domain size d . We would like to use a search tree to solve the CSP, where all the complete assignments (and thus all the solutions) are leaf nodes at depth n.

(i) [2 pts] How many possible complete assignments are there for this CSP?

O  O(n^2 d^3)

O  O(nd)

O  O(n^2)

O  O(n^2 d)

O  O(n^d)

O  O(d^n)

O  O(n! d^n)

O  None of the above

(ii) [2 pts] How many leaves does the search tree have?

O  O(n^2 ⋅ d^n)

O  O(d! ⋅ n^d)

O  O(d^2 ⋅ n^d)

O  Same as the answer to the previous subpart

O  None of the above

(d) Suppose you are playing a zero-sum game in which the opponent plays optimally. There are no chance nodes in the game tree.

(i) [1 pt] If you want to maximize your utility when playing against this opponent, which algorithm would be the best fit? Assume all the algorithms don’t have a limit on search depth.

O  Minimax

O  Expectimax

O  Monte Carlo Tree Search

O  Not enough information

(ii) [2 pts] Select all true statements about pruning game trees.

□ Using alpha-beta pruning allows you to choose an action with a higher value compared to not using alpha-beta pruning.

□ Alpha-beta pruning ensures that all the nodes in the game tree have their correct values.

□ Alpha-beta pruning will always prune at least one node/leaf in a minimax game tree.

□ Alpha-beta pruning does not work on expectimax game trees with unbounded values.

O  None of the above

(e) [2 pts] Select all true statements about alpha-beta pruning.

□ Alpha-beta pruning affects the action selected at the root node.

□ The order in which nodes are pruned affects the number of nodes explored.

□ The order in which nodes are pruned affects the action selected at the root node.

□ In most cases, pruning reduces the number of nodes explored.

O  None of the above

(f) [2 pts] Select all true statements about minimax, minimax with alpha-beta pruning, and Monte Carlo tree search (MCTS).

□ Among the three search algorithms, only MCTS is a sample-based search algorithm.

□ Neither minimax with alpha-beta pruning nor MCTS will ever search the tree exhaustively (i.e. visit every node in the search tree).

□ Minimax is an exhaustive search algorithm (i.e. it visits every node in the search tree).

□ When the game tree is small, minimax can explore fewer nodes than minimax with alpha-beta pruning.

O  None of the above
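For reference, a minimal alpha-beta pruning sketch on an illustrative toy tree (made up for this note, not a tree from this exam):

```python
import math

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
    """Minimax value of `node` with alpha-beta pruning.

    `node` is either a number (leaf utility) or a list of child nodes.
    """
    if isinstance(node, (int, float)):
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:   # remaining children cannot change the answer
                break
        return value
    else:
        value = math.inf
        for child in node:
            value = min(value, alphabeta(child, True, alpha, beta))
            beta = min(beta, value)
            if alpha >= beta:
                break
        return value

# Tiny tree: a max root over two min nodes. While exploring the second min
# node, the leaves 4 and 6 are pruned after seeing the 2.
tree = [[3, 12, 8], [2, 4, 6]]
print(alphabeta(tree, True))  # 3
```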

(g) The following graph defines an MDP with deterministic transitions. Each transition produces a constant reward r = 1. The discount factor is γ = 0.5. The game ends once the state E is reached (there are no actions available from E).

[Figure: MDP transition graph over states A through E.]

Note: The formula for the sum of an infinite geometric series is ∑_{i=0}^{∞} x^i = 1 + x + x^2 + x^3 + ... = 1/(1 − x), for |x| < 1.
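The geometric-series formula in the note can be sanity-checked numerically. With x = γ = 0.5 the series sums to 1/(1 − 0.5) = 2, so an infinite stream of reward 1 discounted at 0.5 is worth 2. (This is a generic check of the formula, not the answer to any subpart, since those depend on the transition graph.)

```python
# Partial sums of sum_{i>=0} x^i approach the closed form 1 / (1 - x)
# for |x| < 1.
x = 0.5
partial = sum(x**i for i in range(200))
closed_form = 1 / (1 - x)
assert abs(partial - closed_form) < 1e-12
```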

(i) [1 pt] What is V*(D), the optimal value at state D?

(ii) [1 pt] What is V*(A), the optimal value at state A?

(iii) [1 pt] How many iteration(s) are needed for policy iteration to converge in the worst case (for any initial policy π_0)? In other words, select the minimum k such that π_k = π* for the worst π_0.

O  3

O  ∞ (never converges in the worst case)

O  None of the above

(h) [2 pts] Select all true statements about reinforcement learning.

□ Direct Evaluation requires knowing the transition function of the underlying MDP.

□ TD Learning itself does not learn the optimal value/policy.

□ The optimal Q-value Q(s, a) is the expected discounted reward for following the optimal policy starting at state s.

□ Q-learning can learn the optimal policy using only transition data from random policies.

O  None of the above
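For reference, the tabular Q-learning update that the last statement refers to, as a minimal sketch (the state and action names here are made up for illustration):

```python
from collections import defaultdict

def q_update(Q, s, a, r, s_next, actions, alpha=0.5, gamma=0.9):
    # Q(s,a) <- (1 - alpha) * Q(s,a) + alpha * [r + gamma * max_a' Q(s',a')]
    sample = r + gamma * max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] = (1 - alpha) * Q[(s, a)] + alpha * sample
    return Q[(s, a)]

Q = defaultdict(float)
# One observed transition (s="A", a="right", r=10, s'="B"). Because the
# update takes a max over next actions rather than using the action the
# behavior policy actually took next, the transitions can come from any
# policy, even a random one -- this is what makes Q-learning off-policy.
q_update(Q, "A", "right", 10, "B", actions=["left", "right"])
print(Q[("A", "right")])  # 5.0
```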

Q2. [17 pts] Haunted House

Mr. and Mrs. Pacman have found themselves inside of a haunted house with H floors.  Each floor is a rectangular grid of dimension M × N . Mr. Pacman and Mrs. Pacman have been separated, and must find each other. There is one staircase per floor, all located on the same square in the M × N grid. At each timestep, Mr. Pacman or Mrs. Pacman can move North, East, South, West, Up, or Down (they can take Up or Down only if they are at a staircase). However, Mr. Pacman can only travel L levels of stairs before he has to rest for W turns. Mr. and Mrs. Pacman alternate turns.

(a) (i) [1 pt] What is the maximum branching factor?

(ii) [2 pts] Assume that all actions have a cost of 1, except for Up and Down, which have a cost of 2. Which of the following search algorithms will be guaranteed to return the solution with the fewest number of actions? Select all that apply.

□ Depth-First Search

□ Breadth-First Search

□ Uniform-Cost Search

O  None of the above

(iii) [6 pts] For a(iii) and a(iv) only, assume there are K knights that Mr. and Mrs. Pacman have to avoid, located throughout the house. Since the knights are wearing heavy armor, they cannot move. Fill in the blanks such that S evaluates to the minimal state space size. Each blank can include numbers (including 0) or variables in the problem statement.

S = A ⋅ 6^B ⋅ M^C ⋅ N^D ⋅ H^E ⋅ K^F

A =

B =

C =

D =

E =

F =

(iv) [2 pts] Assume that all actions have a cost of 1, except for Up and Down, which have a cost of 2. Which of the following search algorithms will be guaranteed to return the solution with the fewest number of actions? Select all that apply.

□ Depth-First Search

□ Breadth-First Search

□ Uniform-Cost Search

O  None of the above

(b) For each modification to the original problem, write the term X such that the new minimal state space size S′ is X ⋅ S.

Each subpart below is independent (the modifications are not combined).

(i) [2 pts] Mr. and Mrs. Pacman discover two elevators that start on floor 1 and floor H respectively. On each turn, the elevators each move one level, one moving up and the other moving down. When an elevator reaches the top or bottom floor, it reverses direction. This motion is repeated for all time steps.

(ii) [2 pts] Mr. and Mrs. Pacman find out that there are indistinguishable ghosts inside the house. There are G ghosts, where G is much smaller than MNH. The ghosts move all at once randomly after each turn, and there can only be one ghost in a single square.

(iii) [2 pts] Same as the previous subpart, but now there are more ghosts. Specifically, G > MNH. (You don’t need this expression to solve this problem.)

Q3. [16 pts] Rocket Science

We are in a spaceship, orbiting around a planet. The actions available in this search problem are listed below:

[Figure: a planet surrounded by concentric orbits, Orbit 0 through Orbit 2, with the available actions (e.g. Decelerate Slow) labeled on the diagram.]

For all subparts, assume that we are performing A* search, and a solution exists from any state. Select whether the provided heuristic is admissible and whether the provided heuristic is consistent, for any choice of k > 0.

(a) Suppose our goal is a specific orbit.

(i) [2 pts] h1 = min # of actions needed to go to the goal orbit if only Slow actions are allowed

□ Admissible     □ Consistent     O  Neither

(ii) [2 pts] h2 = h1 ⋅ (5 + k)

□ Admissible     □ Consistent     O  Neither

(iii) [2 pts] h3 = min[h1 ⋅ (5 + k), m], where m = { … if h1 is odd; … otherwise }

□ Admissible     □ Consistent     O  Neither

(iv) [2 pts] Which of the following heuristics will find the optimal solution the fastest?

O  h1

O  h2

O  h3

O  None of the above

(b) Now, suppose there are many planets in space. Each planet can have a different number of orbits. Our goal is a specific orbit of a specific planet.

Now, the only actions available are {Accelerate Slow, Decelerate Slow, Transition}. The spaceship can only transition to a different planet from the current planet’s outermost orbit, and it will land in the outermost orbit of the destination planet. The Transition action has a cost of 5 + 2k.

(i) [2 pts] h4 =

□ Admissible     □ Consistent     O  Neither

(ii) [2 pts] h5 = h4 ⋅ (min # of actions needed to go to the outermost orbit of the current planet)

□ Admissible     □ Consistent     O  Neither

(iii) [2 pts] h6 = (1 − h4 ) ⋅ (min # of actions needed to go to the goal orbit)

□ Admissible     □ Consistent     O  Neither

(iv) [2 pts] h7 = min(h5, h6)

□ Admissible     □ Consistent     O  Neither

Q4. [13 pts] Golden Bear Years

Pacman will be a freshman at University of Perkeley, so he is planning his 4 years in college. He loves playing video games, so he will only choose courses from the computer science department: C1, C2, …, C8.

In this problem, the variables are the 8 courses, and their domains are the 4 years to take them. The constraints are listed below:

• C1 and C2 should be taken in Year 1.

• C3 should be taken in Year 3.

• C5 should be taken in Year 4.

• C5 and C6 should be taken in the same year.

• C5 and C7 should be taken in adjacent years, but it doesn’t matter which course is taken first.

• C7 and C8 shouldn’t be taken in the same year.

(a) [1 pt] How many binary constraint(s) are there in the above problem?

O  0     O  1     O  2     O  3     O  4     O  5

(b) [8 pts] Pacman has to take all 8 courses in 4 years, so he decides to run backtracking search with arc consistency. Select the values in the domains that will be removed after enforcing unary constraints and arc consistency. If no values are removed from the domain, select “No values removed” (write N).

(1) C1:  □ 1     □ 2     □ 3     □ 4         O  N (No values removed)

(2) C2:  □ 1     □ 2     □ 3     □ 4         O  N (No values removed)

(3) C3:  □ 1     □ 2     □ 3     □ 4         O  N (No values removed)

(4) C4:  □ 1     □ 2     □ 3     □ 4         O  N (No values removed)

(5) C5:  □ 1     □ 2     □ 3     □ 4         O  N (No values removed)

(6) C6:  □ 1     □ 2     □ 3     □ 4         O  N (No values removed)

(7) C7:  □ 1     □ 2     □ 3     □ 4         O  N (No values removed)

(8) C8:  □ 1     □ 2     □ 3     □ 4         O  N (No values removed)

(c) [2 pts] Now, suppose University of Perkeley has provided n total CS courses (n is much larger than 8). Pacman doesn’t worry about graduation, so he can spend d years at the school (d is much larger than 4). After taking the prerequisite courses (C1 to C8), the school has no other constraints on the courses Pacman can take. Now, what will be the time complexity of the AC-3 arc consistency algorithm?

O  The time complexity is O(n^2 d^3) and cannot be further reduced.

O  The time complexity is generally O(n^2 d^3), and can be reduced to O(n^2 d^2).

O  The time complexity can be less than O(n^2 d^2).
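For reference, a minimal AC-3 sketch (the variables and constraints below are illustrative, not from the exam). Each arc (X, Y) takes O(d^2) time to revise and can re-enter the queue at most d times, which is where the usual O(n^2 d^3) bound comes from.

```python
from collections import deque

def ac3(domains, constraints):
    """Minimal AC-3 sketch. `domains` maps variable -> set of values;
    `constraints` maps each directed arc (X, Y) -> predicate allowed(x, y).
    Mutates `domains`; returns False if some domain is emptied."""
    queue = deque(constraints)                 # start with all arcs (X, Y)
    while queue:
        X, Y = queue.popleft()
        allowed = constraints[(X, Y)]
        # Remove values of X that have no supporting value in Y's domain.
        removed = {x for x in domains[X]
                   if not any(allowed(x, y) for y in domains[Y])}
        if removed:
            domains[X] -= removed
            if not domains[X]:
                return False                   # empty domain: no solution
            for (Z, W) in constraints:         # re-check arcs pointing into X
                if W == X:
                    queue.append((Z, W))
    return True

# Example constraint X < Y with domains {1, 2, 3}: X loses 3, Y loses 1.
doms = {"X": {1, 2, 3}, "Y": {1, 2, 3}}
cons = {("X", "Y"): lambda x, y: x < y, ("Y", "X"): lambda y, x: y > x}
ac3(doms, cons)
print(doms)  # {'X': {1, 2}, 'Y': {2, 3}}
```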

(d) [2 pts] Suppose that Pacman is running local search to find a solution to this CSP, and currently has the following variable assignment. For the next iteration, he wants to change the assignment of one variable, which will lead to the fewest number of unsatisfied constraints remaining. Fill in the blank for the variable and the value it should change to.

C1: 3     C5: 4

C2: 2     C6: 1

C3: 4     C7: 2

C4: 4     C8: 4

The variable:

will be assigned to the value:

Q5. [8 pts] Games

(a) Consider the following game tree.

[Figure: game tree rooted at node A, spanning levels 2 through 5.]

(i) [1 pt] What is the minimax value at node A?

(ii) [3 pts] Which branches will be pruned after running minimax search with alpha-beta pruning? For instance, if the edge between node A and node B is pruned, write A − B. If the edge between H and 3 is pruned, write H − 3. List the pruned branches from left to right. If a branch from an upper level is pruned, you don’t have to list the branches below it.

(iii) [2 pts] Suppose all the min nodes in level 4 are changed to max nodes and all the max nodes in level 3 are changed to min nodes. In other words, we have max, min, min, max as levels 1, 2, 3, 4 in the game tree. We then run minimax search with alpha-beta pruning.

Which of the following statements is true?

O  The same set of leaf nodes will be pruned, because this is still a minimax problem and running alpha-beta pruning will result in the same set of leaf nodes to be pruned.

O  A different set of leaf nodes will be pruned.

O No leaf nodes can be pruned, because pruning is only possible when the minimizer and maximizer alternate at each level.

O  None of the above

(iv) [2 pts] Now, consider another game tree which has the same structure as the original game tree shown above. This modified game tree can take on any values at the leaf nodes. What is the minimum and the maximum number of leaf nodes that can be pruned after running alpha-beta pruning?

Minimum leaf nodes pruned:                            Maximum leaf nodes pruned:

Q6. [12 pts] Indiana Jones & the Kingdom of the Crystal Skull

Help Indiana Jones find the crystal skull that was hidden somewhere at the old frozen lake!

For all parts of the problem, assume that value iteration begins with all states initialized to zero, and γ = 1.

(a) Suppose that we are performing value iteration on the grid world MDP below. Indiana starts at the bottom-left part of the lake, labeled 1.

Indiana found an ancient manuscript detailing the rules of this MDP. (All of these rules are also described in the transition table below.)

• Indiana can only go up or right, unless the action would cause Indiana to move off the board or into the black square.

• Because the lake is frozen, if Indiana moves up from Square 3, it’s possible (with probability x) that Indiana slips and actually moves two squares up.

• From Squares 1, 2, and 4, Indiana deterministically moves one square in the chosen direction.

• Once Indiana reaches Square 5, the only action available is Exit, which gives reward 100.  (There are no actions available after exiting.)

• Square 2 contains ancient money, so any action that moves Indiana into Square 2 gives reward +5.

• All other transitions give reward −4 if Indiana moves two squares, and −10 if Indiana moves one square.

s     a        s′    T(s, a, s′)    R(s, a, s′)

1     Up       2     1.0            +5

1     Right    3     1.0
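For reference, the value-iteration update used throughout this question, V_{k+1}(s) = max_a Σ_{s′} T(s, a, s′) · [R(s, a, s′) + γ · V_k(s′)], as a minimal sketch on a made-up two-state chain (not the MDP above):

```python
def value_iteration(states, actions, T, gamma=1.0, iterations=50):
    """T maps (s, a) -> list of (s_next, prob, reward) triples."""
    V = {s: 0.0 for s in states}               # all states initialized to zero
    for _ in range(iterations):
        V = {
            s: max(
                (sum(p * (r + gamma * V[s2]) for (s2, p, r) in T[(s, a)])
                 for a in actions if (s, a) in T),
                default=0.0,                   # terminal state: no actions
            )
            for s in states
        }
    return V

# Toy chain: A --go--> B (reward -1), then B --exit--> done (reward 10).
T = {("A", "go"): [("B", 1.0, -1.0)], ("B", "exit"): [("done", 1.0, 10.0)]}
V = value_iteration(["A", "B", "done"], ["go", "exit"], T, gamma=1.0)
print(V)  # {'A': 9.0, 'B': 10.0, 'done': 0.0}
```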