Minimax strategies, alpha beta pruning. Lirong Xia

Similar documents
Announcements. CS 188: Artificial Intelligence Fall Adversarial Games. Computing Minimax Values. Evaluation Functions. Recap: Resource Limits

CS 188: Artificial Intelligence Fall Announcements

CS 188: Artificial Intelligence

Announcements. CS 188: Artificial Intelligence Spring Mini-Contest Winners. Today. GamesCrafters. Adversarial Games

CS 4100 // artificial intelligence. Recap/midterm review!

Game playing. Chapter 6. Chapter 6 1

CSE 573: Artificial Intelligence

IS-ZC444: ARTIFICIAL INTELLIGENCE

Algorithms for Playing and Solving games*

CS 188 Introduction to Fall 2007 Artificial Intelligence Midterm

CS221 Practice Midterm

Introduction to Fall 2008 Artificial Intelligence Midterm Exam

Adversarial Search & Logic and Reasoning

Evaluation for Pacman. CS 188: Artificial Intelligence Fall Iterative Deepening. α-β Pruning Example. α-β Pruning Pseudocode.

CS 188: Artificial Intelligence Fall 2008

Final. Introduction to Artificial Intelligence. CS 188 Spring You have approximately 2 hours and 50 minutes.

Mock Exam Künstliche Intelligenz-1. Different problems test different skills and knowledge, so do not get stuck on one problem.

Introduction to Spring 2009 Artificial Intelligence Midterm Exam

CITS4211 Mid-semester test 2011

Scout, NegaScout and Proof-Number Search

Two hours UNIVERSITY OF MANCHESTER SCHOOL OF COMPUTER SCIENCE. Date: Thursday 17th May 2018 Time: 09:45-11:45. Please answer all Questions.

Adversarial Search. Christos Papaloukas, Iosif Angelidis. University of Athens November 2017

Constraint Satisfaction Problems

Finding optimal configurations ( combinatorial optimization)

Administrativia. Assignment 1 due thursday 9/23/2004 BEFORE midnight. Midterm exam 10/07/2003 in class. CS 460, Sessions 8-9 1

Summary. Agenda. Games. Intelligence opponents in game. Expected Value Expected Max Algorithm Minimax Algorithm Alpha-beta Pruning Simultaneous Game

Alpha-Beta Pruning: Algorithm and Analysis

POLYNOMIAL SPACE QSAT. Games. Polynomial space cont d

Scout and NegaScout. Tsan-sheng Hsu.

Introduction to Spring 2009 Artificial Intelligence Midterm Solutions

Introduction to Fall 2011 Artificial Intelligence Final Exam

Final. CS 188 Fall Introduction to Artificial Intelligence

CSE 473: Artificial Intelligence Spring 2014

Summary. Local Search and cycle cut set. Local Search Strategies for CSP. Greedy Local Search. Local Search. and Cycle Cut Set. Strategies for CSP

Introduction to Arti Intelligence

CSC242: Intro to AI. Lecture 7 Games of Imperfect Knowledge & Constraint Satisfaction

Chapter 6 Constraint Satisfaction Problems

CS 188 Fall Introduction to Artificial Intelligence Midterm 1

The exam is closed book, closed calculator, and closed notes except your one-page crib sheet.

The exam is closed book, closed calculator, and closed notes except your one-page crib sheet.

Properties of Forward Pruning in Game-Tree Search

Final. Introduction to Artificial Intelligence. CS 188 Spring You have approximately 2 hours and 50 minutes.

Integer Programming. Wolfram Wiesemann. December 6, 2007

CMSC 474, Game Theory

Final. CS 188 Fall Introduction to Artificial Intelligence

An Analysis of Forward Pruning. to try to understand why programs have been unable to. pruning more eectively. 1

Alpha-Beta Pruning: Algorithm and Analysis

Final exam of ECE 457 Applied Artificial Intelligence for the Spring term 2007.

Alpha-Beta Pruning: Algorithm and Analysis

Space and Nondeterminism

The exam is closed book, closed calculator, and closed notes except your one-page crib sheet.

CS 662 Sample Midterm

Alpha-Beta Pruning Under Partial Orders

Written examination: Solution suggestions TIN175/DIT411, Introduction to Artificial Intelligence

Constraint satisfaction search. Combinatorial optimization search.

Introduction to computation. Lirong Xia

Solution Set 4, Fall 12

Appendix. Mathematical Theorems

This observation leads to the following algorithm.

algorithms Alpha-Beta Pruning and Althöfer s Pathology-Free Negamax Algorithm Algorithms 2012, 5, ; doi: /a

Scheduling with AND/OR Precedence Constraints

Learning in State-Space Reinforcement Learning CIS 32

Midterm. Introduction to Artificial Intelligence. CS 188 Summer You have approximately 2 hours and 50 minutes.

The exam is closed book, closed calculator, and closed notes except your one-page crib sheet.

Name: UW CSE 473 Final Exam, Fall 2014

CSC304 Lecture 5. Game Theory : Zero-Sum Games, The Minimax Theorem. CSC304 - Nisarg Shah 1

Speculative Parallelism in Cilk++

CS 580: Algorithm Design and Analysis. Jeremiah Blocki Purdue University Spring 2018

Andrew/CS ID: Midterm Solutions, Fall 2006

Impartial Games. Lemma: In any finite impartial game G, either Player 1 has a winning strategy, or Player 2 has.

CS 570: Machine Learning Seminar. Fall 2016

Artificial Intelligence

Doctoral Course in Speech Recognition. May 2007 Kjell Elenius

Essentials of the A* Algorithm

23. Cutting planes and branch & bound

Introduction to Combinatorial Game Theory

Introduction to Spring 2006 Artificial Intelligence Practice Final

Learning in Depth-First Search: A Unified Approach to Heuristic Search in Deterministic, Non-Deterministic, Probabilistic, and Game Tree Settings

Exam EDAF May 2011, , Vic1. Thore Husfeldt

Sampling from Bayes Nets

Gestion de la production. Book: Linear Programming, Vasek Chvatal, McGill University, W.H. Freeman and Company, New York, USA

Local search and agents

CS188: Artificial Intelligence, Fall 2009 Written 2: MDPs, RL, and Probability

Deep Reinforcement Learning

Midterm Examination CS540: Introduction to Artificial Intelligence

Linear Programming Duality

Final exam of ECE 457 Applied Artificial Intelligence for the Fall term 2007.

Lecture 23: More PSPACE-Complete, Randomized Complexity

CS221 Practice Midterm #2 Solutions

CMU Lecture 4: Informed Search. Teacher: Gianni A. Di Caro

Rollout-based Game-tree Search Outprunes Traditional Alpha-beta

Solving Zero-Sum Extensive-Form Games. Branislav Bošanský AE4M36MAS, Fall 2013, Lecture 6

CS 347 Parallel and Distributed Data Processing

CS Algorithms and Complexity

MIRA, SVM, k-nn. Lirong Xia

I may not have gone where I intended to go, but I think I have ended up where I needed to be. Douglas Adams

Rutgers Business School, Introduction to Probability, 26:960:575. Homework 1. Prepared by Daniel Pirutinsky. Feburary 1st, 2016

CSL302/612 Artificial Intelligence End-Semester Exam 120 Minutes

FINAL EXAM CHEAT SHEET/STUDY GUIDE. You can use this as a study guide. You will also be able to use it on the Final Exam on

Q-learning Tutorial. CSC411 Geoffrey Roeder. Slides Adapted from lecture: Rich Zemel, Raquel Urtasun, Sanja Fidler, Nitish Srivastava

Transcription:

Minimax strategies, alpha beta pruning Lirong Xia

Reminder ØProject 1 due tonight Makes sure you DO NOT SEE ERROR: Summation of parsed points does not match ØProject 2 due in two weeks 2

How to find good heuristics? ØNo really mechanical way art more than science ØGeneral guideline: relaxing constraints e.g. Pacman can pass through the walls ØMimic what you would do 3

Arc Consistency of a CSP Ø A simple form of propagation makes sure all arcs are consistent: X X X Delete Ø If V loses a value, neighbors of V need to be rechecked! from tail! Ø Arc consistency detects failure earlier than forward checking Ø Can be run as a preprocessor or after each assignment Ø Might be time-consuming 4

Limitations of Arc Consistency ØAfter running arc consistency: Can have one solution left Can have multiple solutions left Can have no solutions left (and not know it) 5

Sum to 2 game Ø Player 1 moves, then player 2, finally player 1 again Ø Move = 0 or 1 Ø Player 1 wins if and only if all moves together sum to 2 Player 1 0 1 Player 2 Player 2 0 1 0 1 Player 1 Player 1 Player 1 Player 1 0 1 0 1 0 1 0 1-1 -1-1 1-1 1 1-1 Player 1 s utility is in the leaves; player 2 s utility is the negative of this

Today s schedule ØAdversarial game ØMinimax search ØAlpha-beta pruning algorithm 7

Adversarial Games Ø Deterministic, zero-sum games: Tic-tac-toe, chess, checkers The MAX player maximizes result The MIN player minimizes result Ø Minimax search: A search tree Players alternate turns Each node has a minimax value: best achievable utility against a rational adversary 8

Computing Minimax Values Ø This is DFS Ø Two recursive functions: max-value maxes the values of successors min-value mins the values of successors Ø Def value (state): If the state is a terminal state: return the state s utility If the agent at the state is MAX: return max-value(state) If the agent at the state is MIN: return min-value(state) Ø Def max-value(state): Initialize max = - For each successor of state: Compute value(successor) Update max accordingly return max Ø Def min-value(state): similar to max-value 9

Minimax Example 3 3 2 2 10

Tic-tac-toe Game Tree 11

Renju 15*15 5 horizontal, vertical, or diagonal in a row win no double-3 or double-4 moves for black otherwise black s winning strategy was computed L. Victor Allis 1994 (PhD thesis) 12

Ø Time complexity? ( m ) Ob Ø Space complexity? Minimax Properties Obm ( ) Ø For chess, Exact solution is completely infeasible b 35, m 100 But, do we need to explore the whole tree? 13

Ø Cannot search to leaves Ø Depth-limited search Resource Limits Instead, search a limited depth of tree Replace terminal utilities with an evaluation function for non-terminal positions Ø Guarantee of optimal play is gone 14

Evaluation Functions Ø Functions which scores non-terminals Ø Ideal function: returns the minimax utility of the position Ø In practice: typically weighted linear sum of features: Ø e.g. Evals s ( ) = w 1 f ( 1 s) + w 2 f ( 2 s) + + w n f ( n s) f ( s ) = (# white queens - # black queens) 1, etc. 15

Minimax with limited depth ØSuppose you are the MAX player ØGiven a depth d and current state ØCompute value(state, d) that reaches depth d at depth d, use a evaluation function to estimate the value if it is non-terminal 16

Improving minimax: pruning 17

Pruning in Minimax Search ØAn ancestor is a MAX node already has an option than my current solution my future solution can only be smaller 18

Alpha-beta pruning Ø Pruning = cutting off parts of the search tree (because you realize you don t need to look at them) When we considered A* we also pruned large parts of the search tree Ø Maintain α = value of the best option for the MAX player along the path so far β = value of the best option for the MIN player along the path so far Initialized to be α = - and β = + Ø Maintain and update α and β for each node α is updated at MAX player s nodes β is updated at MIN player s nodes

Alpha-Beta Pruning Ø General configuration We re computing the MIN-VALUE at n We re looping over n s children n s value estimate is dropping α is the best value that MAX can get at any choice point along the current path If n becomes worse than α, MAX will avoid it, so can stop considering n s other children Define β similarly for MIN α is usually smaller than β Once α >= β, return to the upper layer 20

Alpha-Beta Pruning Example α β is MAX s best alternative here or above is MIN s best alternative here or above 21

Alpha-Beta Pruning Example starting α/ β α = - β = + raising α α = - β = + α = - β = + α = 3 β = + α = 3 β = + lowering β α = - β = + α = - β = 3 α = - β = 3 α = - β = 3 α = 3 β = + α = 3 β = 2 α = 3 β = + α = 3 β = 14 α = 3 β = 5 α = 3 β = 1 raising α α = - β = 3 α = 8 β = 3 α β is MAX s best alternative here or above is MIN s best alternative here or above 22

Alpha-Beta Pseudocode 23

Alpha-Beta Pruning Properties Ø This pruning has no effect on final result at the root Ø Values of intermediate nodes might be wrong! Important: children of the root may have the wrong value Ø Good children ordering improves effectiveness of pruning Ø With perfect ordering : Time complexity drops to O(b m/2 ) Doubles solvable depth! Your action looks smarter: more forward-looking with good evaluation function Full search of, e.g. chess, is still hopeless 24

Project 2 ØQ1: write an evaluation function for (state,action) pairs the evaluation function is for this question only ØQ2: minimax search with arbitrary depth and multiple MIN players (ghosts) evaluation function on states has been implemented for you ØQ3: alpha-beta pruning with arbitrary depth and multiple MIN players (ghosts) 25

Recap ØMinimax search with limited depth evaluation function ØAlpha-beta pruning ØProject 1 due midnight today ØProject 2 due in two weeks 26