Parallelism and Machine Models


Andrew D. Smith, Faculty of Computer Science, University of New Brunswick, Fredericton

Overview Part 1: The Parallel Computation Thesis Part 2: Parallelism of Arithmetic RAMs

Part 1: The Parallel Computation Thesis

The Random Access Machine (RAM), Cook & Reckhow (1974)

A more realistic model of existing computers: it drops the sequential access of Turing machines but keeps the properties important to complexity theory. Memory consists of an infinite sequence of registers, each capable of holding an arbitrary integer. Associated with the machine is a cost function which assigns a cost (the time required) to each operation. Unless otherwise stated, all operations are assumed to have uniform (unit) cost.

The original instruction set

X_i ← c, c ∈ Z                      literal assignment (always unit cost)
X_i ← X_{X_j} and X_{X_i} ← X_j     indirect addressing (central to Random Access)
IF X_i > 0 GOTO m                   conditional transfer (required for Turing equivalence)
READ X_i, WRITE X_i                 input and output operations
X_i ← X_j + X_k, X_i ← X_j − X_k    addition and subtraction

Benefits of Random Access: indirect addressing is essential for abstract data types such as lists and trees. The RAM is also a cognitive aid, permitting an algorithm designer to visualize the manipulation of abstract structures.
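To make the model concrete, here is a minimal Python sketch of a unit-cost RAM over this instruction set; the tuple encoding of instructions and the zero-initialized register file are illustrative conventions of this sketch, not part of the Cook & Reckhow formalism.

```python
from collections import defaultdict

def run_ram(program, input_tape):
    """Interpret a Cook-Reckhow style RAM program (toy sketch).

    Instructions (op, args...):
      ("LIT", i, c)       X[i] <- c
      ("LOADI", i, j)     X[i] <- X[X[j]]   (indirect read)
      ("STOREI", i, j)    X[X[i]] <- X[j]   (indirect write)
      ("ADD", i, j, k)    X[i] <- X[j] + X[k]
      ("SUB", i, j, k)    X[i] <- X[j] - X[k]
      ("JGTZ", i, m)      if X[i] > 0 goto instruction m
      ("READ", i)         X[i] <- next input value
      ("WRITE", i)        output X[i]
      ("HALT",)
    """
    X = defaultdict(int)           # infinite register file, 0-initialized
    inp = iter(input_tape)
    out, pc, steps = [], 0, 0      # uniform cost: one step per instruction
    while program[pc][0] != "HALT":
        op, *a = program[pc]
        pc += 1
        steps += 1
        if op == "LIT":      X[a[0]] = a[1]
        elif op == "LOADI":  X[a[0]] = X[X[a[1]]]
        elif op == "STOREI": X[X[a[0]]] = X[a[1]]
        elif op == "ADD":    X[a[0]] = X[a[1]] + X[a[2]]
        elif op == "SUB":    X[a[0]] = X[a[1]] - X[a[2]]
        elif op == "JGTZ":   pc = a[1] if X[a[0]] > 0 else pc
        elif op == "READ":   X[a[0]] = next(inp)
        elif op == "WRITE":  out.append(X[a[0]])
    return out, steps

# Example: read two numbers and write their sum.
prog = [("READ", 0), ("READ", 1), ("ADD", 2, 0, 1), ("WRITE", 2), ("HALT",)]
print(run_ram(prog, [3, 4]))       # ([7], 4)
```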

Relating Turing Machines and RAMs

If some Turing machine recognizes A within time T(n) ≥ n, then some RAM (with arbitrary cost function l) recognizes A in time T(n)·l(n). If a set A is recognized by a logarithmic-cost RAM within time T(n) > n, then some multitape Turing machine recognizes A within time T(n)². If a set A is recognized by a uniform-cost RAM within time T(n) > n, then some multitape Turing machine recognizes A within time T(n)³.

In a nutshell: RAMs are safe to use in computational complexity.

The Parallel Random Access Machine, Fortune & Wyllie (1978)

Introduced to model computation by multiple RAMs operating simultaneously on the same data with the goal of solving the same problem. A PRAM has an unbounded set of processors P_0, P_1, ..., each a RAM as defined by Cook & Reckhow, with a slight modification: P_j has an accumulator A_j, and all operations except the new STORE X_i operate on accumulators. A PRAM also has an unbounded global memory X_0, X_1, ..., and each P_j has a local memory as well (details not important).

Start of computation on a PRAM

Input is placed in the input registers and all memory is cleared; the input length is placed in A_0 and P_0 is started.

Another new instruction: FORK. If P_i executes FORK(x), then:
1. The local memory of the first inactive processor P_j is cleared
2. The accumulator of P_j is given the value in the accumulator of P_i
3. P_j starts running at label x of the finite program

Number of processors: PRAM algorithms often assume some number of processors, but the true PRAM model only requires that the processors be countable; their number is restricted by how many can be started.

Conflicts {EREW, CREW, CRCW}:
- concurrent reads allowed (the distinction is irrelevant here)
- concurrent read/write: the write goes first
- a concurrent write causes an immediate halt in a rejecting state
No consideration of implementation details (e.g. communication cost). A toy simulation of these conventions follows.
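As a rough illustration of FORK and the write-conflict rule, here is a toy lock-step PRAM in Python; the instruction names, the round-based scheduling, and the simplification that writes land at the end of a round are assumptions of this sketch, not Fortune & Wyllie's exact definitions.

```python
def run_pram(program, n):
    """Toy lock-step PRAM: each processor is a pair [pc, accumulator].

    Instructions: ("FORK", label), ("STORE", addr), ("LOAD", addr),
    ("ADDC", c), ("HALT",).  A concurrent write to one global cell
    rejects immediately, per the CRCW convention above.
    """
    mem = {}                              # global memory X_0, X_1, ...
    procs = [[0, n]]                      # P_0 starts with |input| in A_0
    while any(program[pc][0] != "HALT" for pc, _ in procs):
        writes, spawned = {}, []
        for p in procs:
            op, *a = program[p[0]]
            if op == "HALT":
                continue
            p[0] += 1
            if op == "FORK":              # new processor, inherits accumulator
                spawned.append([a[0], p[1]])
            elif op == "STORE":
                if a[0] in writes:        # concurrent write: reject
                    return "REJECT"
                writes[a[0]] = p[1]
            elif op == "LOAD":
                p[1] = mem.get(a[0], 0)
            elif op == "ADDC":
                p[1] += a[0]
        mem.update(writes)                # simplification: writes land at round end
        procs.extend(spawned)
    return mem

# P_0 forks a child at label 3; parent and child store to different cells.
prog = [("FORK", 3), ("STORE", 0), ("HALT",),
        ("ADDC", 1), ("STORE", 1), ("HALT",)]
print(run_pram(prog, 5))                  # {0: 5, 1: 6}
```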

∪_k PRAM-TIME(T(n)^k) = ∪_k TM-SPACE(T(n)^k)

The most fundamental result in parallel complexity theory: it connects parallel complexity to the world of P =? NP.

Two consequences:
- PRAM-LOGTIME = LOGSPACE: related to the AC and NC hierarchies; a well-known result
- PRAM-PTIME = PSPACE: the write-conflict convention is irrelevant modulo P; of interest here

Some remarks: the result seems to rely on communicating exponential amounts of information through global storage. Two variants that don't:
- the class of sets accepted by nondeterministic polynomial-time-bounded, polynomial-global-storage-bounded PRAMs is exactly PSPACE
- the class of sets accepted by deterministic polynomial-time, polynomial-space PRAMs contains the class co-NP

PSPACE → PRAM: a PRAM simulates a PSPACE machine in polynomial time. Proved with a generic transformation from a PSPACE-bounded (TM, input) pair to a graph problem easily solved on a PRAM.

∪_k TM-SPACE(T(n)^k) ⊆ ∪_k PRAM-TIME(T(n)^k)

How it works: transform Turing machine acceptance into graph reachability. M is a Turing machine, x is the input, S is the tape used.
- Identify configurations of M with the nodes of a digraph G(x, M, S)
- Identify transitions of M with the arcs of G(x, M, S)

Counting configurations: |S| ∈ O(n^c) implies M has 2^{O(n^c)} configurations. If M is deterministic then G(x, M, S) has out-degree 1 (a rooted forest, arcs toward the roots). Let node s be the initial configuration and node t the final configuration: if t is reachable from s, then the TM halts in an accepting state on x.

Transitive closure: solved with Boolean matrix multiplication of the adjacency matrix. Boolean matrix multiplication is in AC⁰, and transitive closure is in AC¹ ⊆ NC². Since the graph has exponentially many nodes, polylogarithmic depth in the graph size is really polynomial time in n.

Transitive Closure

Given a directed graph G and two nodes s and t, determine whether there is a path in G that starts at s and ends at t. Example: the directed path on nodes 1-5, closed by repeated squaring of the adjacency matrix (Boolean matrix multiplication):

(I + A)        (I + A)²       (I + (I + A)²)²
1 1 0 0 0      1 1 1 0 0      1 1 1 1 1
0 1 1 0 0      0 1 1 1 0      0 1 1 1 1
0 0 1 1 0      0 0 1 1 1      0 0 1 1 1
0 0 0 1 1      0 0 0 1 1      0 0 0 1 1
0 0 0 0 1      0 0 0 0 1      0 0 0 0 1

Possibly the first use of pointer jumping!
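A sketch of the squaring idea in Python (sequential here, but each squaring is exactly the round a PRAM would perform in parallel; the function names are mine):

```python
def bool_mat_mult(A, B):
    """Boolean product: C[i][j] = OR over k of (A[i][k] AND B[k][j])."""
    n = len(A)
    return [[any(A[i][k] and B[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

def transitive_closure(A):
    """Repeatedly square (I + A); O(log n) squarings suffice, since
    reachable path lengths double each round ("pointer jumping")."""
    n = len(A)
    R = [[A[i][j] or i == j for j in range(n)] for i in range(n)]  # I + A
    steps = 1
    while steps < n:
        R = bool_mat_mult(R, R)
        steps *= 2
    return R

# The 5-node path 1 -> 2 -> 3 -> 4 -> 5 from the slide.
A = [[0, 1, 0, 0, 0],
     [0, 0, 1, 0, 0],
     [0, 0, 0, 1, 0],
     [0, 0, 0, 0, 1],
     [0, 0, 0, 0, 0]]
R = transitive_closure(A)
print(R[0][4])   # True: node 5 is reachable from node 1
```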

Now for the other direction: PRAM → PSPACE. We must design a PSPACE-bounded TM that can simulate an arbitrary PRAM program. (The previous direction was obtained by reducing PSPACE TM acceptance to graph reachability.)

∪_k PRAM-TIME(T(n)^k) ⊆ ∪_k TM-SPACE(T(n)^k)

Mutually recursive procedures verify the instruction executed, and the contents of memory and accumulators, at each time step.

VERIFY_ACCUMULATOR(P_j, t, c)
  Guess: a time t′ < t and an instruction i such that A_j was last modified at time t′
  Verify: P_j executed i at time t′, and i executed at time t′ produces c

VERIFY_MEMORY(X, t, c)
  Guess: a processor P_j and a time t′ < t such that P_j stored to X at time t′
  Verify: A_j contained c at time t′, and no store to X occurred between t′ and t

VERIFY_INSTRUCTION(P_j, t, i)
  Guess: an instruction i′ such that P_j executed i′ at time t − 1
  Verify: if i′ is not the instruction preceding i, then i′ jumps to i

Think: tracing an adjacency-list-represented digraph backward, so we must guess the origin of each arc!

∪_k PRAM-TIME(T(n)^k) ⊆ ∪_k TM-SPACE(T(n)^k)

1. Construct a nondeterministic PSPACE-bounded TM. To determine whether the PRAM accepts, the TM must VERIFY two things:
   1. P_0 halts at time t with A_0 = 1
   2. No concurrent writes occurred
VERIFY using the mutually recursive procedures from the last slide: {VERIFY_INSTRUCTION, VERIFY_ACCUMULATOR, VERIFY_MEMORY}. Since the PRAM takes polynomial time, the stack will contain a polynomial number of calls; and since the space used is at most exponential, the values pushed require only polynomially many bits.

2. Eliminate all nondeterministic choices using Savitch's Theorem (1970): for any space-constructible function T(n) ≥ log n, NSPACE(T(n)) ⊆ SPACE(T(n)²).

The Parallel Computation Thesis, Goldschlager (1982)

Introduced the concept to define parallelism for complexity theory: time-bounded parallel machines are polynomially related to space-bounded computers. That is, for any function T(n),

∪_k PARALLEL-TIME(T(n)^k) = ∪_k SPACE(T(n)^k)

In theoretical computer science, a thesis is definitive:
- the Church-Turing Thesis defines effective computation
- the Edmonds (Efficiency) Thesis defines efficient computation
- the Parallel Computation Thesis defines parallel computation

The Second Machine Class consists of those devices that satisfy the Parallel Computation Thesis with respect to the traditional, sequential Turing machine model. Such a device is not required to have more than one processor!

Part 2: Parallelism of Arithmetic RAMs

Multiplication is controversial!

{×, ÷} are more complex than {+, −}. The original evidence comes from circuit complexity: +, − ∈ AC⁰ but ×, ÷ ∉ AC⁰ (Furst, Saxe & Sipser, 1981).

Hartmanis & Simon (1974): proved that a RAM with {+, −, ×, ÷} and bitwise Boolean operations is in the Second Machine Class. Open problem: the relationship between RAMs with and without multiplication.

Bertoni, Mauri & Sabadini (1981): what is the complexity of counting solutions to QBF problems? Motivated by Valiant's counting complexity (the #P results). They discovered that generating functions for the number of solutions to a QBF can be evaluated by manipulating polynomials in a particular way, and proved that Arithmetic RAMs (those with {+, −, ×, ÷}) can do the required manipulations in polynomial time.

How multiplication simulates parallelism (the arrows below are P-time reductions):

PRAM → PSPACE → QBF → straight-line program → Arithmetic RAM

- PRAM: the usual (familiar) model of parallel computation; a measure of data independence
- PSPACE: classical complexity class, equal to PTIME on a PRAM, as we have seen (Part 1)
- QBF: satisfiability of Quantified Boolean Formulas, the best-known PSPACE-complete problem
- straight-line program: a model associated with an algebraic structure; a sequence of operations on elements of a set
- Arithmetic RAM: a Random Access Machine with operation set {+, −, ×, ÷} (Part 2)

Quantified Boolean Formulas
Instance: a formula of the form Q_1 x_1 ... Q_n x_n F(x_1, ..., x_n), where each Q_i ∈ {∃, ∀} and F is a propositional formula in n variables.
Question: does this formula evaluate to true?

QBF and PSPACE. A recursive QBF for reachability (is there a path of length at most d from x to y?):

F_d(x, y) = ∃z ( F_{d/2}(x, z) ∧ F_{d/2}(z, y) )

Written out, this formula doubles in size at each level. A logarithmic-sized recursive QBF for reachability shares the two recursive calls:

F_d(x, y) = ∃z ∀u ∀v ( ((u = x ∧ v = z) ∨ (u = z ∧ v = y)) → F_{d/2}(u, v) )
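The first recurrence is also an algorithm: evaluated directly it is Savitch-style divide-and-conquer reachability. A Python sketch (names are mine), whose O(log d) recursion depth with O(log n) bits per frame mirrors the O(log² n) space bound:

```python
def reach(adj, x, y, d):
    """Is there a path from x to y of length at most d?
    Mirrors F_d(x, y) = EXISTS z ( F_{d/2}(x, z) AND F_{d/2}(z, y) ).
    Recursion depth is O(log d), and each frame stores O(log n) bits."""
    if d <= 1:
        return x == y or adj[x][y]
    return any(reach(adj, x, z, d // 2) and reach(adj, z, y, d - d // 2)
               for z in range(len(adj)))

# The 5-node path again: node 0 reaches node 4 in <= 4 steps.
A = [[False] * 5 for _ in range(5)]
for i in range(4):
    A[i][i + 1] = True
print(reach(A, 0, 4, 4))   # True
```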

PSPACE → QBF: the reachability problem in digraphs can be expressed as a QBF of size logarithmic in the size of the graph. Next, QBF → straight-line program: show that QBFs can be solved by evaluating a straight-line program.

Polynomials: f(z) = Σ_{k=0}^{n} f_k z^k = f_0 + f_1 z + f_2 z² + f_3 z³ + ...

Let f = f(z) and g = g(z) be polynomials in z. Useful operations on generating functions:
- Addition: (f + g)_k = f_k + g_k
- Multiplication: (f · g)_k = Σ_{j=0}^{k} f_j g_{k−j}
- Right shift: (→f)_k = f_{k+1}
- Componentwise product: (f ⊙ g)_k = f_k · g_k
- Sampling: (↑_h f)_k = f_{hk}

Define the structure P = ⟨P, +, −, ·, →, ⊙, ↑_h⟩: the set of polynomials together with the operations above.

Straight-Line Programs over P

A straight-line program (SLP) of length n on the structure P is a sequence of n instructions such that:
- the 1st instruction is of the form (1) p_1 ← z (the elementary polynomial);
- the k-th instruction is of the form (k) p_k ← p_i + p_j | p_i − p_j | p_i · p_j | p_i ⊙ p_j | →p_i | ↑_2 p_i, with i, j < k.

If Π is an SLP on P of length n, then Π generates the polynomial p_n.
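A small Python sketch of the structure P as operations on coefficient lists; the class and method names are my own, and the recurrences on the following slides can be run directly on it.

```python
class Poly:
    """A polynomial as its coefficient list: p.c[k] is the z^k coefficient."""
    def __init__(self, c): self.c = list(c)

    def __add__(self, o):                       # (f + g)_k = f_k + g_k
        n = max(len(self.c), len(o.c))
        pad = lambda c: c + [0] * (n - len(c))
        return Poly([a + b for a, b in zip(pad(self.c), pad(o.c))])

    def __mul__(self, o):                       # (f * g)_k = sum_j f_j g_{k-j}
        c = [0] * (len(self.c) + len(o.c) - 1)
        for j, a in enumerate(self.c):
            for k, b in enumerate(o.c):
                c[j + k] += a * b
        return Poly(c)

    def hadamard(self, o):                      # componentwise: f_k * g_k
        return Poly([a * b for a, b in zip(self.c, o.c)])

    def rshift(self):                           # right shift: f_{k+1}
        return Poly(self.c[1:] or [0])

    def sample(self, h):                        # sampling: f_{h*k}
        return Poly(self.c[::h])

z = Poly([0, 1])                                # the elementary polynomial
f = (z + Poly([1])) * (z + Poly([1]))           # (1 + z)^2 = 1 + 2z + z^2
print(f.c, f.rshift().c, f.sample(2).c)         # [1, 2, 1] [2, 1] [1, 1]
```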

α(x_1, ..., x_n) is a Boolean formula, χ : {x_1, ..., x_n} → {0, 1} is an interpretation, and k is the number with binary representation χ(x_1)χ(x_2)⋯χ(x_n). Define α_k = α(χ(x_1), ..., χ(x_n)).

Generating polynomial associated with α:

p_α(z) = Σ_{k=0}^{2^n − 1} α_k z^k

If α has n variables and m operations from {∧, ∨, ¬}, then p_α can be generated by an SLP of length O(n + m). Proof by structural induction on α:

p_{x_j} = (1 + z) ⋯ (1 + z^{2^{j−1}}) · z^{2^j} · (1 + z^{2^{j+1}}) ⋯ (1 + z^{2^{n−1}})
p_{β∨γ} = p_β + p_γ − (p_β ⊙ p_γ)
p_{β∧γ} = p_β ⊙ p_γ
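Continuing the sketch, here is a direct rendering of the induction using the Poly class above; taking x_j as bit j−1 of k (low-order first) is an indexing assumption of this demo, and the helper names are mine.

```python
def p_var(j, n):
    """p_{x_j}: a mandatory factor z^(2^(j-1)), and a factor (1 + z^(2^i))
    for every other bit position i (x_j is bit j-1 of k, LSB-first here)."""
    p = Poly([0] * 2 ** (j - 1) + [1])                     # z^(2^(j-1))
    for i in range(n):
        if i != j - 1:
            p = p * Poly([1] + [0] * (2 ** i - 1) + [1])   # 1 + z^(2^i)
    return p

def p_and(a, b):                    # p_{beta AND gamma} = p_beta (.) p_gamma
    return a.hadamard(b)

def p_or(a, b):                     # p_beta + p_gamma - (p_beta (.) p_gamma)
    return a + b + Poly([-c for c in a.hadamard(b).c])     # lengths: both 2^n

# alpha = x1 OR x2 with n = 2: satisfied exactly by k in {1, 2, 3}.
n = 2
print(p_or(p_var(1, n), p_var(2, n)).c)   # [0, 1, 1, 1]
```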

ψ = Q_1 x_1 ... Q_n x_n α(x_1, ..., x_n), with each Q_j ∈ {∃, ∀}, is a QBF. Let Z_j = Σ_{x_j ∈ {0,1}} if Q_j = ∃, and Z_j = Π_{x_j ∈ {0,1}} if Q_j = ∀. The number of satisfying interpretations for ψ is

#ψ = Z_1 ⋯ Z_n α(x_1, ..., x_n)

If ψ is a QBF of size n, then #ψ can be computed by means of a straight-line program of length O(n) over P:

p(0) = p_α = Σ_k α_k z^k
Q_{n−j+1} = ∀ ⟹ p(j) = ↑_2 (p(j−1) ⊙ (→p(j−1)))
Q_{n−j+1} = ∃ ⟹ p(j) = ↑_2 (p(j−1) + (→p(j−1)))

Each round pairs the coefficients 2m and 2m+1, eliminating the innermost remaining variable; after n rounds, p(n) is the constant #ψ.
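The recurrence run on the Poly class from above, for ψ = ∀x₁ ∃x₂ (x₁ ∨ x₂); here x_n is taken as the low-order bit of k, so pairing coefficients 2m and 2m+1 eliminates the innermost variable first. A hedged sketch of the construction, not the authors' code:

```python
def count_qbf(alpha, quants):
    """#psi for psi = Q_1 x_1 ... Q_n x_n alpha, via the p(j) recurrence.
    alpha: function on a 0/1 tuple (x_1, ..., x_n); quants: string of
    'A' (forall) and 'E' (exists).  x_n is the low-order bit of k."""
    n = len(quants)
    # p(0) = p_alpha: coefficient k is alpha at the assignment encoded by k.
    p = Poly([int(alpha(tuple((k >> (n - i)) & 1 for i in range(1, n + 1))))
              for k in range(2 ** n)])
    for j in range(1, n + 1):                 # eliminate x_n, then x_{n-1}, ...
        q = quants[n - j]                     # Q_{n-j+1}
        shifted = p.rshift()
        p = (p.hadamard(shifted) if q == 'A' else p + shifted).sample(2)
    return p.c[0]

# psi = FORALL x1 EXISTS x2 (x1 OR x2):
# product over x1 of the number of good x2 choices = 1 * 2 = 2.
print(count_qbf(lambda x: x[0] | x[1], 'AE'))   # 2
```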

QBF → straight-line program: QBFs can be solved by a polynomial-sized SLP over polynomials, which evaluates to the number of models of the QBF. Next, straight-line program → Arithmetical RAM: show that the operations of an Arithmetic RAM are sufficient for evaluating an SLP over P in polynomial time.

Simulating SLPs over P with Arithmetic RAMs

Represent a polynomial f by its value f(z) at a sufficiently large integer z: the change of variable keeps the coefficients in separate digits, so carries don't propagate. The easy operations:
- Addition: (f + g)(x) = f(x) + g(x)
- Multiplication: (f · g)(x) = f(x) · g(x)
- Right shift: (→f)(x) = f(x) ÷ x (integer division)

For the remaining operations, high-order terms act as workspace and the mod function cleans things up:

(f ⊙ g)(x) = ( f(z) · g(z^{n+1}) mod (z^{n+2} − x) ) mod z
(↑_h f)(x) = ( f(z) mod (z^h − x) ) mod z

The correctness proofs are divisibility results, and mod is easily computed from the given operations: a mod b = a − b · (a ÷ b).

Conclusion: Arithmetic RAMs belong to the Second Machine Class.
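A quick numeric check of the two mod identities in Python, under the stated assumption that the base z is much larger than any coefficient that can arise (the concrete values below are arbitrary):

```python
def ev(coeffs, x):
    """Evaluate a polynomial given low-order-first coefficients."""
    return sum(c * x ** k for k, c in enumerate(coeffs))

f = [1, 2, 3, 4, 5, 6]          # f(z) = 1 + 2z + ... + 6z^5
g = [7, 8, 9, 1, 2, 3]
n = len(f) - 1                  # degree bound
x, z = 10, 10 ** 9              # z huge, so digits never interact

# Sampling: (UP_h f)(x) = (f(z) mod (z^h - x)) mod z, here with h = 2.
h = 2
lhs = ev(f[::h], x)                                  # f_0 + f_2 x + f_4 x^2
rhs = (ev(f, z) % (z ** h - x)) % z
print(lhs, rhs, lhs == rhs)                          # 531 531 True

# Componentwise: (f (.) g)(x) = (f(z) g(z^(n+1)) mod (z^(n+2) - x)) mod z.
lhs = ev([a * b for a, b in zip(f, g)], x)
rhs = (ev(f, z) * ev(g, z ** (n + 1))) % (z ** (n + 2) - x) % z
print(lhs, rhs, lhs == rhs)                          # 1906867 1906867 True
```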

The Standard Model

For a problem instance x, the standard model is a unit-cost RAM with |x|^{O(1)} registers, each of size O(log |x|). The operations include any arithmetic or Boolean operations on numbers stored in registers.

- Polynomial space: the usual way of thinking about space is that it must be initialized, and initializing exponential space requires exponential time
- Log-sized registers: allow indexing a polynomial number of registers, and also counting a polynomial number of elements
- Usual operations: functions with operators in the C programming language; arithmetic operations {+, −, ×, ÷} and Boolean operations {∧, ∨, ¬}
- Acceptable operations: many operations of the form {0,1}^{log n} → {0,1}^{log n}; lookup tables for these can usually be constructed in linear time (examples: threshold functions, bit counting)
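For instance, bit counting on (log n)-bit words by a lookup table built in linear time; the chunk size and names below are my choices, a sketch rather than a canonical construction:

```python
n = 1 << 16                       # problem size; words have ~log n bits
W = n.bit_length() - 1            # 16-bit words

# Build a popcount table for all log n-bit values in O(n) time, using
# popcount(v) = popcount(v >> 1) + (low bit of v).
table = [0] * (1 << W)
for v in range(1, 1 << W):
    table[v] = table[v >> 1] + (v & 1)

print(table[0b1011001110001111])  # 10 set bits, found in one lookup
```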