Kolmogorov structure functions for automatic complexity

Similar documents
The definition of Kolmogorov structure functions for nondeterministic automatic complexity

Finite Automata. BİL405 - Automata Theory and Formal Languages 1

CS 154, Lecture 2: Finite Automata, Closure Properties Nondeterminism,

Nondeterministic Finite Automata

Nondeterministic automatic complexity of overlap-free and almost square-free words

Turing s thesis: (1930) Any computation carried out by mechanical means can be performed by a Turing Machine

Computational Models #1

Nondeterministic Finite Automata

Nondeterminism. September 7, Nondeterminism

CSE 135: Introduction to Theory of Computation Nondeterministic Finite Automata (cont )

Closure under the Regular Operations

Chapter 5. Finite Automata

FORMAL LANGUAGES, AUTOMATA AND COMPUTATION

September 7, Formal Definition of a Nondeterministic Finite Automaton

CISC 4090 Theory of Computation

Clarifications from last time. This Lecture. Last Lecture. CMSC 330: Organization of Programming Languages. Finite Automata.

Advanced topic: Space complexity

Automata and Formal Languages - CM0081 Non-Deterministic Finite Automata

TWO-WAY FINITE AUTOMATA & PEBBLE AUTOMATA. Written by Liat Peterfreund

Recap DFA,NFA, DTM. Slides by Prof. Debasis Mitra, FIT.

Computer Sciences Department

CSE 105 THEORY OF COMPUTATION

Finite Automata and Languages

Lecture 1: Finite State Automaton

T (s, xa) = T (T (s, x), a). The language recognized by M, denoted L(M), is the set of strings accepted by M. That is,

Finite Automata Part Two

Turing Machines A Turing Machine is a 7-tuple, (Q, Σ, Γ, δ, q0, qaccept, qreject), where Q, Σ, Γ are all finite

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

Griffith University 3130CIT Theory of Computation (Based on slides by Harald Søndergaard of The University of Melbourne) Turing Machines 9-0

CS243, Logic and Computation Nondeterministic finite automata

CSE 311: Foundations of Computing. Lecture 23: Finite State Machine Minimization & NFAs

COM364 Automata Theory Lecture Note 2 - Nondeterminism

CISC 4090: Theory of Computation Chapter 1 Regular Languages. Section 1.1: Finite Automata. What is a computer? Finite automata

Introduction to Languages and Computation

CS 455/555: Finite automata

Let us first give some intuitive idea about a state of a system and state transitions before describing finite automata.

CPS 220 Theory of Computation REGULAR LANGUAGES

NFA and regex. the Boolean algebra of languages. regular expressions. Informatics 1 School of Informatics, University of Edinburgh

3515ICT: Theory of Computation. Regular languages

Lecture 3: Nondeterministic Finite Automata

Chapter Five: Nondeterministic Finite Automata

Turing Machines (TM) Deterministic Turing Machine (DTM) Nondeterministic Turing Machine (NDTM)

Nondeterminism and Epsilon Transitions

Constructions on Finite Automata

The complexity of automatic complexity

CMSC 330: Organization of Programming Languages. Theory of Regular Expressions Finite Automata

Exam Computability and Complexity

Outline. Nondetermistic Finite Automata. Transition diagrams. A finite automaton is a 5-tuple (Q, Σ,δ,q 0,F)

Constructions on Finite Automata

Automata Theory. Lecture on Discussion Course of CS120. Runzhe SJTU ACM CLASS

UNIT-II. NONDETERMINISTIC FINITE AUTOMATA WITH ε TRANSITIONS: SIGNIFICANCE. Use of ε-transitions. s t a r t. ε r. e g u l a r

Variations of the Turing Machine

Non-deterministic Finite Automata (NFAs)

Theory of Languages and Automata

Computability and Complexity

Subset construction. We have defined for a DFA L(A) = {x Σ ˆδ(q 0, x) F } and for A NFA. For any NFA A we can build a DFA A D such that L(A) = L(A D )

Deterministic Finite Automata (DFAs)

Deterministic Finite Automata (DFAs)

cse303 ELEMENTS OF THE THEORY OF COMPUTATION Professor Anita Wasilewska

Formal Definition of Computation. August 28, 2013

CSE 105 THEORY OF COMPUTATION. Spring 2018 review class

Deterministic Finite Automata. Non deterministic finite automata. Non-Deterministic Finite Automata (NFA) Non-Deterministic Finite Automata (NFA)

CSE 135: Introduction to Theory of Computation Nondeterministic Finite Automata

Q = Set of states, IE661: Scheduling Theory (Fall 2003) Primer to Complexity Theory Satyaki Ghosh Dastidar

C2.1 Regular Grammars

Theory of Computation (I) Yijia Chen Fudan University

PS2 - Comments. University of Virginia - cs3102: Theory of Computation Spring 2010

Peter Wood. Department of Computer Science and Information Systems Birkbeck, University of London Automata and Formal Languages

Introduction to Formal Languages, Automata and Computability p.1/51

C2.1 Regular Grammars

Computational Models - Lecture 1 1

State Complexity of Neighbourhoods and Approximate Pattern Matching

Decision, Computation and Language

Theory of Computation Lecture 1. Dr. Nahla Belal

Sums of Palindromes: An Approach via Automata

Finite-State Machines (Automata) lecture 12

Further discussion of Turing machines

Finite Automata. Mahesh Viswanathan

1 Showing Recognizability

Turing Machines Part III

Nondeterministic Finite Automata. Nondeterminism Subset Construction

Deterministic Finite Automaton (DFA)

UNIT-III REGULAR LANGUAGES

Turing machines Finite automaton no storage Pushdown automaton storage is a stack What if we give the automaton a more flexible storage?

CS 208: Automata Theory and Logic

Chapter 8. Turing Machine (TMs)

Turing Machines and Time Complexity

Languages. Non deterministic finite automata with ε transitions. First there was the DFA. Finite Automata. Non-Deterministic Finite Automata (NFA)

Automata and Languages

Busch Complexity Lectures: Turing s Thesis. Costas Busch - LSU 1

Inf2A: Converting from NFAs to DFAs and Closure Properties

CS Pushdown Automata

Introduction to Automata

CSCE 551 Final Exam, Spring 2004 Answer Key

Theory of computation: initial remarks (Chapter 11)

Turing s Thesis. Fall Costas Busch - RPI!1

CS 322 D: Formal languages and automata theory

Intro to Theory of Computation

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

CS154, Lecture 12: Kolmogorov Complexity: A Universal Theory of Data Compression

Transcription:

Kolmogorov structure functions for automatic complexity Bjørn Kjos-Hanssen June 16, 2015 Varieties of Algorithmic Information, University of Heidelberg Internationales Wissenschaftssentrum

History 1936: Universal Turing machine / programming language / computer 1965: Kolmogorov complexity of a string x = 0111001, say, = the length of the shortest program printing x 1973: Structure function (Kolmogorov) 2001: Automatic complexity A(x) (Shallit and Wang) (deterministic) 2013: Nondeterministic automatic complexity A N (x) (Hyde, M.A. thesis, University of Hawai i): n/2 upper bound. 2014: Structure function for automatic complexity: entropy-based upper bound.

Definition of A N, nondeterministic automatic complexity Let M be an nondeterministic finite automaton having q states. If there is exactly one path through M of length x leading to an accept state, and x is the string read along the path, then we say A N (x) q. Such an NFA is said to witness the complexity of x being no more than q.

Upper Bound Theorem 1 (Hyde 2013) Let Σ 2 be fixed and suppose x Σ n. Then A N (x) n 2 + 1.

x m+1 x 1 x 2 x m 1 x m q 1 start q 2... q m q m+1 x n x n 1 x m+3 x m+2 Figure 1: An NFA uniquely accepting x = x 1 x 2... x n, n = 2m + 1

Example 0 0 1 1 1 0 q 1 start q 2 q 3 q 4 q 5 q 6 1 0 1 0 1 Figure 2: An NFA uniquely accepting x = 01110010101, x = 11, A N (x) 6.

Background The Kolmogorov complexity of a finite word w is roughly speaking the length of the shortest description w of w in a fixed formal language. The description w can be thought of as an optimally compressed version of w. Motivated by the non-computability of Kolmogorov complexity, Shallit and Wang studied a deterministic finite automaton analogue. Definition 1 (Shallit and Wang 2001) The automatic complexity of a finite binary string x = x 1... x n is the least number A D (x) of states of a deterministic finite automaton M such that x is the only string of length n in the language accepted by M.

Deficiency Definition 2 The deficiency of a word x of length n is D n (x) = D(x) = b(n) A N (x). where b(n) = n 2 + 1 is Hyde s upper bound.

Length n P(D n > 0) 0 0.000 2 0.500 4 0.500 6 0.531 8 0.617 10 0.664 12 0.600 14 0.687 16 0.657 18 0.658 20 0.641 22 0.633 24 0.593 (a) Even lengths. Length n P(D n > 0) 1 0.000 3 0.250 5 0.250 7 0.234 9 0.207 11 0.317 13 0.295 15 0.297 17 0.342 19 0.330 21 0.303 23 0.322 25 0.283 (b) Odd lengths. Table 1: Probability of strings of having positive complexity deficiency D n, truncated to 3 decimal digits.

Definition 3 Let DEFICIENCY be the following decision problem. Given a binary word w and an integer d 0, is D(w) > d? Theorem 4 DEFICIENCY is in NP. We do not know whether DEFICIENCY is NP-complete. Can you find deficiencies? Play the Complexity Guessing Game or the Complexity Option Game.

The next table shows the number of strings of length 0 n 15 having nondeterministic automatic complexity k. (Paper contains a table for n 23.) We note that each column has a limiting value (2,6,20,58,... ).

n\k 1 2 3 4 5 6 7 8 15 2 6 20 58 226 908 8530 23018 14 2 6 20 58 244 1270 9668 5116 13 2 6 20 64 250 2076 5774 12 2 6 20 58 282 2090 1638 11 2 6 20 58 564 1398 10 2 6 20 64 588 344 9 2 6 20 78 406 8 2 6 20 130 98 7 2 6 22 98 6 2 6 26 30 5 2 6 24 4 2 6 8 3 2 6 2 2 2 1 2 0 1

Structure function Let S x = {(q, m) q-state NFA M, x L(M) Σ n, L(M) Σ n b m } where b = Σ. Then S x has the upward closure property q q, m m, (q, m) S x = (q, m ) S x.

Structure function Definition 5 (Vereshchagin, personal communication, 2014) In an alphabet Σ containing b symbols, we define h x(m) = min{k : (k, m) S x } and h x (k) = min{m : (k, m) S x }. Thus (k, m) S x iff h x(m) k iff h x (k) m.

Example 6 The string x = 01111111101111111 has an optimal explanation given by an automaton which goes to a new state when reading a 0, and stays in the same state when reading a 1. In this case h x (3) is more unusually small (the p-value associated with h x (3) is small) than other h x (k).

Definition 7 The entropy function H : [0, 1] [0, 1] is given by H(p) = p log 2 p (1 p) log 2 (1 p). A useful well-known fact: Theorem 8 For 0 k n, ( ) n log 2 = H(k/n)n + O(log n). k

Definition 9 The dual automatic structure function of a string x of length n is a function h x : [0, n] [0, n/2 + 1]. We define the asymptotic upper envelope of h by Let h (a) = lim sup max n x =n h(p) = lim sup max n x =n where [x] is the nearest integer to x. We are interested in upper bounds on h. h x([a n]), n h : [0, 1] [0, 1/2] h x ([p n]), h : [0, 1/2] [0, 1] n

Figure 3: Bounds for the automatic structure function for alphabet size b = 2.

Figure 4: Automatic structure function for alphabet size b = 2 and length 7.

Multirun complexity A multirun is a disjoint collections of runs of varying valences. For instance, in 011111023232 the most unlikely may be the two runs 11111 and 23232. This corresponds to automata with no back-tracking. Question 10 Given a word over a k-ary alphabet, can we efficiently find the multirun which has smallest p-value? There are too many sequences of possible runs to consider them all, but perhaps there is another way. Structure Function Calculator