Two Decades of Property Testing Madhu Sudan Harvard

Similar documents
Two Decades of Property Testing

Property Testing and Affine Invariance Part I Madhu Sudan Harvard University

Invariance in Property Testing

Low-Degree Testing. Madhu Sudan MSR. Survey based on many works. of /02/2015 CMSA: Low-degree Testing 1

Locality in Coding Theory

Algebraic Codes and Invariance

Local Decoding and Testing Polynomials over Grids

Low-Degree Polynomials

Testing Affine-Invariant Properties

General Strong Polarization

General Strong Polarization

General Strong Polarization

General Strong Polarization

Uncertain Compression & Graph Coloring. Madhu Sudan Harvard

Lecture 5: Derandomization (Part II)

Support Vector Machines. CSE 4309 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington

Reed-Muller testing and approximating small set expansion & hypergraph coloring

Worksheets for GCSE Mathematics. Quadratics. mr-mathematics.com Maths Resources for Teachers. Algebra

Grover s algorithm. We want to find aa. Search in an unordered database. QC oracle (as usual) Usual trick

On the efficient approximability of constraint satisfaction problems

Learning and Fourier Analysis

Testing by Implicit Learning

A Characterization of Locally Testable Affine-Invariant Properties via Decomposition Theorems

Math 171 Spring 2017 Final Exam. Problem Worth

With P. Berman and S. Raskhodnikova (STOC 14+). Grigory Yaroslavtsev Warren Center for Network and Data Sciences

Lecture No. 1 Introduction to Method of Weighted Residuals. Solve the differential equation L (u) = p(x) in V where L is a differential operator

Towards the Sliding Scale Conjecture (Old & New PCP constructions)

Terms of Use. Copyright Embark on the Journey

Gravitation. Chapter 8 of Essential University Physics, Richard Wolfson, 3 rd Edition

Earthmover resilience & testing in ordered structures

Algebraic Property Testing: The Role of Invariance

Classical RSA algorithm

Building Assignment Testers DRAFT

Lecture 11. Kernel Methods

Fully Understanding the Hashing Trick

Active Testing! Liu Yang! Joint work with Nina Balcan,! Eric Blais and Avrim Blum! Liu Yang 2011

Yang-Hwan Ahn Based on arxiv:

Homomorphism Testing

A Posteriori Error Estimates For Discontinuous Galerkin Methods Using Non-polynomial Basis Functions

Lesson 1: Successive Differences in Polynomials

Testing Problems with Sub-Learning Sample Complexity

Learning and Fourier Analysis

1. The graph of a function f is given above. Answer the question: a. Find the value(s) of x where f is not differentiable. Ans: x = 4, x = 3, x = 2,

Lecture Notes on Linearity (Group Homomorphism) Testing

Property Testing: A Learning Theory Perspective

Approximating MAX-E3LIN is NP-Hard

A Combinatorial Characterization of the Testable Graph Properties: It s All About Regularity

Multiple Regression Analysis: Inference ECONOMETRICS (ECON 360) BEN VAN KAMMEN, PHD

10.1 Three Dimensional Space

A Characterization of the (natural) Graph Properties Testable with One-Sided Error

Higher-order Fourier Analysis over Finite Fields, and Applications. Pooya Hatami

The domain and range of lines is always R Graphed Examples:

Approximate Second Order Algorithms. Seo Taek Kong, Nithin Tangellamudi, Zhikai Guo

2.4 Error Analysis for Iterative Methods

Specialist Mathematics 2019 v1.2

Almost transparent short proofs for NP R

Big Bang Planck Era. This theory: cosmological model of the universe that is best supported by several aspects of scientific evidence and observation

Control of Mobile Robots

Proclaiming Dictators and Juntas or Testing Boolean Formulae

Testing Linear-Invariant Non-Linear Properties

Lecture 13: 04/23/2014

Lecture 8 (Notes) 1. The book Computational Complexity: A Modern Approach by Sanjeev Arora and Boaz Barak;

Lesson 24: True and False Number Sentences

F.4 Solving Polynomial Equations and Applications of Factoring

Eureka Lessons for 6th Grade Unit FIVE ~ Equations & Inequalities

Lecture: Expanders, in theory and in practice (2 of 2)

Lecture 10. Sublinear Time Algorithms (contd) CSC2420 Allan Borodin & Nisarg Shah 1

Local list-decoding and testing of random linear codes from high-error

7.3 The Jacobi and Gauss-Seidel Iterative Methods

Probabilistically Checkable Proofs

A Combinatorial Characterization of the Testable Graph Properties: It s All About Regularity

SAMPLE COURSE OUTLINE MATHEMATICS METHODS ATAR YEAR 11

Lower Bounds for Testing Bipartiteness in Dense Graphs

Essential facts about NP-completeness:

Lecture 6. Notes on Linear Algebra. Perceptron

Combining the cycle index and the Tutte polynomial?

Tolerant Versus Intolerant Testing for Boolean Properties

CS60020: Foundations of Algorithm Design and Machine Learning. Sourangshu Bhattacharya

Sub-Constant Error Low Degree Test of Almost Linear Size

CSE 291: Fourier analysis Chapter 2: Social choice theory

Testing Low-Degree Polynomials over GF (2)

Sublinear Algorithms Lecture 3

REPRESENTATIONS OF U(N) CLASSIFICATION BY HIGHEST WEIGHTS NOTES FOR MATH 261, FALL 2001

Section 5: Quadratic Equations and Functions Part 1

Tolerant Versus Intolerant Testing for Boolean Properties

Groups and Symmetries

CS259C, Final Paper: Discrete Log, CDH, and DDH

Approximating Submodular Functions. Nick Harvey University of British Columbia

MATH 167: APPLIED LINEAR ALGEBRA Least-Squares

Hardness of Approximation

Eclipses and Forces. Jan 21, ) Review 2) Eclipses 3) Kepler s Laws 4) Newton s Laws

Mathematics, Proofs and Computation

Lecture 3. STAT161/261 Introduction to Pattern Recognition and Machine Learning Spring 2018 Prof. Allie Fletcher

Fourier analysis of boolean functions in quantum computation

STAT2201 Slava Vaisman

10.4 The Cross Product

Tolerant Junta Testing and the Connection to Submodular Optimization and Function Isomorphism

Loss Functions, Decision Theory, and Linear Models

Lower Bounds for Testing Triangle-freeness in Boolean Functions

Charge carrier density in metals and semiconductors

Transcription:

Two Decades of Property Testing Madhu Sudan Harvard April 8, 2016 Two Decades of Property Testing 1 of 29

Kepler s Big Data Problem Tycho Brahe (~1550-1600): Wished to measure planetary motion accurately. To confirm sun revolved around earth (+ other planets around sun) Spent 10% of Danish GNP Johannes Kepler (~1575-1625s): Believed Copernicus s picture: planets in circular orbits. Addendum: Ratio of orbits based on Löwner-John ratios of platonic solids. Stole Brahe s data (1601). Worked on it for nine years. Disproved Addendum; Confirmed Copernicus (circle -> ellipse); discovered laws of planetary motion. Nine Years? Source: Michael Fowler, Galileo & Einstein, U. Virginia April 8, 2016 Two Decades of Property Testing 2 of 29

The challenge of analyzing big data Standard method: Propose concept class. LEARN (parameters of) best fitting concept in class to data in hand. TEST to see if this is a good enough fit. Bottleneck LEARNing is expensive; wasted if TEST rejects. Can we TEST before we LEARN? Don t be Ridiculous! Yes: This is PROPERTY TESTING!! April 8, 2016 Two Decades of Property Testing 3 of 29

Property Testing Sublinear time algorithms: Algorithms running in time o(input), o(output). Probabilistic. Correct on (approximation to) input. Random access to input, output implicit. Property testing: Restriction of sublinear time algorithms to decision problems (output = YES/NO). What decision problem? concept within class that fits data? Does data have Property? Amazing fact: Many non-trivial algorithms exist! April 8, 2016 Two Decades of Property Testing 4 of 29

Example 1: Polling Is the majority of the population Red/Blue CC = αα>.5 CC αα ; CC αα = {populations with αα fraction Red} Can Test for αα.5 by random sampling. Accept w.h.p. if αα.5 Reject w.h.p. if αα <.5 εε Sample size / Θ 1 εε 2 Independent of size of population Other similar examples: (basic statistical parameters; averages, quantiles, variance ) April 8, 2016 Two Decades of Property Testing 5 of 29

Example 2: Linearity Can test for homomorphisms: Given: ff: GG HH (GG, HH finite groups), is ff essentially a homomorphism? Test: Pick xx, yy GG uniformly, ind. at random; Verify ff xx ff(yy) = ff(xx yy) Completeness: accepts homomorphisms w.p. 1 (Obvious) Soundness: Rejects ff w.p prob. Proportional to its distance (margin) from homomorphisms. (Not obvious, [BlumLubyRubinfeld 90]) April 8, 2016 Two Decades of Property Testing 6 of 29

Linearity Analysis Given ff: GG HH that usually passes test, pretend it is close to a homomorphism gg: GG HH. Locally decode gg xx, gg xx ff xx. rr ff rr 1 for random rr GG Prove: 1. gg is close to ff. (Easy) 2. gg is a homomorphism. (Challenging) Why should ff xx. rr ff rr 1 = ff xx. ss ff ss 1? [Requires some algebraic reasoning.] Note: New elements in analysis! April 8, 2016 Two Decades of Property Testing 7 of 29

A subtle change Compare: ff usually satisfies ff xx. yy = ff xx ff yy. Population has close to 50% Reds. Vs. ff is close to gg that always satisfies gg xx. yy = ff xx gg yy Population is close to one with exactly 50% Reds. Notions same for Polling; not Homomorphisms. Latter notion is generalizable to any property! Notion of choice in Property Testing April 8, 2016 Two Decades of Property Testing 8 of 29

Key step of BLR analysis Lemma: If Pr ff xx ff yy ff(xx yy) = εε then xx,yy xx, Pr ff xx. rr ff rr 1 ff xx. ss ff ss 1 2εε rr,ss Proof (when GG, HH abelian) Pr rr,ss Pr rr,ss ff xx rr ff ss ff(xx rr ss) = εε ff xx ss ff rr ff xx ss rr = εε By union bound, w.p. 1 2εε we have ff xx rr ss = ff xx rr ff ss = ff xx ss ff rr ff xx rr ff rr 1 = ff xx ss ff ss 1 April 8, 2016 Two Decades of Property Testing 9 of 29

History (slightly abbreviated) [Blum,Luby,Rubinfeld S 90] Linearity + application to program testing [Babai,Fortnow,Lund F 90] Multilinearity + application to PCPs (MIP). [Rubinfeld+S.] Low-degree testing + Definition [Goldreich,Goldwasser,Ron] Graph property testing + systematic study Since then many developments More graph properties, statistical properties, matrix properties, properties of Boolean functions More algebraic properties April 8, 2016 Two Decades of Property Testing 10 of 29

Example 3: ΔΔ-free-ness Given graph GG, is it free of triangles? Test: Pick vertices uu, vv, ww at random. Accept if uu, vv, ww don t form a triangle Analysis: [Alon-Shapira] Use Szemeredi s regularity lemma. Can partition any graph into OO εε (1) parts. Between each part edges random. If some three well-connected partitions form triangle; then many triangles, else close to triangle-free April 8, 2016 Two Decades of Property Testing 11 of 29

Example 4: Long code/junta testing Given ff: 0,1 nn {0,1} does it depend on few coordinates. [BGS, Håstad, FKRSS Blais] Motivation: data = genome; ff represents some disease; Junta-testing: Disease caused by few features? Testing before learning? Fuzzy Test: [KKMO, MOO] Analysis: Pick xx UU 0,1 nn and yy εε-noisy copy of xx. Accept iff ff xx = ff yy ; Repeat If ff function of very few variables Accept w.h.p. If ff depends on many variables Reject w.p. 1 2. Techniques: Fourier analysis, Influence of variables, hypercontractivity April 8, 2016 Two Decades of Property Testing 12 of 29

Example 5: Distribution Testing Given samples from unknown distribution PP on nn Determine if HH PP kk [Batu et al.,valiant,valiant 2 ]: #samples needed = Θ( n log n )! Techniques: Multivariate Central Limit Theorem Stein s method Hermite polynomials April 8, 2016 Two Decades of Property Testing 13 of 29

What is Property Testing? Algebra Graphs + Regularity Matrices + Linear algebra Statistics + CLT April 8, 2016 Two Decades of Property Testing 14 of 29

(Dense) Graph Property Testing Theorem [AlonFischerNewmanShapira]: Graph property PP is OO(1)-query testable PP is determined by regularity. Suggested by [Goldreich,Goldwasser,Ron] In particular implies all hereditary properties are testable [Alon Shapira] Nice characterization of testability? Uniform test for all graph properties. Single unifying analysis e.g. Δ-freeness & 3-colorability April 8, 2016 Two Decades of Property Testing 15 of 29

Contrast with Low-degree testing Given ff: FF nn qq FF qq ; Is deg ff dd? Roughly, BLR deals with dd = 1; dd qq/2: [Rubinfeld+S. 92]: Test: deg ff llllllll dd for random llllllll? Makes OO(qq) queries. Why no unification? Analysis a la BLR; many changes dd qq = 2: [AlonKaufmanKrivelevichLitsynRon 03] Test: deg(ff AA ) dd for subspace AA ; dim AA = dd + 1? Analysis a la BLR, RS; many changes dd, qq arbitrary: [KaufmanRon 04] Analysis a la... April 8, 2016 Two Decades of Property Testing 16 of 29

Aside: Importance of Low-degree Testing Central element in PCPs. Till [Dinur 06] no proof without (robust) low-degree testing. Since: Best proofs (smallest, tightest parameters etc.) rely (in/)directly on improvements to low-degree tests. Connected to Gowers Norms: [Viola-Wigderson 07]: [AKKLR] Hardness Amplification Yield Locally Testable Codes Best in high-rate regime. [BarakGopalanHåstadMekaRaghavendraSteurer 12]: [BKSSZ 11] Small-set expanders. April 8, 2016 Two Decades of Property Testing 17 of 29

Some (introspective) questions What is qualitatively novel about linearity testing relative to classical statistics? Why are the mathematical underpinnings of different themes so different? Why is there no analog of graph property testing (broad class of properties, totally classified wrt testability) in algebraic world? What is the context for low-degree testing? Answer to all: Invariance! April 8, 2016 Two Decades of Property Testing 18 of 29

Invariance? Property PP ff: DD RR Property PP invariant under 1-1 ππ: DD DD, if ff PP ff ππ PP Property PP invariant under group GG if ππ GG PP is invariant under ππ. (Maximal) GG is invariance class of PP. Main Observation: Different property tests unified/separated by invariance class. April 8, 2016 Two Decades of Property Testing 19 of 29

Invariances (contd.) Some examples: Classical statistics: Invariant under all permutations. Graph properties: Invariant under vertex renaming. Boolean properties: Invariant under variable renaming. Matrix properties: Invariant under mult. by invertible matrix. Algebraic Properties =? Answers to (introspective) questions. Classical statistics only dealt with SS DD Different invariances different techniques. Context for algebra? April 8, 2016 Two Decades of Property Testing 20 of 29

What is Property Testing? Algebra=? SS VVVVVVVVVVVVVVVV SS vvvvvvvvvvvvvvvvvv? SS DD April 8, 2016 Two Decades of Property Testing 21 of 29

Abstracting algebraic properties [Kaufman+S. 08] Affine Invariance: Domain = Big field (FF qq nn) or vector space over small field (FF qq nn ). Property invariant under affine transformations of domain (x A.x + b) Linearity of Properties: Range = small field (FF qq ) Property = vector space over range. April 8, 2016 Two Decades of Property Testing 22 of 29

Testing Linear Properties R is a field F; P is linear! Universe: {f:d R} P Must accept P Don t care Must reject Algebraic Property = Code! (usually) April 8, 2016 Two Decades of Property Testing 23 of 29

Why study affine-invariance? Common abstraction of properties studied in [BLR], [RS], [ALMSS], [AKKLR], [KR], [KL], [JPRZ]. (Variations on low-degree polynomials) Hopes Unify existing proofs Classify/characterize testability Find new testable codes (w. novel parameters) Rest of the talk: Brief summary of findings April 8, 2016 Two Decades of Property Testing 24 of 29

Results 1: AKKLR Conjecture PP kk-locally testable PP satisfies kk-local constraint AKKLR Conjecture: kk-local constraint + symmetry (2-transitive invariance) PP kk -locally testable. Theorem [Kaufman+S. 08]: PP {ff: FF QQ nn FF qq } has kk-local constraint kk kk, QQ -locally testable. Notion of single-orbit Unification! Structure of affine-invariant properties. Theorem [Grigorescu,Kaufman,S.08]: PP FF 2 nn FF 2 with 8-local constraint that is not log log log nn-ldpc. Thm[BMSS 11]: OO(1)-LDPC that is not OO 1 -LTC. April 8, 2016 Two Decades of Property Testing 25 of 29

Results 2: Accidental +ve [Bhattacharyya,Kopparty,Schoenebeck,S.,Zuckerman 10]: Goal: Test low-degree polynomials over FF 2. Hope: Use known better tests from the 90s. Result: New technique + stronger result: [AKKLR] natural test rejects Ω(1)-far f ns w.p. Ω 2 dd. Ours: same test rejects Ω(2 dd )-far w.p. Ω(1). [Ron-Zewi,S 12]: Better query complexity for low-degree testing, when dd > qq 2 ; qq = 2tt. When dd < qq/2; qq-queries suffice. When qq 2 < dd < qq; known tests made qq2 -queries. Our result: OO(qq)-queries suffice. Techniques: single-orbit, structure of affine-invariance Non-linear affine-invariant properties April 8, 2016 Two Decades of Property Testing 26 of 29

Results 3: Lifting An annoying way to construct locally constrained properties: Define base property BB {ff: FF qq FF qq }. nn-lifted property = ff: FF qq nn FF qq ff llllllll BB llllllll Annoying: Violated simple, natural conjecture on characterization of OO(1)-query testability. Bad News + Bad News = Good News! They are single-orbit ; so testable. But on the positive side gave negative result after complicated usage and analysis. [Friedl,S 95]:If qq 2 < dd < qq, ff: FF qq nn FF qq ; deg(ff) > dd; such that on every llllllll, deg ff llllllll dd. (Reason for accidental result 2 on last slide.) April 8, 2016 Two Decades of Property Testing 27 of 29

Result 3: Lifting (contd.) [Guo,Kopparty,S. 13] Take any base property and lift it: Inherits rel. distance of base property. Testable by [Kaufman+S. 08]. Rate =?; Needs adhoc analysis. Base property = deg. dd poly with qq 2 Code has much higher rate < dd < qq: Rate 1 for constant dimensional lifts, as dd qq 1. Gives only known codes of rate 1 that are simultaneously sub-linearly locally testable and decodable. April 8, 2016 Two Decades of Property Testing 28 of 29

Conclusions Returning to bigger picture: Invariance explains the diversity in property testing. Different invariance classes different techniques. Same invariance class same techniques? Need to investigate: Properties of real-valued functions! Properties invariant (only) under variable renaming (a la junta-testing). Invariances of inference problems? Queries vs. samples? April 8, 2016 Two Decades of Property Testing 29 of 29

Thank You April 8, 2016 Two Decades of Property Testing 30 of 29