Learning and Fourier Analysis


Learning and Fourier Analysis
Grigory Yaroslavtsev, http://grigory.us
CIS 625: Computational Learning Theory

Fourier Analysis and Learning
Powerful tool for PAC-style learning under the uniform distribution over {0,1}^n
Sometimes requires queries of the form f(x)
Works for learning many classes of functions, e.g.:
- Monotone functions, DNFs, decision trees, low-degree polynomials
- Small circuits, halfspaces, k-linear functions, juntas (functions that depend on a small number of variables)
- Submodular functions (a discrete analog of convex/concave)
Can be extended to product distributions over {0,1}^n, i.e. D = D_1 × D_2 × ⋯ × D_n, where × means that the draws are independent

Boolean Functions
Book: Analysis of Boolean Functions, Ryan O'Donnell, http://analysisofbooleanfunctions.org
f(x_1, …, x_n): {0,1}^n → ℝ
Notation switch: 0 → +1, 1 → −1, so f: {−1,1}^n → ℝ
Example: f(x_1, …, x_n) = x_2 ⊕ x_3 ⊕ ⋯ ⊕ x_n = Lin(x_2, x_3, …, x_n) becomes
f(x_1, …, x_n) = x_2 · x_3 ⋯ x_n

Fourier Expansion
Example: min(x_1, x_2) = −1/2 + (1/2)x_1 + (1/2)x_2 + (1/2)x_1·x_2
For S ⊆ [n] let the character χ_S(x) = ∏_{i∈S} x_i
Thm: Every function f: {−1,1}^n → ℝ can be uniquely represented as a multilinear polynomial
f(x_1, …, x_n) = ∑_{S⊆[n]} f̂(S)·χ_S(x)
where f̂(S) is the Fourier coefficient of f on S
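
The expansion above is easy to verify by brute force over all 2^n inputs; a minimal Python sketch (the helper names chi and fourier_coefficient are mine, not from the lecture):

```python
from itertools import product

def chi(S, x):
    # Character chi_S(x) = product of x_i over i in S (empty product = 1)
    out = 1
    for i in S:
        out *= x[i]
    return out

def fourier_coefficient(f, S, n):
    # f_hat(S) = E_x[f(x) * chi_S(x)], expectation over uniform x in {-1,1}^n
    pts = list(product([-1, 1], repeat=n))
    return sum(f(x) * chi(S, x) for x in pts) / len(pts)

f_min = lambda x: min(x[0], x[1])
coeffs = {S: fourier_coefficient(f_min, S, 2) for S in [(), (0,), (1,), (0, 1)]}
```

On min(x_1, x_2) this recovers exactly the coefficients −1/2, 1/2, 1/2, 1/2 from the example above.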

Functions = Vectors, Inner Product
Functions as vectors form a vector space: f: {−1,1}^n → ℝ ⟺ f ∈ ℝ^(2^n)
Inner product on functions = correlation:
⟨f, g⟩ = 2^(−n) ∑_{x∈{−1,1}^n} f(x)·g(x) = E_{x∈{−1,1}^n}[f(x)·g(x)]
For f, g: {−1,1}^n → {−1,1}:
⟨f, g⟩ = Pr[f(x) = g(x)] − Pr[f(x) ≠ g(x)] = 1 − 2·dist(f, g)
where dist(f, g) = Pr[f(x) ≠ g(x)]

Orthonormal Basis: Proof
Thm: The characters form an orthonormal basis of ℝ^(2^n):
⟨χ_S, χ_T⟩ = 1 if S = T, and ⟨χ_S, χ_T⟩ = 0 if S ≠ T
χ_S(x)·χ_T(x) = ∏_{i∈S} x_i · ∏_{j∈T} x_j = ∏_{i∈SΔT} x_i · ∏_{j∈S∩T} x_j² = ∏_{i∈SΔT} x_i = χ_{SΔT}(x)
E_x[χ_S(x)·χ_T(x)] = E_x[χ_{SΔT}(x)] = E_x[∏_{i∈SΔT} x_i] = ∏_{i∈SΔT} E_x[x_i]
Since E_x[x_i] = 0 we have:
E_x[χ_S(x)·χ_T(x)] = 0 if SΔT ≠ ∅
E_x[χ_S(x)·χ_T(x)] = 1 if SΔT = ∅ (i.e. S = T)
This proves that the χ_S form an orthonormal basis
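
The orthonormality claim can be checked numerically for small n; a quick sketch (the function names are mine):

```python
from itertools import product

def chi(S, x):
    # Character chi_S(x) = product of x_i over i in S
    out = 1
    for i in S:
        out *= x[i]
    return out

def char_inner(S, T, n):
    # <chi_S, chi_T> = E_x[chi_S(x) * chi_T(x)] over uniform x in {-1,1}^n
    pts = list(product([-1, 1], repeat=n))
    return sum(chi(S, x) * chi(T, x) for x in pts) / len(pts)
```

char_inner returns 1 exactly when S = T and 0 otherwise, matching the theorem.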

Fourier Coefficients
Recall linearity of the inner product:
⟨f + g, h⟩ = ⟨f, h⟩ + ⟨g, h⟩ and ⟨αf, g⟩ = α·⟨f, g⟩
Lemma: f̂(S) = ⟨f, χ_S⟩
Proof: ⟨f, χ_S⟩ = ⟨∑_{T⊆[n]} f̂(T)·χ_T, χ_S⟩ = ∑_{T⊆[n]} ⟨f̂(T)·χ_T, χ_S⟩ = ∑_{T⊆[n]} f̂(T)·⟨χ_T, χ_S⟩ = f̂(S)

Parseval's Theorem
Parseval's Thm: For any f: {−1,1}^n → ℝ:
⟨f, f⟩ = E_{x∈{−1,1}^n}[f²(x)] = ∑_{S⊆[n]} f̂(S)²
If f: {−1,1}^n → {−1,1} then ∑_{S⊆[n]} f̂(S)² = 1
Example: f = min(x_1, x_2) = −1/2 + (1/2)x_1 + (1/2)x_2 + (1/2)x_1·x_2
∑_{S⊆[2]} f̂(S)² = 4 · 1/4 = 1
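
Parseval's identity can also be verified numerically on the min example; a small sketch (names mine):

```python
from itertools import combinations, product

def fourier_coefficient(f, S, n):
    # f_hat(S) = E_x[f(x) * chi_S(x)] over uniform x in {-1,1}^n
    pts = list(product([-1, 1], repeat=n))
    total = 0.0
    for x in pts:
        cs = 1
        for i in S:
            cs *= x[i]
        total += f(x) * cs
    return total / len(pts)

n = 2
f_min = lambda x: min(x[0], x[1])
# Left side: E[f^2]; right side: total squared Fourier weight
mean_square = sum(f_min(x) ** 2 for x in product([-1, 1], repeat=n)) / 2 ** n
weight = sum(fourier_coefficient(f_min, S, n) ** 2
             for r in range(n + 1) for S in combinations(range(n), r))
```

Both sides equal 1, as they must for any function with values in {−1,1}.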

Plancherel's Theorem
Plancherel's Thm: For any f, g: {−1,1}^n → ℝ:
⟨f, g⟩ = E_{x∈{−1,1}^n}[f(x)·g(x)] = ∑_{S⊆[n]} f̂(S)·ĝ(S)
Proof: ⟨f, g⟩ = ⟨∑_{S⊆[n]} f̂(S)·χ_S, ∑_{T⊆[n]} ĝ(T)·χ_T⟩ = ∑_{S,T⊆[n]} f̂(S)·ĝ(T)·⟨χ_S, χ_T⟩ = ∑_{S⊆[n]} f̂(S)·ĝ(S)

Basic Fourier Analysis
Mean: E_x[f(x)] = ⟨f, 1⟩ = f̂(∅)
For f: {−1,1}^n → {−1,1} we have E_x[f(x)] = Pr[f(x) = 1] − Pr[f(x) = −1]
Variance: Var[f] = E_x[(f(x) − E_x[f(x)])²] = E_x[f²(x)] − E_x[f(x)]²
= ∑_{S⊆[n]} f̂(S)² − f̂(∅)²  (Parseval's Thm)
= ∑_{S≠∅} f̂(S)²
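
The variance formula can be checked on min(x_1, x_2), where Var[f] = 1 − (−1/2)² = 3/4; a sketch (names mine):

```python
from itertools import combinations, product

def fourier_coefficient(f, S, n):
    # f_hat(S) = E_x[f(x) * chi_S(x)] over uniform x in {-1,1}^n
    pts = list(product([-1, 1], repeat=n))
    total = 0.0
    for x in pts:
        cs = 1
        for i in S:
            cs *= x[i]
        total += f(x) * cs
    return total / len(pts)

n = 2
f_min = lambda x: min(x[0], x[1])
pts = list(product([-1, 1], repeat=n))
mean = sum(f_min(x) for x in pts) / len(pts)                 # f_hat(empty set)
var = sum(f_min(x) ** 2 for x in pts) / len(pts) - mean ** 2
# Sum of f_hat(S)^2 over nonempty S only
fourier_var = sum(fourier_coefficient(f_min, S, n) ** 2
                  for r in range(1, n + 1) for S in combinations(range(n), r))
```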

Convolution
Def.: For x, y ∈ {−1,1}^n define x∘y ∈ {−1,1}^n by (x∘y)_i = x_i·y_i
Def.: For f, g: {−1,1}^n → ℝ their convolution f∗g: {−1,1}^n → ℝ is
(f∗g)(x) = E_{y∈{−1,1}^n}[f(y)·g(x∘y)]
Properties:
1. f∗g = g∗f
2. (f∗g)∗h = f∗(g∗h)
3. For all S ⊆ [n]: (f∗g)^(S) = f̂(S)·ĝ(S)

Convolution: Proof of Property 3
Property 3: For all S ⊆ [n]: (f∗g)^(S) = f̂(S)·ĝ(S)
Proof:
(f∗g)^(S) = E_x[(f∗g)(x)·χ_S(x)]
= E_x[E_y[f(y)·g(x∘y)]·χ_S(x)]  (def. of convolution)
= E_y[f(y)·E_x[g(x∘y)·χ_S(x)]]
= E_y[f(y)·E_x[g(x∘y)·χ_S(x∘y)·χ_S(y)]]  (since χ_S(x) = χ_S(x∘y)·χ_S(y))
= E_y[f(y)·χ_S(y)·E_x[g(x∘y)·χ_S(x∘y)]]
= E_y[f(y)·χ_S(y)·ĝ(S)]  (x∘y is uniform for fixed y)
= f̂(S)·ĝ(S)
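
Property 3 can be checked exhaustively for small n; a brute-force sketch (the helpers convolve and fourier_coefficient are mine):

```python
from itertools import combinations, product

def fourier_coefficient(f, S, n):
    # f_hat(S) = E_x[f(x) * chi_S(x)] over uniform x in {-1,1}^n
    pts = list(product([-1, 1], repeat=n))
    total = 0.0
    for x in pts:
        cs = 1
        for i in S:
            cs *= x[i]
        total += f(x) * cs
    return total / len(pts)

def convolve(f, g, n):
    # (f*g)(x) = E_y[f(y) * g(x o y)], where (x o y)_i = x_i * y_i
    pts = list(product([-1, 1], repeat=n))
    def fg(x):
        return sum(f(y) * g(tuple(a * b for a, b in zip(x, y)))
                   for y in pts) / len(pts)
    return fg

n = 3
f = lambda x: min(x[0], x[1])
g = lambda x: x[0] * x[2]
fg = convolve(f, g, n)
# Compare (f*g)^(S) with f_hat(S) * g_hat(S) on every subset S
theorem_holds = all(
    abs(fourier_coefficient(fg, S, n)
        - fourier_coefficient(f, S, n) * fourier_coefficient(g, S, n)) < 1e-9
    for r in range(n + 1) for S in combinations(range(n), r))
```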

Approximate Linearity
Def: f: {0,1}^n → {0,1} is linear if
f(x⊕y) = f(x)⊕f(y) for all x, y ∈ {0,1}^n
⟺ ∃S ⊆ [n] such that f(x) = ⊕_{i∈S} x_i for all x
Def: f: {0,1}^n → {0,1} is approximately linear if
1. f(x⊕y) = f(x)⊕f(y) for most x, y ∈ {0,1}^n
2. ∃S ⊆ [n] such that f(x) = ⊕_{i∈S} x_i for most x
Q: Does 1. imply 2.?

Property Testing [Goldreich, Goldwasser, Ron; Rubinfeld, Sudan]
Randomized Algorithm: YES → Accept with probability ≥ 2/3; NO → Reject with probability ≥ 2/3
Property Tester: YES → Accept with probability ≥ 2/3; ε-close → Don't care; NO (ε-far) → Reject with probability ≥ 2/3
ε-close: an ε fraction of the values has to be changed to become YES

Linearity Testing
f: {0,1}^n → {0,1}; P = class of linear functions
dist(f, P) = min_{g∈P} dist(f, g)
ε-close: dist(f, P) ≤ ε
Linearity Tester: Linear → Accept with probability 1; ε-close → Don't care; Nonlinear (ε-far) → Reject with probability ≥ 2/3

Linearity Testing [Blum, Luby, Rubinfeld]
BLR Test (given query access to f: {0,1}^n → {0,1}):
- Choose x ∈ {0,1}^n and y ∈ {0,1}^n independently and uniformly at random
- Accept if f(x)⊕f(y) = f(x⊕y)
Thm: If the BLR Test accepts with prob. ≥ 1 − ε then f is ε-close to being linear
Proof: Apply the notation switch 0 → +1, 1 → −1
BLR accepts iff f(x)·f(y) = f(x∘y), and
1/2 + (1/2)·f(x)·f(y)·f(x∘y) = 1 if f(x)·f(y) = f(x∘y)
1/2 + (1/2)·f(x)·f(y)·f(x∘y) = 0 if f(x)·f(y) ≠ f(x∘y)
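
The test itself is a few lines of code; the sketch below (function names mine) encodes {0,1}^n as integer bitmasks, so x ⊕ y is just x ^ y:

```python
import random

def parity(mask, x):
    # The linear function chi_S, with S encoded as a bitmask: XOR of selected bits
    return bin(x & mask).count("1") % 2

def blr_accept_rate(f, n, trials=300, seed=0):
    # BLR test repeated `trials` times: pick x, y uniformly,
    # check f(x) XOR f(y) == f(x XOR y); return fraction of accepting trials
    rng = random.Random(seed)
    accepted = 0
    for _ in range(trials):
        x, y = rng.getrandbits(n), rng.getrandbits(n)
        if f(x) ^ f(y) == f(x ^ y):
            accepted += 1
    return accepted / trials

n = 10
linear_f = lambda x: parity(0b1011, x)
majority = lambda x: 1 if bin(x).count("1") > n // 2 else 0
```

linear_f passes every trial, while majority, which is far from linear, gets rejected on a constant fraction of trials.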

Linearity Testing: Analysis
1 − ε ≤ Pr[BLR accepts f]
= E_{x,y}[1/2 + (1/2)·f(x)·f(y)·f(x∘y)]
= 1/2 + (1/2)·E_{x,y}[f(x)·f(y)·f(x∘y)]
= 1/2 + (1/2)·E_x[f(x)·E_y[f(y)·f(x∘y)]]
= 1/2 + (1/2)·E_x[f(x)·(f∗f)(x)]  (def. of convolution)
= 1/2 + (1/2)·∑_{S⊆[n]} f̂(S)·(f∗f)^(S)  (Plancherel's thm)
= 1/2 + (1/2)·∑_{S⊆[n]} f̂(S)³

Linearity Testing: Analysis Continued
1 − 2ε ≤ ∑_{S⊆[n]} f̂(S)³ ≤ max_{S⊆[n]} f̂(S) · ∑_{S⊆[n]} f̂(S)² = max_{S⊆[n]} f̂(S)
Recall f̂(S) = ⟨f, χ_S⟩ = 1 − 2·dist(f, χ_S)
Let S* = argmax_{S⊆[n]} f̂(S):
1 − 2ε ≤ 1 − 2·dist(f, χ_{S*}) ⟹ dist(f, χ_{S*}) ≤ ε
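
The identity f̂(S) = 1 − 2·dist(f, χ_S) used in the last step is easy to confirm numerically, e.g. for a character corrupted at a single input (sketch, names mine):

```python
from itertools import product

def chi(S, x):
    # Character chi_S(x) = product of x_i over i in S
    out = 1
    for i in S:
        out *= x[i]
    return out

n = 4
pts = list(product([-1, 1], repeat=n))
bad = pts[0]  # flip the character's value at exactly one of the 16 inputs
f = lambda x: -chi((0,), x) if x == bad else chi((0,), x)
dist = sum(1 for x in pts if f(x) != chi((0,), x)) / len(pts)
coeff = sum(f(x) * chi((0,), x) for x in pts) / len(pts)
```

Here dist(f, χ_S) = 1/16 and the Fourier coefficient comes out to 1 − 2/16 = 7/8.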

Local Correction
Learning a linear function takes n queries
Lem: If f is ε-close to a linear function χ_S then for every x one can compute χ_S(x) w.p. ≥ 1 − 2ε as:
- Pick y ∈ {0,1}^n uniformly at random
- Output f(y)⊕f(x⊕y)
Proof: Pr[f(y) ≠ χ_S(y)] ≤ ε and Pr[f(x⊕y) ≠ χ_S(x⊕y)] ≤ ε (both y and x⊕y are uniform)
By the union bound: Pr[f(y) = χ_S(y) and f(x⊕y) = χ_S(x⊕y)] ≥ 1 − 2ε
Then f(y)⊕f(x⊕y) = χ_S(y)⊕χ_S(x⊕y) = χ_S(x)
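
The lemma translates into a short self-corrector; a sketch (names mine) over bitmask inputs, boosting the success probability by a majority vote over independent choices of y:

```python
import random

def parity(mask, x):
    # The linear function chi_S with S encoded as a bitmask
    return bin(x & mask).count("1") % 2

def corrupted(x):
    # chi_S for mask 0b101, flipped on the single input x = 3
    return parity(0b101, x) ^ (1 if x == 3 else 0)

def locally_correct(f, n, x, trials=25, seed=0):
    # Each trial picks a random y and votes f(y) XOR f(x XOR y);
    # return the majority vote over all trials
    rng = random.Random(seed)
    votes = 0
    for _ in range(trials):
        y = rng.getrandbits(n)
        votes += f(y) ^ f(x ^ y)
    return 1 if votes > trials // 2 else 0
```

Querying corrupted at x = 3 directly returns the wrong value, but the self-corrector recovers χ_S(3) = 1 with overwhelming probability, since each vote is wrong only when y or x⊕y hits the corrupted input.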

Thanks!