Trusting the Cloud with Practical Interactive Proofs


Graham Cormode (G.Cormode@warwick.ac.uk)
Amit Chakrabarti (Dartmouth)
Andrew McGregor (U Mass Amherst)
Justin Thaler (Harvard/Yahoo!)
Suresh Venkatasubramanian (Utah)

There are no guarantees in life
- From the terms of service of a certain cloud computing service...
- Can we obtain guarantees of correctness of the computation?
  - Without repeating the computation?
  - Without storing all the input?

Interactive Proofs
[Figure: a dialogue between verifier and prover: "What's the answer?" "42" "Prove it!", followed by an exchange of bit strings and a final "OK!"]

(Streaming) Interactive Proofs
- Two-party model: outsource to a more powerful prover
- Fundamental problem: how to be sure that the prover is honest?
- Prover provides proof of the correct answer
- Ensure that the verifier has very low probability of being fooled
- Measure resources of the participants, rounds of interaction
- Related to communication complexity Arthur-Merlin model, and Algebrization, with additional streaming constraints
[Figure: data stream, verifier V, prover P, and the proof passed between them]

Starter Problem: Index
[Figure: example bit string 0 1 1 1 0 1 0 1 1 0 0 0 0 followed by a query index]
- Fundamental (hard) problem in data streams
- Input is a length-m binary string x followed by an index y
- Desired output is x[y]
- Requires $\Omega(m)$ space, even when some error probability is allowed
- Can we find a protocol to allow recovery of arbitrary bits
  - Without having the verifier store the entire sequence?

Real problem: Nearest neighbor

Parameters
- m data points (m very large)
- Verifier V processes data using small space << m
- Prover P processes data using space at least m
- V and P have a conversation to determine the answer
  - If P is honest, 0.99 probability that V accepts the answer
  - If P is dishonest, 0.99 probability that V rejects the answer
- Measure the space used by V, P, communication used by both
[Figure: the data stream and the proof, with V using space v, P using space p, and communication h between them]

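Index: 1 Round Upper Bound
[Figure: bit string 0 1 1 1 0 1 0 1 1 0 0 0 0 split into blocks, with hash 1, hash 2, hash 3 kept per block; the prover replays the block 0 1 0 1]
- Divide the bit string into blocks of H bits
- Verifier remembers a hash on each block
- After seeing the index, Prover replays its block
- Verifier checks that the hash agrees, and outputs x[y]
- Cost: H bits of proof from the prover, V = m/H hashes of O(log m) bits each for the verifier
- So HV = O(m log m) when both are measured in bits; any point on the tradeoff is possible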
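The one-round protocol above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the slides do not say which hash is used, so a polynomial fingerprint over the prime field of size 2^61 - 1 is swapped in, and the names `Verifier`, `fingerprint` and `run_index_protocol` are hypothetical. The sketch also assumes m is a multiple of H.

```python
import random

P = (1 << 61) - 1  # prime modulus for the fingerprint (an assumption of this sketch)


def fingerprint(bits, a):
    """Polynomial fingerprint of a block: sum(bits[i] * a**i) mod P."""
    acc, power = 0, 1
    for bit in bits:
        acc = (acc + bit * power) % P
        power = (power * a) % P
    return acc


class Verifier:
    """Keeps one fingerprint per block of H bits while reading the stream."""

    def __init__(self, block_size, seed=None):
        self.H = block_size
        self.a = random.Random(seed).randrange(1, P)  # secret evaluation point
        self.hashes = []
        self._acc, self._power, self._count = 0, 1, 0

    def observe(self, bit):
        # Update the fingerprint of the current block incrementally.
        self._acc = (self._acc + bit * self._power) % P
        self._power = (self._power * self.a) % P
        self._count += 1
        if self._count == self.H:
            self.hashes.append(self._acc)
            self._acc, self._power, self._count = 0, 1, 0

    def check(self, y, claimed_block):
        """Return x[y] if the replayed block matches its hash, else None (reject)."""
        block_id, offset = divmod(y, self.H)
        if len(claimed_block) != self.H:
            return None
        if fingerprint(claimed_block, self.a) != self.hashes[block_id]:
            return None
        return claimed_block[offset]


def run_index_protocol(x, y, block_size, cheat=False, seed=0):
    verifier = Verifier(block_size, seed)
    for bit in x:
        verifier.observe(bit)
    # Prover: replay the block containing position y (flipping x[y] if cheating).
    start = (y // block_size) * block_size
    block = list(x[start:start + block_size])
    if cheat:
        block[y - start] ^= 1
    return verifier.check(y, block)
```

With H bits of proof and m/H stored fingerprints, choosing H moves the cost between prover communication and verifier space, which is the tradeoff the slide describes.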

2 Round Index Protocol
[Figure: challenge line $\ell$ passing through the random point $r \in \mathbb{F}^b$ and the query point $y$]
- Data indexed in Boolean hypercube $\{0,1\}^b$
- Extended to hypercube $\mathbb{F}^b$
1. V picks r and evaluates the low-degree extension of the input at r to get q
2. V sends $\ell$, the line through r and y, to P
3. P sends polynomial p, which is the low-degree extension of the input restricted to $\ell$
4. V checks that p(r) = q, and outputs p(y)
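The four steps can be traced in a small Python sketch. It rests on several assumptions that are not in the slides: the field is the integers modulo the prime 2^61 - 1, the low-degree extension is the multilinear one, the line is parametrised so that l(0) = y and l(t_star) = r for a secret nonzero t_star, and the prover sends p as its values at t = 0, ..., b. For brevity the verifier here computes q directly from x; in the streaming setting it would maintain q incrementally. All function names are hypothetical.

```python
import random

P = (1 << 61) - 1  # prime field size (an assumption of this sketch)


def bits_of(index, b):
    return [(index >> i) & 1 for i in range(b)]


def chi(y_bits, point):
    """prod_i ((1 - y_i)(1 - r_i) + y_i r_i) mod P."""
    out = 1
    for y_i, r_i in zip(y_bits, point):
        out = out * ((1 - y_i) * (1 - r_i) + y_i * r_i) % P
    return out


def lde(x, point, b):
    """Multilinear extension of the 0/1 vector x, evaluated at a point of F^b."""
    return sum(chi(bits_of(j, b), point) for j in range(len(x)) if x[j]) % P


def interpolate(values, t):
    """Evaluate at t the polynomial of degree < len(values) through (k, values[k])."""
    total = 0
    for k, v in enumerate(values):
        num, den = 1, 1
        for j in range(len(values)):
            if j != k:
                num = num * (t - j) % P
                den = den * (k - j) % P
        total = (total + v * num * pow(den, -1, P)) % P
    return total


def two_round_index(x, y, b, cheat=False, rng=random):
    # Step 1: V picks a secret random r and computes q = LDE(x)(r).
    r = [rng.randrange(P) for _ in range(b)]
    q = lde(x, r, b)
    # Step 2: V sends the line l(t) = y + t*d, chosen so that l(t_star) = r.
    y_bits = bits_of(y, b)
    t_star = rng.randrange(1, P)
    inv = pow(t_star, -1, P)
    d = [(r_i - y_i) * inv % P for r_i, y_i in zip(r, y_bits)]

    def line(t):
        return [(y_i + t * d_i) % P for y_i, d_i in zip(y_bits, d)]

    # Step 3: P sends p(t) = LDE(x)(l(t)) as b + 1 evaluations (degree at most b).
    source = list(x)
    if cheat:
        source[y] ^= 1  # a dishonest prover pretends x[y] is flipped
    p_values = [lde(source, line(t), b) for t in range(b + 1)]
    # Step 4: V checks p at the secret parameter and outputs p(0) = value at y.
    if interpolate(p_values, t_star) != q:
        return None  # reject
    return p_values[0]
```

Hiding the parameter t_star is one way to keep the prover from learning r; a cheating polynomial of degree at most b can agree with the true one on at most b parameters, which is where the b/|F| style bound comes from.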

Streaming LDE Computation
- Given query point $r \in \mathbb{F}^b$, evaluate the extension of the input at r
- Initialize: z = 0
- Update with the impact of each data point $y = (y_1, \ldots, y_b)$ in turn
- Structure of the polynomial means each update causes $z \leftarrow z + \prod_{i=1}^{b} \big((1-y_i)(1-r_i) + y_i r_i\big)$
- Lagrange polynomial, can be evaluated in small space
- Can be computed quickly, using appropriate precomputed look-up tables
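The update rule can be written directly as a streaming loop. The sketch below assumes arithmetic modulo the prime 2^61 - 1 and that each stream item is an index y in [0, 2^b) whose bits are (y_1, ..., y_b); the function names are hypothetical, and the look-up-table speed-up mentioned on the slide is not included.

```python
P = (1 << 61) - 1  # prime field size (an assumption of this sketch)


def index_bits(y, b):
    """Bits (y_1, ..., y_b) of the index y, least significant first."""
    return [(y >> i) & 1 for i in range(b)]


def chi(y_bits, r):
    """prod_i ((1 - y_i)(1 - r_i) + y_i r_i) mod P."""
    out = 1
    for y_i, r_i in zip(y_bits, r):
        out = out * ((1 - y_i) * (1 - r_i) + y_i * r_i) % P
    return out


def stream_lde(stream, r, b):
    """Evaluate the extension of the stream's frequency vector at r.

    Only z and r are stored: b + 1 field elements, independent of stream length.
    """
    z = 0
    for y in stream:
        z = (z + chi(index_bits(y, b), r)) % P
    return z
```

When r is itself a Boolean point, the product is 1 exactly for the matching index and 0 otherwise, so `stream_lde` returns the number of times that index occurred in the stream.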

Correctness and Cost
- Correctness of the protocol
  - If P is honest: V will always accept
  - If P is dishonest: V only accepts if p(r) = q
  - This happens with probability at most $b/|\mathbb{F}|$: can make $\mathbb{F}$ bigger
- Costs of the protocol
  - V's space: $O(b \log |\mathbb{F}|) = O(\log n \log\log n)$ bits
  - P and V exchange $\ell$ and p as (b + 1) values in $\mathbb{F}$, so communication cost is $O(\log n \log\log n)$ bits
  - Exponential improvement over one round
- Consequences: can do other computations via Index, e.g. median
- What about more complex functions?
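A worked instance may make these costs concrete. The numbers are illustrative assumptions, not taken from the slides: an input of $n = 2^{20}$ positions (so $b = 20$) and the prime field of size $|\mathbb{F}| = 2^{61} - 1$, counting $p$ as $b + 1$ field elements as the slide does.

```latex
\begin{align*}
\Pr[\text{V is fooled}] &\le \frac{b}{|\mathbb{F}|} = \frac{20}{2^{61}-1} \approx 8.7 \times 10^{-18},\\
\text{size of } p &= (b+1)\,\lceil \log_2 |\mathbb{F}| \rceil = 21 \times 61 = 1281 \text{ bits}.
\end{align*}
```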

Nearest Neighbour Search
- Basic idea: convert NNS into an (enormous) index problem
- Work with input points in $[n]^d$
- Assume all distances are multiples of $\varepsilon = 1/n^d$
- Let B = {all distinct balls}; note $|B| \le n^{2d}$
- Convert input points to a virtual set of balls from B: point $x \to$ all balls $\beta$ such that $x \in \beta$
- V processes the virtual stream through the index protocol
- For query y, P specifies a point $z \in X$, claiming z = NN(y, X)
  - Show ball(z, 0) is present via the Index Protocol, so z really is an input point
  - And show ball(y, dist(y, z) - $\varepsilon$) is absent via the Index Protocol, so no input point is closer to y than z
- Protocol allows correct demonstration of nearest neighbour
- Drawback: blow-up of input size costs V a lot!
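The reduction can be illustrated on a toy instance. The sketch below makes simplifying assumptions that are not in the slides: points live on a small integer grid under the L1 metric, so every distance is an integer and the granularity is 1 rather than 1/n^d, and the vector v is materialised in a `Counter` so the two lookups are direct. In the protocol itself V would never store v; each lookup would be answered through the Index protocol. All names are hypothetical.

```python
from collections import Counter
from itertools import product


def l1(a, b):
    return sum(abs(u - v) for u, v in zip(a, b))


def virtual_stream(points, n, d):
    """Expand each input point into every ball (centre, radius) that contains it."""
    max_radius = d * (n - 1)
    for x in points:
        for centre in product(range(n), repeat=d):
            for radius in range(l1(x, centre), max_radius + 1):
                yield (centre, radius)


def verify_nn_claim(v, y, z):
    """Check the prover's claim z = NN(y, X) with two lookups into v."""
    if v[(z, 0)] == 0:
        return False          # ball(z, 0) is empty, so z is not an input point
    dist = l1(y, z)
    if dist == 0:
        return True
    return v[(y, dist - 1)] == 0  # no input point strictly closer to y than z


points = [(0, 0), (3, 1), (2, 3)]
v = Counter(virtual_stream(points, n=4, d=2))
print(verify_nn_claim(v, (3, 3), (2, 3)))  # honest claim
print(verify_nn_claim(v, (3, 3), (3, 1)))  # a farther input point
```

Even on this 4 x 4 grid each input point expands into dozens of balls, which shows the input blow-up that the slide names as the drawback.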

Practical Proof Protocol
- Exploit structure of the metric space containing the points
- Let $\varphi(\beta, x)$ be the function that reports 1 iff x is in ball $\beta$
- Goal: query the vector $v[\beta] = \sum_{x \in \text{input}} \varphi(\beta, x)$
- $\varphi(\beta, x)$ has a simple circuit for common metrics (Hamming, $L_1$, $L_2$)
  - Arithmetize the formula to compute distances
  - Transform formula to polynomial via $G_1 \wedge G_2 \to G_1 G_2$ and $G_1 \vee G_2 \to 1 - (1 - G_1)(1 - G_2)$
- Low-degree extension of v: $\tilde{v}(B_1, \ldots, B_{2d \log n}) = \sum_x \tilde{\varphi}(B_1, \ldots, B_{2d \log n}, x)$
- Can then apply the Index protocol to $\tilde{v}$
  - v is never materialized by P or V
- Final costs of the protocol:
  - Verifier can process each data point in time poly(d, log n)
  - Communication cost and verifier space both poly(d, log m, log n) bits
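The gate-by-gate arithmetization can be checked on a small formula. The sketch below is an illustration under assumptions: arithmetic is modulo the prime 2^61 - 1, negation is arithmetized as 1 - G (standard, though not listed on the slide), and the example formula is bitwise equality of two strings, which corresponds to membership in a radius-0 Hamming ball. The function names are hypothetical.

```python
from itertools import product

P = (1 << 61) - 1  # prime field size (an assumption of this sketch)


def AND(g1, g2):
    return g1 * g2 % P


def OR(g1, g2):
    return (1 - (1 - g1) * (1 - g2)) % P


def NOT(g):
    return (1 - g) % P


def in_radius0_ball(centre, x):
    """Arithmetized formula: AND_i ((c_i AND x_i) OR (NOT c_i AND NOT x_i))."""
    out = 1
    for c_i, x_i in zip(centre, x):
        same = OR(AND(c_i, x_i), AND(NOT(c_i), NOT(x_i)))
        out = AND(out, same)
    return out


# On Boolean inputs the polynomial agrees with the original formula.
for c in product((0, 1), repeat=3):
    for x in product((0, 1), repeat=3):
        assert in_radius0_ball(c, x) == int(c == x)
```

Because the result is an ordinary polynomial, it can also be evaluated at non-Boolean field points, which is what applying the Index protocol to the extension requires.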

Concluding Remarks
- These protocols are truly practical
  - No, really, they are
- Also provide insight into the theory of Arthur-Merlin communication games
- Many open problems around this area
  - Extend to other data mining/machine learning problems
  - Prove lower bounds: some problems are hard
  - Evaluations on real data, optimization of implementations
  - Variant models: power of two provers