Find the Missing Number or Numbers: An Exposition

Similar documents
Some Review Problems for Exam 2: Solutions

Solving Quadratic & Higher Degree Equations

Solving Quadratic & Higher Degree Equations

Making Change of n cents

Math Lecture 18 Notes

What is proof? Lesson 1

Algebra. Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed.

compare to comparison and pointer based sorting, binary trees

MITOCW watch?v=vjzv6wjttnc

Partial Fractions. June 27, In this section, we will learn to integrate another class of functions: the rational functions.

Solving Quadratic & Higher Degree Equations

Clock Arithmetic. 1. If it is 9 o clock and you get out of school in 4 hours, when do you get out of school?

Chapter 2. Mathematical Reasoning. 2.1 Mathematical Models

At the start of the term, we saw the following formula for computing the sum of the first n integers:

Ch. 12 Higher Degree Equations Rational Root

What if the characteristic equation has complex roots?

Polynomials; Add/Subtract

Proportional Division Exposition by William Gasarch

CS1800: Sequences & Sums. Professor Kevin Gold

Solving with Absolute Value

Winter Camp 2009 Number Theory Tips and Tricks

MATH 2710: NOTES FOR ANALYSIS

MI 4 Mathematical Induction Name. Mathematical Induction

CS Communication Complexity: Applications and New Directions

Lecture 10 Oct. 3, 2017

Mathematical Induction

Lesson 21 Not So Dramatic Quadratics

Lecture 7: Fingerprinting. David Woodruff Carnegie Mellon University

Sums of Squares (FNS 195-S) Fall 2014

HAMPSHIRE COLLEGE: YOU CAN T GET THERE FROM HERE : WHY YOU CAN T TRISECT AN ANGLE, DOUBLE THE CUBE, OR SQUARE THE CIRCLE. Contents. 1.

5.2 Infinite Series Brian E. Veitch

MATH 3012 N Homework Chapter 9

Quiz 07a. Integers Modulo 12

Where do pseudo-random generators come from?

MITOCW MIT18_02SCF10Rec_61_300k

Theoretical Cryptography, Lecture 10

Generating Functions

Polynomial Operations

15-451/651: Design & Analysis of Algorithms September 13, 2018 Lecture #6: Streaming Algorithms last changed: August 30, 2018

2 A 3-Person Discrete Envy-Free Protocol

CSC 5170: Theory of Computational Complexity Lecture 4 The Chinese University of Hong Kong 1 February 2010

P-adic numbers. Rich Schwartz. October 24, 2014

Welcome to Math Video Lessons. Stanley Ocken. Department of Mathematics The City College of New York Fall 2013

Inverses and Elementary Matrices

COLLEGE ALGEBRA. Paul Dawkins

Talk Science Professional Development

Math101, Sections 2 and 3, Spring 2008 Review Sheet for Exam #2:

Recitation 4: Quantifiers and basic proofs

CHAPTER 1. Review of Algebra

Computing the Entropy of a Stream

Lecture 6: Finite Fields

base 2 4 The EXPONENT tells you how many times to write the base as a factor. Evaluate the following expressions in standard notation.

Galois Theory Overview/Example Part 1: Extension Fields. Overview:

Complex Numbers in Trigonometry

Great Theoretical Ideas in Computer Science. Lecture 7: Introduction to Computational Complexity

Exploring Graphs of Polynomial Functions

Big-oh stuff. You should know this definition by heart and be able to give it,

1 Randomized Computation

No Solution Equations Let s look at the following equation: 2 +3=2 +7

PYTHAGOREAN TRIPLES KEITH CONRAD

Math Lecture 3 Notes

WORKSHEET ON NUMBERS, MATH 215 FALL. We start our study of numbers with the integers: N = {1, 2, 3,...}

6: Polynomials and Polynomial Functions

Squaring and Unsquaring

Errata List for Rational Points on Elliptic Curves by Joseph H. Silverman and John Tate Version 1.3a July 5, 1994; revised by JEC, 1998

Writing proofs for MATH 61CM, 61DM Week 1: basic logic, proof by contradiction, proof by induction

Languages, regular languages, finite automata

Descriptive Statistics (And a little bit on rounding and significant digits)

Math (P)Review Part I:

DEVELOPING MATH INTUITION

Math-3. Lesson 3-1 Finding Zeroes of NOT nice 3rd Degree Polynomials

Equations and Colorings: Rado s Theorem Exposition by William Gasarch

Calculus II. Calculus II tends to be a very difficult course for many students. There are many reasons for this.

Lecture 2 September 4, 2014

MATH 243E Test #3 Solutions

Most General computer?

Intro to Maths for CS: Powers and Surds

THE DIVISION THEOREM IN Z AND R[T ]

MA1131 Lecture 15 (2 & 3/12/2010) 77. dx dx v + udv dx. (uv) = v du dx dx + dx dx dx

Single-Predicate Derivations

Confidence intervals

MATH CSE20 Homework 5 Due Monday November 4

Randomness. What next?

Example. How to Guess What to Prove

Problem 1. k zero bits. n bits. Block Cipher. Block Cipher. Block Cipher. Block Cipher. removed

Math 3361-Modern Algebra Lecture 08 9/26/ Cardinality

Natural deduction for truth-functional logic

Disproof MAT231. Fall Transition to Higher Mathematics. MAT231 (Transition to Higher Math) Disproof Fall / 16

A Quick Algebra Review

Lecture 20: conp and Friends, Oracles in Complexity Theory

1 Maintaining a Dictionary

Eigenvalues & Eigenvectors

QUADRATICS 3.2 Breaking Symmetry: Factoring

COLLEGE ALGEBRA. Solving Equations and Inequalities. Paul Dawkins

Note-Taking Guides. How to use these documents for success

Quadratic Equations Part I

PEANO AXIOMS FOR THE NATURAL NUMBERS AND PROOFS BY INDUCTION. The Peano axioms

Section 3.7: Solving Radical Equations

Notice that we are switching from the subtraction to adding the negative of the following term

Integration of Rational Functions by Partial Fractions

Transcription:

Find the Missing Number or Numbers: An Exposition William Gasarch Univ. of MD at College Park Introduction The following is a classic problem in streaming algorithms and often the first one taught. Assume n is large and k is constant. Alice is going to say all but k of the numbers in the set {,,..., n} in some order. Bob will listen and try to discern what the k missing numbers are. If Bob s brain could easily store and access n bits then he would be able to store a bit vector and mark each number as it came in, then scan the bit vector for the k missing numbers. But what if Bob s brain can only store m n bits? This can be presented as a fun math puzzle, and for k = and even k = the answer is fun. Is it fun for k = 3? k 4? I leave that as an exercise for the reader. We present solutions for k =, k =, k = 3 and k 4. Find the Missing Number Alice is going to say all but one of the numbers in the set {,,..., n} in some order. Bob will listen and try to discern what the missing number is. Alice says the numbers x, x,..., x n. They are all distinct elements from {,..., n} but one is missing. Let y be the missing number. University of Maryland, College Park, MD 074, gasarch@cs.umd.edu

Bob can do this problem storing just O(log n) bits. As Bob hears the numbers he maintains the SUM. This takes just O(log n) bits. At the end he has i n x i.. Solution Using the Sum Note that x i = ( i) y = i n i n n(n + ) y. Bob finds the missing number is y = n(n+) i n x i. Note. n(n+) has size lg n, hence this algorithm takes space lg n. Can we do better? Yes! Realize that the final answer is between and n. Hence if we did all calculations mod n we would get the same answer (equating 0 with n). If n is odd then n(n+) 0 (mod n). Hence y is n(n + ) i n ( n(n + ) x i = i n x i ) (mod n) = i n x i (mod n). Hence Bob can compute i n x i) (mod n) which takes lg n bits. He can then subtract it from n (can this by done in lg n bits?) and get the answer, only using lg n bits. If n is even then use mod n +.. Solution Using XOR An alternative solution: View the numbers x,..., x n as lg n -bit strings. After seeing the x,..., x L Bob maintains x x x L. One can show that the final string Bob has, x x n, IS the missing number.

3 Find the Missing Two Numbers Alice is going to say all but two of the numbers in the set {,,..., n} in some order. Bob will listen and try to discern which two numbers are missing. We denote the missing numbers y, y. We give three solutions that use O(log n) bits. 3. Solution that Uses the Quadratic Formula As Bob hears the numbers he maintains the SUM and SUM OF SQUARES. At the end Bob has Since Bob can calculate and store i n x i and i n x i he can deduce i n x i i n x i s t = y + y = i n x i i n i = y + y = i n x i i n i We derive y, y from s, t as follows. y = s y t = y + (s y ) = y sy + s y sy + s t = 0 Now use the quadratic formula to find y and then y = s y to find y. Note 3. The above solution takes 3 lg n + lg n = 5 lg n space since we need to store a sum of size O(n 3 ) and a sum of size O(n ). We can do better! We can t use mod n since the quadratic might have more than roots mod n. Let p be a prime such that n p n. Do all of the above mod p works. This will take lg(n) + lg(n) lg(n) + O(). 3

3. Solution that Uses Sums of Powers This solution is identical to the one in Section 3. up until we find s, t. We then find y y as follows: s t = y y. Let s = y + y and p = y y. Form the polynomial X sx + p = X (y + y )X + y y = (X y )(X y ). Find the roots of this polynomial. We call this THE POLY-ROOTS TRICK throughout. Note that all we need is the symmetric functions y, y. More generally we will need the symmetric functions of y, y,..., y k. Note 3. Similar to the solution in Section 3., we can do all of this mod p and hence space lg(n) + O(). 3.3 Solution that uses Symmetric Functions Throughout As Bob hears the numbers he maintains the SUM and the SUM OF PRODUCTS OF PAIRS. After hearing the first L numbers he has in his head i L x i AND L i<j L x ix j. We need to show that he can actually do this. Let s L 0 (x,..., x L ) s L (x,..., x L ) s L (x,..., x L ) = ( We don t really need s 0 but it will make the notation nice.) = i L x i = i<j L x ix j For notational convenience we use s L i to mean s L i (x,..., x L ) 4

Assume Bob has s L 0, s L and s L. And then Bob sees x L. Bob wants s L 0, s L, s L. We explain all of this by expanding everything out s L 0 =. Thats easy. We rewrite this as s L = (x + + x L ) + x L = s L + x L. s L = (x + + x L ) + x L = s L + x L s L 0 since this way it will give all of he equations (and more so for the k = 3 and k 4 cases) look the same. s L = x i x j i<j L We break this sum up into two parts- those parts that use x L and those parts that do not. If a product of two x i terms uses x L then it is of the form x i x L. hence s L = i<j L x i x j + x L (x + + x L ). AH- note that i<j L x ix j + x L (x + + x L ) is s L and x + + x L is s L. So we write this as We now write all of the equations together: s L = S L + x L s L. 5

s L 0 = s L s L = s L + x L s L 0 = s L + x L s L Hence we can keep the counters s and s and update them easily, using only O(log n) space. At the end we have s n and s n. KEY: Bob can compute, before seeing any of the data: s n s n = i n x i = i n i = i<j n x ix j = i<j n ij We want to derive s (y, y ) = y + y and s (y, y ) = y y and then finish up the proof as we did in Section 3.. For notational convenience we denote s (y, y ) by s and s (y, y ) by s. Note that s n s n = s n + s = s n + s n s + s Note that s n, s n, s n, s n are known. Hence s = (y +y ) and s = y y can be determined. Now do the poly-root trick. 4 Find the Missing Three Numbers Alice is going to say all but three of the numbers in the set {,,..., n} in some order. Bob will listen and try to discern which three numbers are missing. We denote the missing numbers y, y, y 3. We give two solutions. 6

4. Solution Using Sums of Powers Bob keeps track of the sum of terms, squares of terms, and cubes of terms. By subtracting them from the known quantities n i= i, and n i= i, and n i= i3 Bob obtains: y + y + y 3 y + y + y 3 y 3 + y 3 + y 3 3 Can he use these to determine y, y, y 3? We WANT to obtain: y + y + y 3 (thats easy!) y y + y y 3 + y y 3 y y y 3 ONCE we have them we form the polynomial X 3 (y + y + y 3 )X + (y y + y y 3 + y y 3 )X y y y 3 = (X y )(X y )(X y 3 ). Find its roots. Then you have y, y.y 3. OKAY, now to find those functions of y, y, y 3. We first try an intuitive thing: (y + y + y 3 ) (y + y + y 3) This is intuitive to try since we can already see that all of the square terms will drop out and might leave us with something simple. 7

(y + y + y 3 ) (y + y + y 3) = y y + y y 3 + y y 3 = (y y + y y 3 + y y 3 ). GREAT!- that last term is twice what we want. In other words: NOW we want y y y 3. y y + y y 3 + y y 3 = (y + y + y 3 ) (y + y + y 3) The next equation is harder to motivate so I won t even try (though its a special case of Newton s identity which I will discuss when doing the general k case): y y y 3 = (y y + y y 3 + y y 3 )(y + y + y 3 ) (y + y + y 3 )(y + y + y3) + (y 3 + y 3 + y3) 3. 3 Great! Now that we have y + y + y 3, y y + y y 3 + y y 3, y y y 3 4. Solution that uses Symmetric Functions Throughout This solution is due to Y. Minsky, A. Trachtenberg, and R. Zippel []. The KEY to the solution in the last section was that we used the sums-of-powers to obtain the symmetric functions y + y + y 3, y y + y y 3 + y y 3, y y y 3. In this solution we get the symmetric functions more directly. As Bob hears the first L numbers x,..., x L he maintains: i L x i. L i<j L x ix j. L i<j<k L x ix j x k. 8

We need to show that he can actually do this. Let s L 0 (x,..., x L ) s L (x,..., x L ) s L (x,..., x L ) s L 3 (x,..., x L ) = (We have this just so the equations look nice.) = i L x i = i<j L x ix j = i<j<k L x ix j x k For notational convenience we use s L i to mean s L i (x,..., x L ) We need to show that Bob can easily get s L 0, s L, s L, s L 3 from s L 0, s L, s L, s L 3 and x L. s L 0 = so thats easy. s L = (x + + x L ) + x L = s L + x L = s L + x L s L 0. SO we now have s L in terms of stuff Bob knows. Consider s L = i<j L x ix j We separate out the pairs the involve x L from the ones that don t. The ones that don t involve x L are just s L = i<j L x ix j. The ones that DO involve x L involve just x L and some x i with i < L. Thats just Hence x x L + x x L + + x L x L = x L (x + + x L ) = x L s L SO we now have s L in terms of stuff Bob knows. s L = s L + x L s L s L 3 is similar and we leave it to the reader. To summarize we have: 9

s L 0 = s L s L s L 3 = s L + x L s L 0 = s L + x L s L = s L 3 + x L s L Hence we can keep the counters s, s, s 3 and update them easily, using only O(log n) space. At the end we have s n 3 and s n 3 and s n 3 3. KEY: Bob can compute, before seeing any of the data: s n 0 = s n s n s n 3 = i n x i = i n i = i<j n x ix j = i<j n ij = i<j<k n x ix j x k = i<j<k n ijk We want to derive s 3 (y, y, y 3 ) = y + y + y 3 and s 3 (y, y, y 3 ) = y y + y y 3 + y y 3 and s 3 3(y, y, y 3 ) = y y y 3. We can then finish up the proof using the poly-roots trick. For notational convenience we denote s 3 i (y, y ) by s 3 i. Note that For s it is easy to relate s n, s n 3, and s 3 since s n (x..., x n ) = x + + x n = (x + + x n 3 ) + (y + y + y 3 ) = s n 3 + s 3. Hence we can derive s 3 from s n and s n 3, both of which we know. Consider s n = i<j n x ix j. We break this into pieces. Some pairs use NO elements of y, y, y 3. That would be s n 3 = i<j n 3 x ix j. Some pairs use y and some element of {x,..., x n 3 }. That would be 0

y (x + + x n 3 ) = y s n 3 Some pairs use y and some element of {x,..., x n 3 }. That would be y (x + + x n 3 ) = y s n 3 Some pairs use y 3 and some element of {x,..., x n 3 }. That would be The sum of the last three cases is y 3 (x + + x n 3 ) = y 3 s n 3 (y + y + y 3 )(x + + x n 3 ) = (x + + x n 3 )(y + y + y 3 ) = s n 3 s 3 Some pairs use two elements from {y, y, y 3 }. That would be y y + y y 3 + y y 3 = s 3. If you put this all together you get: s n = s n 3 s 3 0 + s n 3 s 3 + s n 3 0 s 3. A similar equation for s n 3 can also be derived; however, we leave that for the reader. To summarize we have in total: s n s n s n 3 = s n 3 s 3 0 + s n 3 0 s 3 = s n 3 s 3 0 + s n 3 s 3 + s n 3 0 s 3 = s n 3 3 s 3 0 + s n 3 s 3 + s n 3 s 3 + s n 3 3 s 3 0

Note that s n, s n, s n 3, s n 3, s n 3, s n 3, s n 3 0, s 3 0 are known. Hence s 3, s 3, s 3 3 can all be derived. Now we can do the poly-roots trick. 5 General k We ll need the symmetric functions for both solutions. Notation 5. s L 0 (x,..., x L ) = s L (x,..., x L ) s L (x,..., x L ) s L 3 (x,..., x L ) = i L x i = i <i L x i x i = i <i <i 3 L x i x i x i3. =. s L k (x,..., x L ) = i < <i k L x i x ik We often leave out the arguments for notational convenience. We give two solutions. The arguments used here are similar to the ones used in the k = 3 case so we omit them. 5. Solution Using Sums of Powers Bob keeps track of the sums of powers, up to the kth power. At the end he has n k i= x i, n k i= x i,. n k i= xk i. From these Bob can easily derive

k i= y i, k i= y i,. k i= yk i. We find the symmetric functions FROM these. How? By using Newton s identities. We abbreviate s k m(y,..., y k ) by s k m. We abbreviate k i= yp i by p k. ms k m(y,..., y k ) = m ( ) i s n m i(y,..., y k ) k i= j= y i j ms k m = m ( ) i s n m i(y,..., y k ) k i= j= y i j 5. Solution Using Symm Functions Throughout This algorithm is essentially from a paper by Yaron Minksy, Ari Trachtenberg, Richard Zippel []. Using an argument similar to the one in Section 3.3 we can obtain: Lemma 5. s L 0 = s L s L s L 3 = s L + x L s L 0 = s L + x L s L = s L 3 + x L s L. =. s L k = s L k + x L s L k Hence if Bob has s L,..., s L, and then sees x L, you will be able to calculate s L,..., s L k. If all k of the x i are in {,..., n} then you can do all of this in space O(k log n). Therefore Bob can find s n k,..., s n k k. 3

s n k KEY: Bob can compute the following independent of the data. He will do this after he knows,..., s n k, and use them one-at-a-time below to save space. k s n s n s n 3 = i n i = i <i n ij = i <i <i 3 n x ix j x k = i <i <i 3 n ijk. =. s n k = i < <i k x i x ik = i < <i k n i i i k We seek, for all i k, s k i (y,..., y k ). We denote these s k i for notational convenience. The following are easily seen to be true: s n s n s n 3 s n i = s n k s k 0 + s k s0 n k = s n k s k 0 + s n k s k + s n k 0 s k = s n k 3 s k 0 + s n k s k + s n k s k + s n k 0 s k 3. =. = s n k i s k 0 + s n k i sk + s i n k s k + s0 n k s k i. =. s n k = s n k k s k 0 + s n k k sk + s k n k s k + s n k 0 s k k Note that Bob knows s n i, s n k i, s k 0, s n k 0. Hence Bob can use these equations to, one at a time (to save space) find s k,s k,..., s k k. Bob forms the polynomial X k s k X k + s k X k + + ( ) k s k k = (X y )(X y ) (X y k ). Find the roots. 4

References [] Y. Minsky, A. Trachtenberg, and R. Zippel. Set reconciliation with nearly optimal communication complexity. IEEE Transactions on Information Theory, 49(9):3 8, 003. 5