Computing and Communicating Functions over Sensor Networks

Solmaz Torabi
Dept. of Electrical and Computer Engineering, Drexel University
solmaz.t@drexel.edu
Advisor: Dr. John M. Walsh

References
[1] A. Giridhar and P. R. Kumar, "Computing and communicating functions over sensor networks," IEEE Journal on Selected Areas in Communications, vol. 23, no. 4, pp. 755-764, 2005.

Outline
- Model of the problem
- Network topologies
  - Random planar network
  - Collocated network
- Classes of functions
  - Type-threshold functions
  - Type-sensitive functions
- Results

Summary: order of difficulty of computations
- $\Theta(1/n)$: data downloading; Average, Mode, Type vector in a collocated network
- $\Theta(1/\log n)$: Average, Mode, Type vector in a random multi-hop network; Max, Min in a collocated network
- $\Theta(1/\log\log n)$: Max, Min in a random multi-hop network

Problem model
- There are $n$ sensor nodes; $\rho_{ij}$ is the distance between nodes $i$ and $j$.
- The fusion node needs to compute $f_n(x_1, \ldots, x_n)$ exactly.
- At time $t$, sensor $i$ takes a measurement $x_i(t) \in \mathcal{X} = \{1, \ldots, |\mathcal{X}|\}$.
- No probability distribution is assumed on $x_i(t)$: the formulation is non-information-theoretic.
- The authors of [1] adopt a packet-based collision model of wireless communication.

Problem model
- All nodes share a common transmission range $r$.
- A receiver must lie outside the interference footprint of every other transmitter.
- Node $i$ can successfully transmit a packet to node $j$ if $\rho_{ij} \le r$ and, for every other simultaneously transmitting node $k$, $\rho_{kj} \ge (1 + \Delta) r$, where $\Delta > 0$ is a fixed constant.
- Successful communication between two nodes takes place at a rate of $W$ bits/second.
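The success condition of the collision model can be checked mechanically. The sketch below is only an illustration under assumptions: the function names, the 2-D coordinate representation, and the parameter name `delta` for the guard-zone factor are not from the slides.

```python
import math

def distance(p, q):
    """Euclidean distance between two points in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def transmission_succeeds(positions, sender, receiver, active, r, delta):
    """Check the packet collision model for one sender/receiver pair.

    positions: dict node -> (x, y) coordinates (hypothetical representation)
    active:    set of nodes transmitting in the same slot (includes sender)
    r:         common transmission range
    delta:     guard-zone factor (assumed name for the constant in the model)
    """
    # The receiver must be within range of the sender.
    if distance(positions[sender], positions[receiver]) > r:
        return False
    # Every other simultaneous transmitter must be at least (1 + delta) * r away.
    for k in active:
        if k == sender:
            continue
        if distance(positions[k], positions[receiver]) < (1 + delta) * r:
            return False
    return True

positions = {0: (0.0, 0.0), 1: (0.1, 0.0), 2: (0.15, 0.0), 3: (0.9, 0.9)}
print(transmission_succeeds(positions, 0, 1, {0, 3}, r=0.12, delta=0.5))
print(transmission_succeeds(positions, 0, 1, {0, 2}, r=0.12, delta=0.5))
```

In the first call the only other transmitter is far from the receiver; in the second, node 2 sits inside the receiver's guard zone.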

Problem model
- Block coding is allowed. Each node collects $M$ measurements:
  - node 1: $x_1 = (x_1(1), x_1(2), \ldots, x_1(M))$
  - node 2: $x_2 = (x_2(1), x_2(2), \ldots, x_2(M))$
  - ...
  - node $n$: $x_n = (x_n(1), x_n(2), \ldots, x_n(M))$
- There are $M$ function values to compute: $f(x_1(1), \ldots, x_n(1)), \ldots, f(x_1(M), \ldots, x_n(M))$.
- The matrix of measurements $x \in \mathcal{X}^{n \times M}$ is an $n \times M$ matrix.
- If all $M$ function values are computed in time $T$, the computational rate is $R = M/T$.

List of notations
- $f_n : \mathcal{X}^n \to \mathcal{Y}_n$ is the function of interest.
- $R(f, n)$ is the range of $f_n$.
- $S_{M,n}$ is a scheme or strategy.
- $T(S_{M,n})$ is the time taken by scheme $S_{M,n}$, worst case over all $x \in \mathcal{X}^{n \times M}$.
- $R(S_{M,n}) = M / T(S_{M,n})$ is the rate of the scheme.
- $R_{\max}^{(n)}$ is the supremum of the rates $R(S_{M,n})$ over all schemes $S_{M,n}$ and block lengths $M$:
  $$R_{\max}^{(n)} = \sup_{S, M} \frac{M}{T(S_{M,n})}$$
- $G^{(n)}$ is the spatial graph on the set of $n$ nodes, with edges between nodes that are within a distance $r$ of each other; $d(G^{(n)})$ denotes its degree.


Network topologies
- Collocated networks: networks with $\rho_{ij} \le r$ for all $i, j$, so every transmission is heard by all nodes.
- Random planar networks: the $n$ nodes, along with the sink node $s$, are distributed uniformly and independently on a unit square.
- Note: the common range $r$ of all $n$ nodes is chosen so that the graph is connected, which lets every node reach the sink by multihop communication.

Review
Lemma: For random planar networks, if the range is $r(n) = \sqrt{\frac{2 \log n}{n}}$, then $G^{(n)}$ is connected w.h.p. and $d(G^{(n)}) \le c \log n$ w.h.p.
The result follows from an earlier paper on critical power for asymptotic connectivity.
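The lemma can be explored numerically. The following is a rough simulation sketch, not part of the original analysis: it assumes the range is the square root of 2 log n / n with a natural logarithm, drops nodes uniformly on the unit square, and reports connectivity and maximum degree. All function names are illustrative.

```python
import math
import random
from collections import deque

def lemma_range(n):
    """Range assumed from the lemma: sqrt(2 log n / n), natural log (assumption)."""
    return math.sqrt(2 * math.log(n) / n)

def build_graph(points, r):
    """Adjacency lists of the spatial graph: edge when distance <= r."""
    n = len(points)
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= r:
                adj[i].append(j)
                adj[j].append(i)
    return adj

def is_connected(adj):
    """Breadth-first search from node 0; connected if every node is reached."""
    if not adj:
        return True
    seen = {0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(adj)

def max_degree(adj):
    """Largest number of neighbours of any node."""
    return max((len(neighbours) for neighbours in adj), default=0)

if __name__ == "__main__":
    rng = random.Random(0)
    n = 400
    points = [(rng.random(), rng.random()) for _ in range(n)]
    adj = build_graph(points, lemma_range(n))
    print("connected:", is_connected(adj))
    print("max degree:", max_degree(adj), "log n:", math.log(n))
```

A single run proves nothing about "with high probability"; repeating it over many seeds and growing n gives a feel for how the degree scales against log n.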

A trivial upper bound
- The sink node can receive at most $W$ bits/s.
- Representing $f(\cdot)$ requires $\log |R(f, n)|$ bits.
- Therefore
  $$R_{\max}^{(n)} \le \frac{W}{\log |R(f, n)|} \qquad (1)$$


Types of functions: divisible functions
Definition
- Given a subset $S = \{i_1, \ldots, i_k\} \subseteq [n]$, denote $x_S = [x_{i_1}, \ldots, x_{i_k}]$.
- A function $f : \mathcal{X}^n \to \mathcal{Y}_n$ is divisible if
  - $|R(f, n)|$ is nondecreasing in $n$, and
  - given any partition $\Pi(S) = \{S_1, \ldots, S_j\}$ of $S \subseteq [n]$, there exists a function $g^{\Pi(S)}$ such that
    $$f(x_S) = g^{\Pi(S)}\big( f(x_{S_1}), f(x_{S_2}), \ldots, f(x_{S_j}) \big) \qquad (2)$$
Example
- $\max(1, 2, 3, 4, 5) = \max(\max(1, 2), \max(3, 4), \max(5))$
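The max example can be replayed in a few lines. This is a minimal sketch, assuming the combiner for max is again max; `compute_by_partition` is a hypothetical helper name introduced only for illustration.

```python
def compute_by_partition(f, g, x, partition):
    """Apply f to each block of the partition, then combine the results with g.

    partition: list of lists of indices into x (assumed to cover each index once).
    """
    partial_values = [f([x[i] for i in block]) for block in partition]
    return g(partial_values)

x = [1, 2, 3, 4, 5]
partition = [[0, 1], [2, 3], [4]]

# For max, the combiner g is max itself.
direct = max(x)
via_partition = compute_by_partition(max, max, x, partition)
print(direct, via_partition)
```

The same helper works for any function whose partial results can be merged, for example sum with sum as the combiner.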

Computing divisible functions over random planar networks
Theorem: For a divisible function $f$ with $d(G^{(n)}) = O(\log |R(f, n)|)$,
$$R_{\max}^{(n)} = \Theta\!\left(\frac{1}{\log |R(f, n)|}\right)$$
Proof
- Tessellate the plane into square cells of side $r/\sqrt{2}$.
- Cell graph: the vertices are the non-empty cells. Two cells are adjacent if a node in one is adjacent in $G^{(n)}$ to a node in the other.
[Figure: tessellated network showing relay nodes and the sink node]

Proof
- Neighboring occupied cells can communicate with each other.
- Consider a rooted spanning tree of the cell graph.
- Designate the cell containing the sink node $s$ as the root.
- Choose a relay node $u$ in each cell and a relay parent $v$ in the next cell towards the root.
- Each cell has exactly one relay node (picked out of possibly several candidates).

Proof
- For each node $u$, define the descendant set $D_u$ as follows:
  - If $u$ is the relay node of a cell $c$, $D_u$ is the set of all nodes that belong either to $c$ or to descendants of $c$.
  - If $u$ is the relay parent of $\{u_1, \ldots, u_l\}$, $D_u = \{u\} \cup D_{u_1} \cup \cdots \cup D_{u_l}$.
  - Otherwise, $D_u = \{u\}$.
- Compute locally and pass the result along the tree to the root. Each relay node:
  - collects data from at most $d(G^{(n)})$ nodes within its cell;
  - collects a functional value of $\log |R(f, n)|$ bits from each child cell;
  - passes on a functional value of $\log |R(f, n)|$ bits to its parent cell.
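The tree-based computation can be sketched as a recursion over the spanning tree of cells. This is an illustrative model only: the dictionary layout, the cell names, and the function `aggregate` are assumptions, and the scheduling of transmissions is ignored entirely.

```python
def aggregate(tree, cell_values, f, g, cell):
    """Return f evaluated over all measurements in `cell` and its descendant cells.

    tree:        dict cell -> list of child cells (hypothetical representation)
    cell_values: dict cell -> list of measurements of the nodes in that cell
    f:           the divisible function applied to raw measurements
    g:           the combiner applied to partial function values
    """
    # The relay first computes the function over the data inside its own cell.
    partials = [f(cell_values[cell])]
    # It then receives one functional value from each child cell.
    for child in tree.get(cell, []):
        partials.append(aggregate(tree, cell_values, f, g, child))
    # Only the combined value is passed on to the parent cell.
    return g(partials)

tree = {"root": ["a", "b"], "a": ["c"]}
cell_values = {"root": [3, 1], "a": [4], "b": [2, 7], "c": [5]}
print(aggregate(tree, cell_values, max, max, "root"))
```

Each cell forwards a single function value regardless of how many nodes lie below it, which is the reason the per-cell load depends on the size of the range of f rather than on n.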

Computing a divisible function: special cases
- Data downloading: $\log |R(f, n)| = \log |\mathcal{X}|^n = O(n)$, so
  $$R_{\max}^{(n)} = \Theta\!\left(\frac{1}{n}\right) \qquad (3)$$
  if $d(G^{(n)}) = O(n)$, which holds for any connected graph.
- Frequency histogram, or type vector:
  $$\tau(x) = [\tau_1(x), \tau_2(x), \ldots, \tau_{|\mathcal{X}|}(x)] \qquad (4)$$
  where
  $$\tau_i(x) = |\{j : x_j = i\}| \qquad (5)$$
  is the number of occurrences of $i$ in $x$.

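Histogram
- The number of type vectors of a vector of length $n$ is the number of nonnegative integer solutions to the equation
  $$y_1 + y_2 + \cdots + y_{|\mathcal{X}|} = n \qquad (6)$$
  so
  $$|R(f, n)| = \binom{n + |\mathcal{X}| - 1}{|\mathcal{X}| - 1} \qquad (7)$$
  and
  $$\left(\frac{n}{|\mathcal{X}|}\right)^{|\mathcal{X}| - 1} \le \binom{n + |\mathcal{X}| - 1}{|\mathcal{X}| - 1} \le (n + 1)^{|\mathcal{X}|} \qquad (8)$$
- Hence $\log |R(f, n)| = \Theta(\log n)$ and
  $$R_{\max}^{(n)} = \Theta\!\left(\frac{1}{\log n}\right) \quad \text{if } d(G^{(n)}) = O(\log n) \qquad (9)$$
- If $r(n)$ is chosen properly, $d(G^{(n)}) = O(\log n)$ w.h.p.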
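The count of type vectors can be confirmed by brute force for small cases. The sketch below assumes the alphabet is {1, ..., k}; the function names are illustrative, and the enumeration is exponential in n, so it is only meant for tiny inputs.

```python
from collections import Counter
from itertools import product
from math import comb

def type_vector(x, alphabet_size):
    """Frequency histogram of x over the alphabet {1, ..., alphabet_size}."""
    counts = Counter(x)
    return tuple(counts.get(i, 0) for i in range(1, alphabet_size + 1))

def count_type_vectors(n, alphabet_size):
    """Number of distinct type vectors over all length-n vectors (brute force)."""
    alphabet = range(1, alphabet_size + 1)
    return len({type_vector(x, alphabet_size) for x in product(alphabet, repeat=n)})

n, k = 4, 3
print(type_vector([1, 3, 3, 2, 3], k))
print(count_type_vectors(n, k), comb(n + k - 1, k - 1), (n + 1) ** k)
```

The brute-force count agrees with the binomial coefficient and stays below (n + 1)^k, which is why the histogram needs only on the order of log n bits for a fixed alphabet.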

Symmetric functions
Definition: functions that are invariant with respect to permutations of their arguments:
$$f(x_1, \ldots, x_n) = f(\sigma(x_1, \ldots, x_n)) \quad \text{for all permutations } \sigma \qquad (10)$$
Note: symmetric functions depend only on the type vector $\tau = (\tau_1, \tau_2, \ldots, \tau_{|\mathcal{X}|})$, so for some function $f'$,
$$f_n(x_1, \ldots, x_n) = f'(\tau_1, \tau_2, \ldots, \tau_{|\mathcal{X}|}) \qquad (11)$$
- The data generated by a sensor is of primary importance, rather than the sensor's identity.

Computing a symmetric function
- An obvious strategy for computing any symmetric function is simply to communicate the entire type vector (frequency histogram), which gives
  $$R_{\max}^{(n)} = \Omega\!\left(\frac{1}{\log n}\right) \qquad (12)$$
- Is it possible to do better?
- Two disjoint subclasses:
  - Type-sensitive functions
  - Type-threshold functions

Type-sensitive functions
- A symmetric function $f$ is type-sensitive if there exist $0 < \gamma < 1$ and $k \in \mathbb{Z}^+$ such that, for all $n \ge k$ and any $j \le n - \lceil \gamma n \rceil$, given any subset $\{x_1, \ldots, x_j\}$, there are two subsets of values $\{y_{j+1}, \ldots, y_n\}$ and $\{z_{j+1}, \ldots, z_n\}$ such that
  $$f(x_1, \ldots, x_j, z_{j+1}, \ldots, z_n) \ne f(x_1, \ldots, x_j, y_{j+1}, \ldots, y_n) \qquad (13)$$
- In words: there is a $\gamma$ such that knowing only $n - \lceil \gamma n \rceil$ of the values is never enough to pin down the value of $f_n$.
- The value of a type-sensitive function cannot be determined if a large enough fraction of the arguments is unknown.
- Examples:
  - Mode: if more than half of the $x_i$ are unknown, the mode is undetermined.
  - Mean, Median, Majority
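For small n the "never pinned down" property can be tested exhaustively. The sketch below is an illustration under assumptions: binary measurements, a strict-majority function defined here for the purpose, and a hypothetical helper `is_determined`.

```python
from itertools import product

def is_determined(f, known, n, alphabet):
    """True if every completion of `known` to length n gives the same value of f."""
    values = {f(list(known) + list(rest))
              for rest in product(alphabet, repeat=n - len(known))}
    return len(values) == 1

def majority(x):
    """1 if strictly more than half of the entries are 1, else 0."""
    return int(2 * sum(x) > len(x))

n, alphabet = 6, (0, 1)
# With only 2 of 6 values known, majority is never determined.
print([is_determined(majority, known, n, alphabet)
       for known in product(alphabet, repeat=2)])
# In contrast, a single known 1 already determines the max.
print(is_determined(max, (1,), n, alphabet))
```

The contrast with max hints at why max falls outside the type-sensitive class: one observed value can settle it.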

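Type-threshold functions
- A symmetric function $f$ is said to be type-threshold if there exists a vector $\theta$ of length $|\mathcal{X}|$, called the threshold vector, such that
  $$f(x) = f'(\tau(x)) = f'(\min(\tau(x), \theta)) \quad \text{for all } x \in \mathcal{X}^n \qquad (14)$$
  where $f'$ is $f$ written as a function of the type vector and the minimum is taken componentwise.
- It is enough to know whether each $\tau_i$ reaches its threshold $\theta_i$.
- Examples:
  - Max, Min: $\theta = [1, 1, \ldots, 1]$
  - $k$-th largest value: $\theta = [k, k, \ldots, k]$
  - Mean of the $k$ largest values
  - Indicator function $I\{x_i = k \text{ for some } i\}$, with $\theta = [0, 0, \ldots, 1, \ldots, 0, 0]$ (a single 1 in position $k$)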
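For Max with the all-ones threshold vector, the defining identity can be verified exhaustively on a small alphabet. This is a minimal sketch assuming the alphabet {1, 2, 3}; the helper names are illustrative.

```python
from collections import Counter
from itertools import product

def type_vector(x, alphabet_size):
    """Frequency histogram of x over the alphabet {1, ..., alphabet_size}."""
    counts = Counter(x)
    return tuple(counts.get(i, 0) for i in range(1, alphabet_size + 1))

def clip(tau, theta):
    """Componentwise minimum of the type vector and the threshold vector."""
    return tuple(min(t, th) for t, th in zip(tau, theta))

def max_from_type(tau):
    """Largest symbol whose count is positive."""
    return max(i for i, count in enumerate(tau, start=1) if count > 0)

theta = (1, 1, 1)
ok = all(max_from_type(clip(type_vector(x, 3), theta)) == max(x)
         for x in product((1, 2, 3), repeat=4))
print(ok)
```

Clipping every count at 1 keeps only the information "does this symbol occur at all", which is all Max needs.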

Collision-free strategies (CFS) in collocated networks
- Without loss of generality, suppose that time is slotted and one bit is transmitted in each slot ($W = 1$).
- A collision-free strategy is a strategy that is required to avoid collisions explicitly. It consists of maps
  $$\phi_m : \{0, 1\}^{m-1} \to \{1, \ldots, n\} \qquad (15)$$
  $$\psi_m : \mathcal{X}^M \times \{0, 1\}^{m-1} \to \{0, 1\} \qquad (16)$$
- Node $\phi_1$ transmits the bit $\psi_1(x_{\phi_1})$ at time 1.
- Node $\phi_2(\psi_1(x_{\phi_1}))$ transmits the bit $\psi_2(x_{\phi_2}, \psi_1(x_{\phi_1}))$ at time 2, and so on.

Collision-free strategies (CFS) in collocated networks
- The node designated to transmit at time $m$ is fixed by the value $\phi_m(\psi_{m-1}, \psi_{m-2}, \ldots, \psi_1)$, which all nodes can compute.
- The identity of the transmitting node is therefore automatically known to all.
- The medium access problem is resolved in a distributed but collision-free fashion.
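A collision-free strategy can be modelled as a pair of functions of the transmission history. The sketch below is a simplified illustration: block length 1, 0-based node indices, and a round-robin schedule chosen only as an example; `run_cfs`, `round_robin_phi`, and `own_bit_psi` are hypothetical names.

```python
def run_cfs(x, phi, psi, num_slots):
    """Simulate a collision-free strategy in a collocated network.

    x:   list of per-node data (here one bit per node)
    phi: maps the history of transmitted bits to the next transmitting node
    psi: maps (node data, history) to the bit that node transmits
    """
    history = []
    for _ in range(num_slots):
        past = tuple(history)
        node = phi(past)            # every node can compute this from what it heard
        history.append(psi(x[node], past))
    return history

def round_robin_phi(history):
    """Node m transmits in slot m (0-based), independent of the bits heard."""
    return len(history)

def own_bit_psi(data, history):
    """Each node simply transmits its own bit."""
    return data

x = [0, 1, 0, 0]
heard = run_cfs(x, round_robin_phi, own_bit_psi, num_slots=len(x))
print(heard, max(heard))
```

Because the schedule is a function of bits everyone has heard, no two nodes ever pick the same slot. This particular strategy spends n slots per function value, far from the best rate for Max.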

Type-sensitive functions in collocated networks
Theorem: The maximum rate for computing a type-sensitive function in a collocated network using any CFS is $\Theta(1/n)$, which is of the same order as communicating the entire data.

Type-sensitive functions in collocated networks
Proof
- Without loss of generality, suppose $|\mathcal{X}| = 2$.
- Initially, $x_{g_1}$ lies in the set $S^0_{g_1}$ with cardinality $|S^0_{g_1}| = 2^M$, where $g_1$ is the first node to transmit.
- After the first transmission, $x_{g_1}$ can be in one of two sets, depending on whether the node transmits 0 or 1.
- Let the transmission correspond to the larger set, and call it $S^1_{g_1}$, so that $|S^1_{g_1}| \ge \frac{1}{2} |S^0_{g_1}|$.
- After the $t$-th transmission of node $k$, let $x_k$ lie in $S^t_k$ with $|S^t_k| \ge \frac{1}{2} |S^{t-1}_k|$.

Type-sensitive functions in collocated networks
- At the end, the uncertainty set satisfies
  $$|S_1 \times S_2 \times \cdots \times S_n| \ge 2^{nM - T}$$
- Thus at least $nM - T$ of the $nM$ values in $(x_1, x_2, \ldots, x_n)$ are undetermined.
- However, to compute $f_n(x(1)), f_n(x(2)), \ldots, f_n(x(M))$, at least $cnM$ values are needed.
- So $nM - T \le (1 - c) nM$, and therefore $T \ge cnM$.
- Hence $R = M/T \le \frac{1}{cn}$.
- Thus $R_{\max}^{(n)} = O(1/n)$ for the collocated case.
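The halving step of the argument can be simulated for a single node. The sketch below is an assumption-laden illustration: binary measurements, a small block length so that all 2^M blocks can be enumerated, and two example transmission rules (`reveal_next_bit`, `parity`) invented for the demo.

```python
from itertools import product

def adversarial_run(M, psi, num_transmissions):
    """Track the set of blocks consistent with what one node has transmitted.

    After each transmission, the adversary keeps the larger of the two
    candidate sets (blocks that would send 0 versus blocks that would send 1).
    Returns the list of set sizes, starting from 2**M.
    """
    candidates = list(product((0, 1), repeat=M))
    history = []
    sizes = [len(candidates)]
    for _ in range(num_transmissions):
        past = tuple(history)
        zeros = [x for x in candidates if psi(x, past) == 0]
        ones = [x for x in candidates if psi(x, past) == 1]
        bit, candidates = (0, zeros) if len(zeros) >= len(ones) else (1, ones)
        history.append(bit)
        sizes.append(len(candidates))
    return sizes

def reveal_next_bit(x, history):
    """Transmit the measurements one at a time (needs at most M transmissions)."""
    return x[len(history)]

def parity(x, history):
    """Transmit the parity of the whole block every time."""
    return sum(x) % 2

print(adversarial_run(4, reveal_next_bit, 3))
print(adversarial_run(4, parity, 2))
```

Whatever rule the node uses, each transmitted bit can at best halve the consistent set, so after t transmissions at least 2^(M - t) blocks remain possible.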

Type-threshold functions in collocated networks
Theorem: The maximum rate for computing a nonconstant type-threshold function in a collocated network using any CFS is $\Theta(1/\log n)$.
Proof
- First prove the result for the case $|\mathcal{X}| = 2$ and the max function
  $$f(x_1, x_2, \ldots, x_n) = \max\{x_i : 1 \le i \le n\}, \qquad \theta = [1, 1, \ldots, 1]$$

Proof: achievability
- Goal: provide a sequence of CFSs $S_{M,n}$ that asymptotically achieves the rate $\Omega(1/\log n)$.
- Take block length $M = ln > n$.
- Define $S_i = \{1 \le j \le M : x_i(j) = 1,\ x_k(j) = 0 \text{ for all } k < i\}$, the positions where node $i$ is the first node to measure a 1.
- Define $M_i = |S_i|$ for each $1 \le i \le n$.
- Note: the $S_i$ are disjoint.
- Note: $\sum_i M_i \le M$.
- Note: communicating the sets $S_1, S_2, \ldots, S_n$ to the sink suffices to reconstruct the function, since the max at position $j$ is 1 exactly when $j$ lies in some $S_i$.

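Proof: a collision-free strategy $S_{M,n}$
- Node $i$ computes $M_i$ and communicates its value in $\log M$ slots.
- $S_i$ is now one of the $\binom{M - \sum_{j<i} M_j}{M_i}$ possible subsets.
- Node $i$ communicates the identity of the set $S_i$ in $\log \binom{M - \sum_{j<i} M_j}{M_i}$ slots.
- The total time is
  $$T(S_{M,n}) = n \log M + \sum_i \log \binom{M - \sum_{j<i} M_j}{M_i} \qquad (17)$$
- Bounding the right-hand side,
  $$T(S_{M,n}) < n \log\big(l(n + 1)\big) + l(n + 1) \log\big((n + 1)e\big) \qquad (18)$$
  $$R = \frac{M}{T} = \Omega\!\left(\frac{1}{\log n}\right) \qquad (19)$$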
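The scheme can be traced on a small measurement matrix. The sketch below is illustrative only: it uses 0-based positions, ignores the rounding of slot counts to integers, and the function names are assumptions rather than anything from the slides.

```python
from math import comb, log2

def first_one_sets(x):
    """For each node i, the positions where node i holds the first 1 in the column.

    x: list of n rows, each a list of M bits.
    """
    M = len(x[0])
    covered = set()          # positions where an earlier node already has a 1
    sets = []
    for row in x:
        s_i = {j for j in range(M) if row[j] == 1 and j not in covered}
        sets.append(s_i)
        covered |= s_i
    return sets

def scheme_time(x):
    """Slots used: log M for each count plus log of the number of possible subsets."""
    M = len(x[0])
    remaining = M
    total = 0.0
    for s_i in first_one_sets(x):
        total += log2(M) + log2(comb(remaining, len(s_i)))
        remaining -= len(s_i)
    return total

def reconstruct_max(sets, M):
    """The sink recovers the column-wise max from the communicated sets."""
    ones = set().union(*sets)
    return [1 if j in ones else 0 for j in range(M)]

x = [[1, 0, 0, 1],
     [1, 1, 0, 0],
     [0, 1, 0, 1]]
sets = first_one_sets(x)
print(sets, reconstruct_max(sets, 4), scheme_time(x))
```

Later nodes only describe positions not already claimed, so the subsets they choose from keep shrinking; that is what keeps the total close to M log n rather than nM.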

Proof: upper bound
- Goal: $R_{\max}^{(n)} = O(1/\log n)$.
- Take $M > 2n$.
- Consider the set of $n \times M$ measurement matrices
  $x_1(1), x_1(2), \ldots, x_1(M)$
  $x_2(1), x_2(2), \ldots, x_2(M)$
  ...
  $x_n(1), x_n(2), \ldots, x_n(M)$
  in which each row contains exactly $\frac{M}{2n}$ ones and each column contains at most one 1.
- Claim: each sequence of transmissions $P_1, P_2, \ldots, P_T$ is produced by at most one matrix $x$ in this set.

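Proof: upper bound
- Suppose not. Then there are two matrices $x$ and $y$ in the set that produce the same transmissions.
- They differ in some row: $x_k \ne y_k$.
- Then the matrix obtained from $x$ by replacing row $x_k$ with $y_k$ also produces the same transmissions, since node $k$ hears the same transmissions under $x_k$ or $y_k$ and so reacts in the same way.
- But this matrix has different Max values from $x$: because $x_k$ and $y_k$ contain the same number of ones, there is a position $j$ with $x_k(j) = 1$ and $y_k(j) = 0$, and no other row of $x$ has a 1 in column $j$.
- Thus the Max values are not determined by the transmissions, a contradiction.
- The number of such matrices is
  $$\prod_{i=1}^{n} \binom{M - (i - 1)\frac{M}{2n}}{\frac{M}{2n}} > n^{M/2}$$
  since each factor is at least $n^{M/(2n)}$.
- So $2^T > n^{M/2}$, and therefore $T > \frac{M}{2} \log n$.
- So $R = \frac{M}{T} < \frac{2}{\log n}$.
- Hence $R_{\max}^{(n)} = O(1/\log n)$, and together with achievability, $R_{\max}^{(n)} = \Theta(1/\log n)$.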
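The size of the matrix family used in the counting argument can be computed directly, and cross-checked by enumeration for tiny cases. The sketch assumes the family consists of binary n-by-M matrices with exactly M/(2n) ones in every row and at most one 1 in every column, with M a multiple of 2n; the function names are illustrative.

```python
from itertools import combinations, product
from math import comb, log2

def count_matrices(n, M):
    """Product of binomials: row i picks its ones among the still-free columns."""
    if M % (2 * n) != 0:
        raise ValueError("this sketch assumes M is a multiple of 2n")
    ones_per_row = M // (2 * n)
    total = 1
    for i in range(1, n + 1):
        total *= comb(M - (i - 1) * ones_per_row, ones_per_row)
    return total

def brute_force_count(n, M):
    """Enumerate row supports and keep the choices whose supports are disjoint."""
    ones_per_row = M // (2 * n)
    supports = [frozenset(c) for c in combinations(range(M), ones_per_row)]
    return sum(1 for rows in product(supports, repeat=n)
               if len(frozenset().union(*rows)) == n * ones_per_row)

n, M = 2, 8
count = count_matrices(n, M)
print(count, brute_force_count(n, M))
# Distinct matrices need distinct transmission sequences, so T >= log2(count).
print("minimum slots:", log2(count), "rate at most:", M / log2(count))
```

Since T bits can distinguish at most 2^T matrices, the logarithm of the count is a lower bound on the number of slots any collision-free strategy needs for this family.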

Thank you! Questions?