arXiv: v3 [cs.LG] 3 Dec 2017


Context-Aware Generative Adversarial Privacy

Chong Huang, Peter Kairouz, Xiao Chen, Lalitha Sankar, and Ram Rajagopal

Abstract

Preserving the utility of published datasets while simultaneously providing provable privacy guarantees is a well-known challenge. On the one hand, context-free privacy solutions, such as differential privacy, provide strong privacy guarantees, but often lead to a significant reduction in utility. On the other hand, context-aware privacy solutions, such as information theoretic privacy, achieve an improved privacy-utility tradeoff, but assume that the data holder has access to dataset statistics. We circumvent these limitations by introducing a novel context-aware privacy framework called generative adversarial privacy (GAP). GAP leverages recent advancements in generative adversarial networks (GANs) to allow the data holder to learn privatization schemes from the dataset itself. Under GAP, learning the privacy mechanism is formulated as a constrained minimax game between two players: a privatizer that sanitizes the dataset in a way that limits the risk of inference attacks on the individuals' private variables, and an adversary that tries to infer the private variables from the sanitized dataset. To evaluate GAP's performance, we investigate two simple (yet canonical) statistical dataset models: (a) the binary data model, and (b) the binary Gaussian mixture model. For both models, we derive game-theoretically optimal minimax privacy mechanisms, and show that the privacy mechanisms learned from data (in a generative adversarial fashion) match the theoretically optimal ones. This demonstrates that our framework can be easily applied in practice, even in the absence of dataset statistics.
Keywords: Generative Adversarial Privacy; Generative Adversarial Networks; Privatizer Network; Adversarial Network; Statistical Data Privacy; Differential Privacy; Information Theoretic Privacy; Mutual Information Privacy; Error Probability Games; Machine Learning

1 Introduction

The explosion of information collection across a variety of electronic platforms is enabling the use of inferential machine learning (ML) and artificial intelligence to guide consumers through a myriad of choices and decisions in their daily lives. In this era of artificial intelligence, data is quickly becoming the most valuable resource [25]. Indeed, large scale datasets provide tremendous utility in helping researchers design state-of-the-art machine learning algorithms that can learn from and make predictions on real life data. Scholars and researchers are increasingly demanding access to larger datasets that allow them to learn more sophisticated models. Unfortunately, more often than not, in addition to containing public information that can be published, large scale datasets also contain private information about participating individuals (see Figure 1). Thus, data collection and curation organizations are reluctant to release such datasets before carefully sanitizing them, especially in light of recent public policies on data sharing [28, 62].

To protect the privacy of individuals, datasets are typically anonymized before their release. This is done by stripping off personally identifiable information (e.g., first and last name, social security number, IDs, etc.) [50, 69, 77]. Anonymization, however, does not provide immunity against correlation and linkage attacks [36, 61]. Indeed, several successful attempts to re-identify individuals from anonymized datasets have been reported in the past ten years. For instance, [61] were able to successfully de-anonymize watch histories in the Netflix Prize, a public recommender system competition.
In a more recent attack, [78] showed that participants of an anonymized DNA study were identified by linking their DNA data with the publicly available Personal Genome Project dataset. Even more recently, [30] successfully designed re-identification attacks on anonymized fMRI imaging datasets. Other anonymization techniques, such as generalization [11, 32, 49] and suppression [41, 68, 86], also cannot prevent an adversary from performing the sensitive linkages or recovering private information from published datasets [31].

Author note: C. Huang and L. Sankar are with the School of Electrical, Computer, and Energy Engineering at Arizona State University, Tempe, AZ. P. Kairouz, X. Chen, and R. Rajagopal are with the Department of Electrical Engineering at Stanford University, Stanford, CA. Equal contributions.

[Figure 1: An example privacy preserving mechanism for smart meter data. Panel labels: original meter data X, private data Y (income, occupancy), perturbation, database D with entries (rows) 1 to n.]

Addressing the shortcomings of anonymization techniques requires data randomization. In recent years, two randomization-based approaches with provable statistical privacy guarantees have emerged: (a) context-free approaches that assume worst-case dataset statistics and adversaries; (b) context-aware approaches that explicitly model the dataset statistics and the adversary's capabilities.

Context-free privacy. One of the most popular context-free notions of privacy is differential privacy (DP) [21, 22, 23]. DP, quantified by a leakage parameter $\epsilon$ (see footnote 1), restricts distinguishability between any two neighboring datasets from the published data. DP provides strong, context-free theoretical guarantees against worst-case adversaries. However, training machine learning models on randomized data with DP guarantees often leads to a significantly reduced utility and comes with a tremendous hit in sample complexity [18, 19, 20, 29, 37, 42, 43, 47, 64, 82, 87, 93, 94] in the desired leakage regimes. For example, learning population level histograms under local DP suffers from a stupendous increase in sample complexity by a factor proportional to the size of the dictionary [20, 42, 43].

Context-aware privacy. Context-aware privacy notions have been so far studied by information theorists under the rubric of information theoretic (IT) privacy [4, 5, 6, 8, 10, 12, 13, 14, 15, 44, 45, 46, 51, 57, 65, 67, 70, 71, 72, 84, 92].
IT privacy has predominantly been quantified by mutual information (MI), which models how well an adversary, with access to the released data, can refine its belief about the private features of the data. Recently, Issa et al. introduced maximal leakage (MaxL) to quantify leakage to a strong adversary capable of guessing any function of the dataset [40]. They also showed that their adversarial model can be generalized to encompass local DP (wherein the mechanism ensures limited distinction for any pair of entries, a stronger DP notion without a neighborhood constraint [20, 88]) [39]. When one restricts the adversary to guessing specific private features (and not all functions of these features), the resulting adversary is a maximum a posteriori (MAP) adversary that has been studied by Asoodeh et al. in [6, 7, 8, 9]. Context-aware data perturbation techniques have also been studied in privacy preserving cloud computing [16, 17, 48].

Compared to context-free privacy notions, context-aware privacy notions achieve a better privacy-utility tradeoff by incorporating the statistics of the dataset and placing reasonable restrictions on the capabilities of the adversary. However, using information theoretic quantities (such as MI) as privacy metrics requires learning the parameters of the privatization mechanism in a data-driven fashion that involves minimizing an empirical information theoretic loss function. This task is remarkably challenging in practice [3, 33, 56, 81, 96].

Generative adversarial privacy. Given the challenges of existing privacy approaches, we take a fundamentally new approach towards enabling private data publishing with guarantees on both privacy and utility. Instead of adopting worst-case, context-free notions of data privacy (such as differential privacy), we introduce a novel context-aware model of privacy that allows the designer to cleverly add noise where it matters.
An inherent challenge in taking a context-aware privacy approach is that it requires having access to priors, such as joint distributions of public and private variables. Such information is hardly ever present in practice. To overcome this issue, we take a data-driven approach to context-aware privacy. We leverage recent advancements in generative adversarial networks (GANs) to introduce a unified framework for context-aware privacy called generative adversarial privacy (GAP). Under GAP, the parameters of a generative model, representing the privatization mechanism, are learned from the data itself.

Footnote 1: Smaller $\epsilon \in [0, \infty)$ implies smaller leakage and stronger privacy guarantees.

[Figure 2: Generative Adversarial Privacy. Diagram labels: the privatizer takes $(X, Y)$ and a noise sequence and outputs $\hat{X} = g(X, Y)$; the adversary outputs $\hat{Y} = h(g(X, Y))$.]

1.1 Our Contributions

We investigate a setting where a data holder would like to publish a dataset $\mathcal{D}$ in a privacy preserving fashion. Each row in $\mathcal{D}$ contains both private variables (represented by $Y$) and public variables (represented by $X$). The goal of the data holder is to generate $\hat{X}$ in a way such that: (a) $\hat{X}$ is as good a representation of $X$ as possible, and (b) an adversary cannot use $\hat{X}$ to reliably infer $Y$. To this end, we present GAP, a unified framework for context-aware privacy that includes existing information-theoretic privacy notions. Our formulation is inspired by GANs [34, 55, 73] and error probability games [58, 59, 60, 66, 74]. It includes two learning blocks: a privatizer, whose task is to output a sanitized version of the public variables (subject to some distortion constraints); and an adversary, whose task is to learn the private variables from the sanitized data. The privatizer and adversary achieve their goals by competing in a constrained minimax, zero-sum game. On the one hand, the privatizer (a conditional generative model) is designed to minimize the adversary's performance in inferring $Y$ reliably. On the other hand, the adversary (a classifier) seeks to find the best inference strategy that maximizes its performance. This generative adversarial framework is represented in Figure 2.

At the core of GAP is a loss function (see footnote 2) that captures how well an adversary does in terms of inferring the private variables. Different loss functions lead to different adversarial models. We focus our attention on two types of loss functions: (a) a 0-1 loss that leads to a maximum a posteriori probability (MAP) adversary, and (b) an empirical log-loss that leads to a minimum cross-entropy adversary. Ultimately, our goal is to show that our data-driven approach can provide privacy guarantees against a MAP adversary.
However, derivatives of a 0-1 loss function are ill-defined. To overcome this issue, the ML community uses the more analytically tractable log-loss function. We do the same by choosing the log-loss function as the adversary's loss function in the data-driven framework. We show that it leads to a performance that matches the performance of game-theoretically optimal mechanisms under a MAP adversary. We also show that GAP recovers mutual information privacy when a log-loss function is used (see Section 2.2).

To showcase the power of our context-aware, data-driven framework, we investigate two simple, albeit canonical, statistical dataset models: (a) the binary data model, and (b) the binary Gaussian mixture model. Under the binary data model, both $X$ and $Y$ are binary. Under the binary Gaussian mixture model, $Y$ is binary whereas $X$ is conditionally Gaussian. For both models, we derive and compare the performance of game-theoretically optimal privatization mechanisms with those that are directly learned from data (in a generative adversarial fashion).

For the above-mentioned statistical dataset models, we present two approaches towards designing privacy mechanisms: (i) private-data dependent (PDD) mechanisms, where the privatizer uses both the public and private variables, and (ii) private-data independent (PDI) mechanisms, where the privatizer only uses the public variables. We show that the PDD mechanisms lead to a superior privacy-utility tradeoff.

Footnote 2: We quantify the adversary's performance via a loss function and the quality of the released data via a distortion function.

1.2 Related Work

In practice, a context-free notion of privacy (such as DP) is desirable because it places no restrictions on the dataset statistics or the adversary's strength. This explains why DP has been remarkably successful in the past ten years, and has been deployed in an array of systems, including Google's Chrome browser [27] and Apple's iOS [90]. Nevertheless, because of its strong context-free nature, DP has suffered from a sequence of impossibility results. These results have made the deployment of DP with a reasonable leakage parameter practically impossible. Indeed, it was recently reported that Apple's DP implementation suffers from several limitations, the most notable of which is Apple's use of unacceptably large leakage parameters [79].

Context-aware privacy notions can exploit the structure and statistics of the dataset to design mechanisms matched to both the data and adversarial models. In this context, information-theoretic metrics for privacy are naturally well suited. In fact, the adversarial model determines the appropriate information metric: an estimating adversary that minimizes mean square error is captured by $\chi^2$-squared measures [13], a belief refining adversary is captured by MI [71], an adversary that can make a hard MAP decision for a specific set of private features is captured by the Arimoto MI of order $\infty$ [7, 9], and an adversary that can guess any function of the private features is captured by the maximal (over all distributions of the dataset for a fixed support) Sibson information of order $\infty$ [39, 40].

Information-theoretic metrics, and in particular MI privacy, allow the use of Fano's inequality and its variants [85] to bound the rate of learning the private variables for a variety of learning metrics, such as error probability and minimum mean-squared error (MMSE). Despite the strength of MI in providing statistical utility as well as capturing a fairly strong adversary that involves refining beliefs, in the absence of priors on the dataset, using MI as an empirical loss function leads to computationally intractable procedures when learning the optimal parameters of the privatization mechanism from data.
Indeed, training algorithms with empirical information theoretic loss functions is a challenging problem that has been explored in specific learning contexts, such as determining randomized encoders for the information bottleneck problem [3] and designing deep auto-encoders using a rate-distortion paradigm [33, 81, 96]. Even in these specific contexts, variational approaches were taken to minimize/maximize a surrogate function instead of minimizing/maximizing an empirical mutual information loss function directly [76]. In an effort to bridge theory and practice, we present a general data-driven framework to design privacy mechanisms that can capture a range of information-theoretic privacy metrics as loss functions. We will show how our framework leads to very practical (generative adversarial) data-driven formulations that match their corresponding theoretical formulations.

In the context of publishing datasets with privacy and utility guarantees, a number of similar approaches have been recently considered. We briefly review them and clarify how our work is different. In [91], the authors consider linear privatizer and adversary models by adding noise in directions that are orthogonal to the public features in the hope that the spaces of the public and private features are orthogonal (or nearly orthogonal). This allows the privatizer to achieve full privacy without sacrificing utility. However, this work is restrictive in the sense that it requires the public and private features to be nearly orthogonal. Furthermore, this work provides no rigorous quantification of privacy and only investigates a limited class of linear adversaries and privatizers.

DP-based obfuscators for data publishing have been considered in [35, 54]. The author in [35] considers a deterministic, compressive mapping of the input data with differentially private noise added either before or after the mapping.
The mapping rule is determined by a data-driven methodology to design minimax filters that allow non-malicious entities to learn some public features from the filtered data, while preventing malicious entities from learning other private features. The approach in [54] relies on using deep auto-encoders to determine the relevant feature space to add differentially private noise to, eliminating the need to add noise to the original data. After the noise is added, the original signal is reconstructed. These novel approaches leverage minimax filters and deep auto-encoders to incorporate a notion of context-aware privacy and achieve better privacy-utility tradeoffs while using DP to enforce privacy. However, DP will still incur an insurmountable utility cost since it assumes worst-case dataset statistics. Our approach captures a broader class of randomization-based mechanisms via a generative model which allows the privatizer to tailor the noise to the statistics of the dataset.

Our work is also closely related to adversarial neural cryptography [1], learning censored representations [26], and privacy preserving image sharing [64], in which adversarial learning is used to learn how to protect communications by encryption or hide/remove sensitive information. Similar to these problems, our model includes a minimax formulation and uses adversarial neural networks to learn privatization schemes. However, in [26, 64], the authors use non-generative auto-encoders to remove sensitive information, which do not have an obvious generative interpretation. Instead, we use a GANs-like approach to learn privatization schemes that prevent an adversary from inferring the private data. Moreover, these papers consider a Lagrangian formulation for the utility-privacy tradeoff that the obfuscator computes. We go beyond these works by studying a game-theoretic setting with constrained optimization, which provides a specific privacy guarantee for a fixed distortion. We also compare the performance of the privatization schemes learned in an adversarial fashion with the game-theoretically optimal ones.

We use conditional generative models to represent privatization schemes. Generative models have recently received a lot of attention in the machine learning community [34, 38, 55, 73, 75]. Ultimately, deep generative models hold the promise of discovering and efficiently internalizing the statistics of the target signal to be generated. State-of-the-art generative models are trained in an adversarial fashion [34, 55]: the generated signal is fed into a discriminator which attempts to distinguish whether the data is real (i.e., sampled from the true underlying distribution) or synthetic (i.e., generated from a low dimensional noise sequence). Training generative models in an adversarial fashion has proven to be successful in computer vision and enabled several exciting applications. Analogous to how the generator is trained in GANs, we train the privatizer in an adversarial fashion by making it compete with an attacker.

1.3 Outline

The remainder of our paper is organized as follows. We formally present our GAP model in Section 2. We also show how, as a special case, it can recover several information theoretic notions of privacy. We then study a simple (but canonical) binary dataset model in Section 3. In particular, we present theoretically optimal PDD and PDI privatization schemes, and show how these schemes can be learned from data using a generative adversarial network. In Section 4, we investigate binary Gaussian mixture dataset models, and provide a variety of privatization schemes. We comment on their theoretical performance and show how their parameters can be learned from data in a generative adversarial fashion.
Our proofs are deferred to sections A, B, and C of the Appendix. We conclude our paper in Section 5 with a few remarks and interesting extensions.

2 Generative Adversarial Privacy Model

We consider a dataset $\mathcal{D}$ which contains both public and private variables for $n$ individuals (see Figure 1). We represent the public variables by a random variable $X \in \mathcal{X}$, and the private variables (which are typically correlated with the public variables) by a random variable $Y \in \mathcal{Y}$. Each dataset entry contains a pair of public and private variables denoted by $(X, Y)$. Instances of $X$ and $Y$ are denoted by $x$ and $y$, respectively. We assume that each entry pair $(X, Y)$ is distributed according to $P(X, Y)$, and is independent from other entry pairs in the dataset. Since the dataset entries are independent of each other, we restrict our attention to memoryless mechanisms: privacy mechanisms that are applied on each data entry separately. Formally, we define the privacy mechanism as a randomized mapping given by $g(X, Y): \mathcal{X} \times \mathcal{Y} \to \mathcal{X}$.

We consider two different types of privatization schemes: (a) private data dependent (PDD) schemes, and (b) private data independent (PDI) schemes. A privatization mechanism is PDD if its output is dependent on both $Y$ and $X$. It is PDI if its output only depends on $X$. PDD mechanisms are naturally superior to PDI mechanisms. We show, in sections 3 and 4, that there is a sizeable gap in performance between these two approaches.

In our proposed GAP framework, the privatizer is pitted against an adversary. We model the interactions between the privatizer and the adversary as a non-cooperative game. For a fixed $g$, the goal of the adversary is to reliably infer $Y$ from $g(X, Y)$ using a strategy $h$. For a fixed adversarial strategy $h$, the goal of the privatizer is to design $g$ in a way that minimizes the adversary's capability of inferring the private variable from the perturbed data.
The optimal privacy mechanism is obtained as an equilibrium point at which both the privatizer and the adversary cannot improve their strategies by unilaterally deviating from the equilibrium point.

2.1 Formulation

Given the output $\hat{X} = g(X, Y)$ of a privacy mechanism $g(X, Y)$, we define $\hat{Y} = h(g(X, Y))$ to be the adversary's inference of the private variable $Y$ from $\hat{X}$. To quantify the effect of adversarial inference, for a given public-private pair $(x, y)$, we model the loss of the adversary as

$$\ell(h(g(X = x, Y = y)), Y = y): \mathcal{Y} \times \mathcal{Y} \to \mathbb{R}.$$

Therefore, the expected loss of the adversary with respect to (w.r.t.) $X$ and $Y$ is defined to be

$$L(h, g) \triangleq \mathbb{E}[\ell(h(g(X, Y)), Y)], \qquad (1)$$

where the expectation is taken over $P(X, Y)$ and the randomness in $g$ and $h$.

Intuitively, the privatizer would like to minimize the adversary's ability to learn $Y$ reliably from the published data. This can be trivially done by releasing an $\hat{X}$ independent of $X$. However, such an approach provides no utility for data analysts who want to learn non-private variables from $\hat{X}$. To overcome this issue, we capture the loss incurred by privatizing the original data via a distortion function $d(\hat{x}, x): \mathcal{X} \times \mathcal{X} \to \mathbb{R}$, which measures how far the original data $X = x$ is from the privatized data $\hat{X} = \hat{x}$. Thus, the average distortion under $g(X, Y)$ is $\mathbb{E}[d(g(X, Y), X)]$, where the expectation is taken over $P(X, Y)$ and the randomness in $g$.

On the one hand, the data holder would like to find a privacy mechanism $g$ that is both privacy preserving (in the sense that it is difficult for the adversary to learn $Y$ from $\hat{X}$) and utility preserving (in the sense that it does not distort the original data too much). On the other hand, for a fixed choice of privacy mechanism $g$, the adversary would like to find a (potentially randomized) function $h$ that minimizes its expected loss, which is equivalent to maximizing the negative of the expected loss. To achieve these two opposing goals, we model the problem as a constrained minimax game between the privatizer and the adversary:

$$\min_{g(\cdot)} \max_{h(\cdot)} \; -L(h, g) \qquad (2)$$
$$\text{s.t.} \quad \mathbb{E}[d(g(X, Y), X)] \le D,$$

where the constant $D \ge 0$ determines the allowable distortion for the privatizer and the expectation is taken over $P(X, Y)$ and the randomness in $g$ and $h$.

2.2 GAP under Various Loss Functions

The above formulation places no restrictions on the adversary. Indeed, different loss functions and decision rules lead to different adversarial models.
In what follows, we will discuss a variety of loss functions under hard and soft decision rules, and show how our GAP framework can recover several popular information theoretic privacy notions.

Hard Decision Rules. When the adversary adopts a hard decision rule, $h(g(X, Y))$ is an estimate of $Y$. Under this setting, we can choose $\ell(h(g(X, Y)), Y)$ in a variety of ways. For instance, if $Y$ is continuous, the adversary can attempt to minimize the difference between the estimated and true private variable values. This can be achieved by considering a squared loss function

$$\ell(h(g(X, Y)), Y) = (h(g(X, Y)) - Y)^2, \qquad (3)$$

which is known as the $\ell_2$ loss. In this case, one can verify that the adversary's optimal decision rule is $h^* = \mathbb{E}[Y \mid g(X, Y)]$, which is the conditional mean of $Y$ given $g(X, Y)$. Furthermore, under the adversary's optimal decision rule, the minimax problem in (2) simplifies to

$$\min_{g(\cdot)} -\mathrm{mmse}(Y \mid g(X, Y)) = -\max_{g(\cdot)} \mathrm{mmse}(Y \mid g(X, Y)),$$

subject to the distortion constraint. Here $\mathrm{mmse}(Y \mid g(X, Y))$ is the resulting minimum mean square error (MMSE) under $h^* = \mathbb{E}[Y \mid g(X, Y)]$. Thus, under the $\ell_2$ loss, GAP provides privacy guarantees against an MMSE adversary.

On the other hand, when $Y$ is discrete (e.g., age, gender, political affiliation, etc.), the adversary can attempt to maximize its classification accuracy. This is achieved by considering a 0-1 loss function [63] given by

$$\ell(h(g(X, Y)), Y) = \begin{cases} 0 & \text{if } h(g(X, Y)) = Y \\ 1 & \text{otherwise.} \end{cases} \qquad (4)$$

In this case, one can verify that the adversary's optimal decision rule is the maximum a posteriori probability (MAP) decision rule: $h^* = \arg\max_{y \in \mathcal{Y}} P(y \mid g(X, Y))$, with ties broken uniformly at random. Moreover, under the MAP decision rule, the minimax problem in (2) reduces to

$$\min_{g(\cdot)} -\Big(1 - \max_{y \in \mathcal{Y}} P(y, g(X, Y))\Big) = \min_{g(\cdot)} \max_{y \in \mathcal{Y}} P(y, g(X, Y)) - 1, \qquad (5)$$

subject to the distortion constraint. Thus, under a 0-1 loss function, the GAP formulation provides privacy guarantees against a MAP adversary.

Soft Decision Rules. Instead of a hard decision rule, we can also consider a broader class of soft decision rules where $h(g(X, Y))$ is a distribution over $\mathcal{Y}$; i.e., $h(g(X, Y)) = P_h(y \mid g(X, Y))$ for $y \in \mathcal{Y}$. In this context, we can analyze the performance under a log-loss

$$\ell(h(g(X, Y)), y) = \log \frac{1}{P_h(y \mid g(X, Y))}. \qquad (6)$$

In this case, the objective of the adversary simplifies to

$$\max_{h(\cdot)} -\mathbb{E}\left[\log \frac{1}{P_h(y \mid g(X, Y))}\right] = -H(Y \mid g(X, Y)),$$

and the maximization is attained at $P_h^*(y \mid g(X, Y)) = P(y \mid g(X, Y))$. Therefore, the optimal adversarial decision rule is determined by the true conditional distribution $P(y \mid g(X, Y))$, which we assume is known to the data holder in the game-theoretic setting. Thus, under the log-loss function, the minimax optimization problem in (2) reduces to

$$\min_{g(\cdot)} -H(Y \mid g(X, Y)) = \min_{g(\cdot)} I(g(X, Y); Y) - H(Y),$$

subject to the distortion constraint. Thus, under the log-loss in (6), GAP is equivalent to using MI as the privacy metric [12].

The 0-1 loss captures a strong guessing adversary; in contrast, log-loss or information-loss models a belief refining adversary. Next, we consider a more general α-loss function [52] that allows continuous interpolation between these extremes via

$$\ell(h(g(X, Y)), y) = \frac{\alpha}{\alpha - 1}\left(1 - P_h(y \mid g(X, Y))^{1 - \frac{1}{\alpha}}\right), \qquad (7)$$

for any $\alpha > 1$. As shown in [52], for very large α ($\alpha \to \infty$), this loss approaches that of the 0-1 (MAP) adversary.
As α decreases, the convexity of the loss function encourages the estimator $\hat{Y}$ to be probabilistic, as it increasingly rewards correct inferences of lesser and lesser likely outcomes (in contrast to a hard decision rule by a MAP adversary of the most likely outcome) conditioned on the revealed data. As $\alpha \to 1$, (7) yields the logarithmic loss, and the optimal belief $P_{\hat{Y}}$ is simply the posterior belief. Denoting $H^a_\alpha(Y \mid g(X, Y))$ as the Arimoto conditional entropy of order α, one can verify that [52]

$$\max_{h(\cdot)} -\mathbb{E}\left[\frac{\alpha}{\alpha - 1}\left(1 - P_h(y \mid g(X, Y))^{1 - \frac{1}{\alpha}}\right)\right] = -H^a_\alpha(Y \mid g(X, Y)),$$

which is achieved by an α-tilted conditional distribution [52]

$$P_h^*(y \mid g(X, Y)) = \frac{P(y \mid g(X, Y))^\alpha}{\sum_{y \in \mathcal{Y}} P(y \mid g(X, Y))^\alpha}.$$

Under this choice of a decision rule, the objective of the minimax optimization in (2) reduces to

$$\min_{g(\cdot)} -H^a_\alpha(Y \mid g(X, Y)) = \min_{g(\cdot)} I^a_\alpha(g(X, Y); Y) - H_\alpha(Y), \qquad (8)$$

where $I^a_\alpha$ is the Arimoto mutual information and $H_\alpha$ is the Rényi entropy. Note that as $\alpha \to 1$, we recover the classical MI privacy setting and when $\alpha \to \infty$, we recover the 0-1 loss.
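To make the hard and soft decision objectives of this section concrete, the sketch below evaluates them for a toy binary dataset with a known joint distribution. It is an illustration only, under assumptions that do not come from the text: the probabilities in `P_XY`, a PDI mechanism that flips X with probability `s0` (when X = 0) or `s1` (when X = 1), a Hamming distortion, and a coarse grid search standing in for an exact solution of (2). All function names are hypothetical. The MAP accuracy is computed as the sum over released values of the largest joint probability, which is the standard expression for a discrete MAP guesser.

```python
import math

# Hypothetical joint distribution P(X = x, Y = y), keyed by (x, y).
P_XY = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}


def joint_y_xhat(p_xy, s0, s1):
    """Joint P(Y = y, Xhat = xh) when X = 0 is flipped w.p. s0 and X = 1 w.p. s1."""
    flip = {0: s0, 1: s1}
    q = {(y, xh): 0.0 for y in (0, 1) for xh in (0, 1)}
    for (x, y), p in p_xy.items():
        q[(y, 1 - x)] += p * flip[x]        # X is flipped
        q[(y, x)] += p * (1.0 - flip[x])    # X is released unchanged
    return q


def map_accuracy(q):
    """Probability that a MAP adversary guesses Y correctly from Xhat."""
    return sum(max(q[(0, xh)], q[(1, xh)]) for xh in (0, 1))


def conditional_entropy(q):
    """H(Y | Xhat) in nats: expected log-loss of the optimal soft adversary."""
    h = 0.0
    for xh in (0, 1):
        p_xh = q[(0, xh)] + q[(1, xh)]
        for y in (0, 1):
            if q[(y, xh)] > 0.0:
                h -= q[(y, xh)] * math.log(q[(y, xh)] / p_xh)
    return h


def hamming_distortion(p_xy, s0, s1):
    """E[1{Xhat != X}] for the flip mechanism (assumed distortion measure)."""
    p_x0 = p_xy[(0, 0)] + p_xy[(0, 1)]
    p_x1 = p_xy[(1, 0)] + p_xy[(1, 1)]
    return p_x0 * s0 + p_x1 * s1


def best_flip_mechanism(p_xy, max_distortion, steps=20):
    """Grid search for flip probabilities that minimize MAP accuracy
    subject to the distortion budget; returns (accuracy, s0, s1)."""
    best = None
    for i in range(steps + 1):
        for j in range(steps + 1):
            s0, s1 = i / steps, j / steps
            if hamming_distortion(p_xy, s0, s1) > max_distortion + 1e-12:
                continue
            acc = map_accuracy(joint_y_xhat(p_xy, s0, s1))
            if best is None or acc < best[0]:
                best = (acc, s0, s1)
    return best
```

For this toy distribution, a zero distortion budget leaves the MAP adversary's accuracy at 0.8, while a budget of 0.5 lets the privatizer push it down to 0.5, which is what guessing from the prior of Y alone achieves.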

2.3 Data-driven GAP

So far, we have focused on a setting where the data holder has access to $P(X, Y)$. When $P(X, Y)$ is known, the data holder can simply solve the constrained minimax optimization problem in (2) (theoretical version of GAP) to obtain a privatization mechanism that would perform best against a chosen type of adversary. In the absence of $P(X, Y)$, we propose a data-driven version of GAP that allows the data holder to learn privatization mechanisms directly from a dataset of the form $\mathcal{D} = \{(x^{(i)}, y^{(i)})\}_{i=1}^{n}$. Under the data-driven version of GAP, we represent the privacy mechanism via a conditional generative model $g(X, Y; \theta_p)$ parameterized by $\theta_p$. This generative model takes $(X, Y)$ as inputs and outputs $\hat{X}$. In the training phase, the data holder learns the optimal parameters $\theta_p^*$ by competing against a computational adversary: a classifier modeled by a neural network $h(g(X, Y; \theta_p); \theta_a)$ parameterized by $\theta_a$. After convergence, we evaluate the performance of the learned $g(X, Y; \theta_p^*)$ by computing the maximal probability of inferring $Y$ under the MAP adversary studied in the theoretical version of GAP.

We note that in theory, the functions $h$ and $g$ can (in general) be arbitrary; i.e., they can capture all possible learning algorithms. However, in practice, we need to restrict them to a rich hypothesis class. Figure 3 shows an example of the GAP model in which the privatizer and adversary are modeled as multi-layer randomized neural networks. For a fixed $h$ and $g$, we quantify the adversary's empirical loss using a continuous and differentiable function

$$L_{\mathrm{EMP}}(\theta_p, \theta_a) = \frac{1}{n} \sum_{i=1}^{n} \ell(h(g(x^{(i)}, y^{(i)}; \theta_p); \theta_a), y^{(i)}), \qquad (9)$$

where $(x^{(i)}, y^{(i)})$ is the $i$-th row of $\mathcal{D}$ and $\ell(h(g(x^{(i)}, y^{(i)}; \theta_p); \theta_a), y^{(i)})$ is the adversary loss in the data-driven context. The optimal parameters for the privatizer and adversary are the solution to

$$\min_{\theta_p} \max_{\theta_a} \; -L_{\mathrm{EMP}}(\theta_p, \theta_a) \qquad (10)$$
$$\text{s.t.} \quad \mathbb{E}_{\mathcal{D}}[d(g(X, Y; \theta_p), X)] \le D,$$

where the expectation is taken over the dataset $\mathcal{D}$ and the randomness in $g$.
In keeping with the now common practice in machine learning, in the data-driven approach for GAP, one can use the empirical log-loss function [80, 95] given by (9) with

$$\ell(h(g(x^{(i)}, y^{(i)}; \theta_p); \theta_a), y^{(i)}) = -y^{(i)} \log h(g(x^{(i)}, y^{(i)}; \theta_p); \theta_a) - (1 - y^{(i)}) \log\big(1 - h(g(x^{(i)}, y^{(i)}; \theta_p); \theta_a)\big),$$

which leads to a minimum cross-entropy adversary. As a result, the empirical loss of the adversary is quantified by the cross-entropy

$$L_{\mathrm{XE}}(\theta_p, \theta_a) = -\frac{1}{n} \sum_{i=1}^{n} \Big[ y^{(i)} \log h(g(x^{(i)}, y^{(i)}; \theta_p); \theta_a) + (1 - y^{(i)}) \log\big(1 - h(g(x^{(i)}, y^{(i)}; \theta_p); \theta_a)\big) \Big]. \qquad (11)$$

An alternative loss that can be readily used in this setting is the α-loss introduced in Section 2.2. In the data-driven context, the α-loss can be written as

$$\ell(h(g(x^{(i)}, y^{(i)}; \theta_p); \theta_a), y^{(i)}) = \frac{\alpha}{\alpha - 1} \Big( y^{(i)} \big(1 - h(g(x^{(i)}, y^{(i)}; \theta_p); \theta_a)^{1 - \frac{1}{\alpha}}\big) + (1 - y^{(i)}) \big(1 - (1 - h(g(x^{(i)}, y^{(i)}; \theta_p); \theta_a))^{1 - \frac{1}{\alpha}}\big) \Big), \qquad (12)$$

for any constant $\alpha > 1$. As discussed in Section 2.2, the α-loss captures a variety of adversarial models and recovers both the log-loss (when $\alpha \to 1$) and 0-1 loss (when $\alpha \to \infty$). Furthermore, (12) suggests that α-leakage can be used as a surrogate (and smoother) loss function for the 0-1 loss (when α is relatively large).

The minimax optimization problem in (10) is a two-player non-cooperative game between the privatizer and the adversary. The strategies of the privatizer and adversary are given by $\theta_p$ and $\theta_a$, respectively. Each player chooses the strategy that optimizes its objective function w.r.t. what its opponent does. In particular, the privatizer must expect that if it chooses $\theta_p$, the adversary will choose a $\theta_a$ that maximizes the negative of its own loss function based on the choice of the privatizer. The optimal privacy mechanism is given by the equilibrium of the privatizer-adversary game.
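The empirical losses in (9), (11), and (12) can be sketched in a few lines for a binary private variable. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, and `h` stands for the adversary's output on one sample, assumed to be its estimated probability that y = 1 (strictly between 0 and 1).

```python
import math


def log_loss(h, y):
    """Per-sample log-loss: -y log h - (1 - y) log(1 - h)."""
    return -(y * math.log(h) + (1 - y) * math.log(1.0 - h))


def alpha_loss(h, y, alpha):
    """Per-sample alpha-loss in the form of (12), defined for alpha > 1."""
    if alpha <= 1.0:
        raise ValueError("alpha must be greater than 1")
    e = 1.0 - 1.0 / alpha
    return alpha / (alpha - 1.0) * (
        y * (1.0 - h ** e) + (1 - y) * (1.0 - (1.0 - h) ** e)
    )


def empirical_loss(outputs, labels, loss):
    """Average a per-sample loss over the dataset, as in (9)."""
    return sum(loss(h, y) for h, y in zip(outputs, labels)) / len(labels)
```

Calling `empirical_loss(outputs, labels, log_loss)` gives the cross-entropy in (11). Passing `lambda h, y: alpha_loss(h, y, alpha)` with α close to 1 gives nearly the same value, while a very large α gives approximately the average probability the adversary assigns to the wrong label, a smooth stand-in for the 0-1 loss.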

Figure 3: A multi-layer neural network model for the privatizer and adversary

In practice, we can learn the equilibrium of the game using an iterative algorithm presented in Algorithm 1. We first maximize the negative of the adversary's loss function in the inner loop to compute the parameters of $h$ for a fixed $g$. Then, we minimize the privatizer's loss function, which is modeled as the negative of the adversary's loss function, to compute the parameters of $g$ for a fixed $h$. To avoid over-fitting and ensure convergence, we alternate between training the adversary for $k$ epochs and training the privatizer for one epoch. This results in the adversary moving towards its optimal solution for small perturbations of the privatizer [34].

To incorporate the distortion constraint into the learning algorithm, we use the penalty method [53] and augmented Lagrangian method [24] to replace the constrained optimization problem by a series of unconstrained problems whose solutions asymptotically converge to the solution of the constrained problem. Under the penalty method, the unconstrained optimization problem is formed by adding a penalty to the objective function. The added penalty consists of a penalty parameter $\rho_t$ multiplied by a measure of violation of the constraint. The measure of violation is non-zero when the constraint is violated and is zero if the constraint is not violated. Therefore, in Algorithm 1, the constrained optimization problem of the privatizer can be approximated by a series of unconstrained optimization problems with the loss function

\[
\ell(\theta_p, \theta_a^{t+1}) = -\frac{1}{M} \sum_{i=1}^{M} \ell\big(h(g(x^{(i)}, y^{(i)}; \theta_p); \theta_a^{t+1}), y^{(i)}\big) + \rho_t \max\Big\{0, \frac{1}{M} \sum_{i=1}^{M} d\big(g(x^{(i)}, y^{(i)}; \theta_p), x^{(i)}\big) - D\Big\}, \tag{13}
\]

where $\rho_t$ is a penalty coefficient which increases with the number of iterations $t$.
For convex optimization problems, the solution to the series of unconstrained problems will eventually converge to the solution of the original constrained problem [53].

The augmented Lagrangian method is another approach to enforce equality constraints by penalizing the objective function whenever the constraints are not satisfied. Different from the penalty method, the augmented Lagrangian method combines the use of a Lagrange multiplier and a quadratic penalty term. Note that this method is designed for equality constraints. Therefore, we introduce a slack variable $\delta$ to convert the inequality distortion constraint into an equality constraint. Using the augmented Lagrangian method, the constrained optimization problem of the privatizer can be replaced by a series of unconstrained problems with the loss function given by

\[
\ell(\theta_p, \theta_a^{t+1}, \delta) = -\frac{1}{M} \sum_{i=1}^{M} \ell\big(h(g(x^{(i)}, y^{(i)}; \theta_p); \theta_a^{t+1}), y^{(i)}\big) + \frac{\rho_t}{2} \Big( \frac{1}{M} \sum_{i=1}^{M} d\big(g(x^{(i)}, y^{(i)}; \theta_p), x^{(i)}\big) + \delta - D \Big)^2 - \lambda_t \Big( \frac{1}{M} \sum_{i=1}^{M} d\big(g(x^{(i)}, y^{(i)}; \theta_p), x^{(i)}\big) + \delta - D \Big), \tag{14}
\]

where $\rho_t$ is a penalty coefficient which increases with the number of iterations $t$ and $\lambda_t$ is updated according to the rule

\[
\lambda_{t+1} = \lambda_t - \rho_t \Big( \frac{1}{M} \sum_{i=1}^{M} d\big(g(x^{(i)}, y^{(i)}; \theta_p), x^{(i)}\big) + \delta - D \Big).
\]

For convex optimization problems, the solution to the series of unconstrained problems formulated by the augmented Lagrangian method also converges to the solution of the original constrained problem [24].
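The two ways of folding the distortion constraint into the privatizer's objective can be sketched as plain functions. This is an illustrative sketch rather than the authors' implementation: `neg_adv_loss` stands for the minibatch value of the negative adversary loss, `distortion` for the minibatch average distortion, and all function names are assumptions made for this example.

```python
def penalty_objective(neg_adv_loss, distortion, D, rho):
    """Penalty-method objective in the shape of (13).

    The penalty term is zero whenever the distortion constraint holds.
    """
    return neg_adv_loss + rho * max(0.0, distortion - D)


def augmented_lagrangian_objective(neg_adv_loss, distortion, D, delta, rho, lam):
    """Augmented Lagrangian objective in the shape of (14).

    delta >= 0 is the slack variable that turns the inequality
    distortion <= D into the equality distortion + delta - D = 0.
    """
    residual = distortion + delta - D
    return neg_adv_loss + 0.5 * rho * residual ** 2 - lam * residual


def update_multiplier(lam, distortion, D, delta, rho):
    """Multiplier update: lambda_{t+1} = lambda_t - rho_t * residual."""
    return lam - rho * (distortion + delta - D)
```

In a training loop, `rho` would be increased over the outer iterations and `update_multiplier` called once per iteration; the schedule for `rho` is not specified here and would need to be chosen by the user.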

Algorithm 1 Alternating minimax privacy preserving algorithm

Input: dataset $\mathcal{D}$, distortion parameter $D$, iteration number $T$
Output: Optimal privatizer parameter $\theta_p$
procedure Alternate Minimax($\mathcal{D}$, $D$, $T$)
    Initialize $\theta_p^1$ and $\theta_a^1$
    for $t = 1, \ldots, T$ do
        Random minibatch of $M$ datapoints $\{x^{(1)}, \ldots, x^{(M)}\}$ drawn from full dataset
        Generate $\{\hat{x}^{(1)}, \ldots, \hat{x}^{(M)}\}$ via $\hat{x}^{(i)} = g(x^{(i)}, y^{(i)}; \theta_p^t)$
        Update the adversary parameter $\theta_a^{t+1}$ by stochastic gradient ascent for $k$ epochs:
            $\theta_a^{t+1} = \theta_a^t + \alpha_t \nabla_{\theta_a} \frac{1}{M} \sum_{i=1}^{M} -\ell(h(\hat{x}^{(i)}; \theta_a), y^{(i)})$, $\alpha_t > 0$
        Compute the descent direction $\nabla_{\theta_p} \ell(\theta_p, \theta_a^{t+1})$, where
            $\ell(\theta_p, \theta_a^{t+1}) = -\frac{1}{M} \sum_{i=1}^{M} \ell(h(g(x^{(i)}, y^{(i)}; \theta_p); \theta_a^{t+1}), y^{(i)})$
            subject to $\frac{1}{M} \sum_{i=1}^{M} [d(g(x^{(i)}, y^{(i)}; \theta_p), x^{(i)})] \le D$
        Perform line search along $\nabla_{\theta_p} \ell(\theta_p, \theta_a^{t+1})$ and update
            $\theta_p^{t+1} = \theta_p^t - \alpha_t \nabla_{\theta_p} \ell(\theta_p, \theta_a^{t+1})$
        Exit if solution converged
    return $\theta_p^{t+1}$

2.4 Our Focus

Our GAP framework is very general and can be used to capture many notions of privacy via various decision rules and loss functions. In the rest of this paper, we investigate GAP under 0-1 loss for two simple dataset models: (a) the binary data model (Section 3), and (b) the binary Gaussian mixture model (Section 4). Under the binary data model, both $X$ and $Y$ are binary. Under the binary Gaussian mixture model, $Y$ is binary whereas $X$ is conditionally Gaussian. We use these results to validate that the data-driven version of GAP can discover theoretically optimal privatization schemes.

In the data-driven approach of GAP, since $P(X, Y)$ is typically unknown in practice and our objective is to learn privatization schemes directly from data, we have to consider the empirical (data-driven) version of (5). Such an approach immediately hits a roadblock because taking derivatives of a 0-1 loss function w.r.t. the parameters of $h$ and $g$ is ill-defined. To circumvent this issue, similar to the common practice in the ML literature, we use the empirical log-loss (see Equation (11)) as the loss function for the adversary.
We derive game-theoretically optimal mechanisms for the 0-1 loss function, and use them as a benchmark against which we compare the performance of the data-driven GAP mechanisms.
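The alternating structure of Algorithm 1 (several ascent steps for the adversary, then one descent step for the privatizer) can be illustrated on a toy problem. The sketch below is a simplification made for illustration only: it uses deterministic gradients, a fixed step size instead of a line search, no distortion constraint, and a hypothetical scalar game $f(p, a) = -(a - p)^2 + (p - 1)^2$ whose saddle point is $p = a = 1$. None of these choices come from the paper.

```python
def alternate_minimax(grad_p, grad_a, theta_p, theta_a, T=200, k=5, lr=0.1):
    """Alternate k gradient-ascent steps on theta_a (adversary) with one
    gradient-descent step on theta_p (privatizer), for T outer iterations."""
    for _ in range(T):
        for _ in range(k):
            # Inner loop: the adversary maximizes the shared objective.
            theta_a = theta_a + lr * grad_a(theta_p, theta_a)
        # Outer step: the privatizer minimizes against the updated adversary.
        theta_p = theta_p - lr * grad_p(theta_p, theta_a)
    return theta_p, theta_a


# Hypothetical toy objective f(p, a) = -(a - p)**2 + (p - 1)**2.
def toy_grad_a(p, a):
    """Partial derivative of f with respect to a."""
    return -2.0 * (a - p)


def toy_grad_p(p, a):
    """Partial derivative of f with respect to p."""
    return 2.0 * (a - p) + 2.0 * (p - 1.0)


if __name__ == "__main__":
    print(alternate_minimax(toy_grad_p, toy_grad_a, theta_p=0.0, theta_a=0.0))
```

Running the inner loop for several steps lets the adversary track its best response before the privatizer moves, which is the role of the $k$ adversary epochs in Algorithm 1.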

3 Binary Data Model

In this section, we study a setting where both the public and private variables are binary valued random variables. Let $p_{i,j}$ denote the joint probability of $(X, Y) = (i, j)$, where $i, j \in \{0, 1\}$. To prevent an adversary from correctly inferring the private variable $Y$ from the public variable $X$, the privatizer applies a randomized mechanism on $X$ to generate the privatized data $\hat{X}$. Since both the original and privatized public variables are binary, the distortion between $x$ and $\hat{x}$ can be quantified by the Hamming distortion; i.e., $d(\hat{x}, x) = 1$ if $\hat{x} \ne x$ and $d(\hat{x}, x) = 0$ if $\hat{x} = x$. Thus, the expected distortion is given by $\mathbb{E}[d(\hat{X}, X)] = P(\hat{X} \ne X)$.

3.1 Theoretical Approach for Binary Data Model

The adversary's objective is to correctly guess $Y$ from $\hat{X}$. We consider a MAP adversary who has access to the joint distribution of $(X, Y)$ and the privacy mechanism. The privatizer's goal is to privatize $X$ in a way that minimizes the adversary's probability of correctly inferring $Y$ from $\hat{X}$ subject to the distortion constraint. We first focus on private-data dependent (PDD) privacy mechanisms that depend on both $Y$ and $X$. We later consider private-data independent (PDI) privacy mechanisms that only depend on $X$.

PDD Privacy Mechanism

Let $g(X, Y)$ denote a PDD mechanism. Since $X$, $Y$, and $\hat{X}$ are binary random variables, the mechanism $g(X, Y)$ can be represented by the conditional distribution $P(\hat{X} \mid X, Y)$ that maps the public and private variable pair $(X, Y)$ to an output $\hat{X}$ given by

\[
P(\hat{X} = 0 \mid X = 0, Y = 0) = s_{0,0}, \quad P(\hat{X} = 0 \mid X = 0, Y = 1) = s_{0,1},
\]
\[
P(\hat{X} = 1 \mid X = 1, Y = 0) = s_{1,0}, \quad P(\hat{X} = 1 \mid X = 1, Y = 1) = s_{1,1}.
\]

Thus, the marginal distribution of $\hat{X}$ is given by

\[
P(\hat{X} = 0) = \sum_{X, Y} P(\hat{X} = 0 \mid X, Y) P(X, Y) = s_{0,0} p_{0,0} + s_{0,1} p_{0,1} + (1 - s_{1,0}) p_{1,0} + (1 - s_{1,1}) p_{1,1},
\]
\[
P(\hat{X} = 1) = \sum_{X, Y} P(\hat{X} = 1 \mid X, Y) P(X, Y) = (1 - s_{0,0}) p_{0,0} + (1 - s_{0,1}) p_{0,1} + s_{1,0} p_{1,0} + s_{1,1} p_{1,1}.
\]
If $\hat{X} = 0$, the adversary's inference accuracy for guessing $\hat{Y} = 1$ is

\[
P(Y = 1, \hat{X} = 0) = \sum_{X} P(X, Y = 1) P(\hat{X} = 0 \mid X, Y = 1) = p_{1,1}(1 - s_{1,1}) + p_{0,1} s_{0,1}, \tag{15}
\]

and the inference accuracy for guessing $\hat{Y} = 0$ is

\[
P(Y = 0, \hat{X} = 0) = \sum_{X} P(X, Y = 0) P(\hat{X} = 0 \mid X, Y = 0) = p_{1,0}(1 - s_{1,0}) + p_{0,0} s_{0,0}. \tag{16}
\]

Let $s = \{s_{0,0}, s_{0,1}, s_{1,0}, s_{1,1}\}$. For $\hat{X} = 0$, the MAP adversary's inference accuracy is given by

\[
P_d^{(B)}(s, \hat{X} = 0) = \max\{P(Y = 1, \hat{X} = 0), P(Y = 0, \hat{X} = 0)\}. \tag{17}
\]

Similarly, if $\hat{X} = 1$, the MAP adversary's inference accuracy is given by

\[
P_d^{(B)}(s, \hat{X} = 1) = \max\{P(Y = 1, \hat{X} = 1), P(Y = 0, \hat{X} = 1)\}, \tag{18}
\]

where

\[
P(Y = 1, \hat{X} = 1) = \sum_{X} P(X, Y = 1) P(\hat{X} = 1 \mid X, Y = 1) = p_{1,1} s_{1,1} + p_{0,1}(1 - s_{0,1}),
\]
\[
P(Y = 0, \hat{X} = 1) = \sum_{X} P(X, Y = 0) P(\hat{X} = 1 \mid X, Y = 0) = p_{1,0} s_{1,0} + p_{0,0}(1 - s_{0,0}). \tag{19}
\]

As a result, for a fixed privacy mechanism $s$, the MAP adversary's inference accuracy can be written as

\[
P_d^{(B)} = \max_{h(\cdot)} P(h(g(X, Y)) = Y) = P_d^{(B)}(s, \hat{X} = 0) + P_d^{(B)}(s, \hat{X} = 1).
\]

Thus, the optimal PDD privacy mechanism is determined by solving

\[
\begin{aligned}
\min_{s} \quad & P_d^{(B)}(s, \hat{X} = 0) + P_d^{(B)}(s, \hat{X} = 1) \\
\text{s.t.} \quad & P(\hat{X} = 0, X = 1) + P(\hat{X} = 1, X = 0) \le D, \\
& s \in [0, 1]^4.
\end{aligned} \tag{20}
\]

Notice that the above constrained optimization problem is a four dimensional optimization problem parameterized by $p = \{p_{0,0}, p_{0,1}, p_{1,0}, p_{1,1}\}$ and $D$. Interestingly, we can formulate (20) as a linear program (LP) given by

\[
\begin{aligned}
\min_{s_{1,1}, s_{0,1}, s_{1,0}, s_{0,0}, t_0, t_1} \quad & t_0 + t_1 \\
\text{s.t.} \quad & 0 \le s_{1,1}, s_{0,1}, s_{1,0}, s_{0,0} \le 1, \\
& p_{1,1}(1 - s_{1,1}) + p_{0,1} s_{0,1} \le t_0, \\
& p_{1,0}(1 - s_{1,0}) + p_{0,0} s_{0,0} \le t_0, \\
& p_{1,1} s_{1,1} + p_{0,1}(1 - s_{0,1}) \le t_1, \\
& p_{1,0} s_{1,0} + p_{0,0}(1 - s_{0,0}) \le t_1, \\
& p_{1,1}(1 - s_{1,1}) + p_{0,1}(1 - s_{0,1}) + p_{1,0}(1 - s_{1,0}) + p_{0,0}(1 - s_{0,0}) \le D,
\end{aligned} \tag{21}
\]

where $t_0$ and $t_1$ are two slack variables representing the maxima in (17) and (18), respectively. The optimal mechanism can be obtained by numerically solving (21) using any off-the-shelf LP solver.

PDI Privacy Mechanism

In the previous section, we considered PDD privacy mechanisms. Although we were able to formulate the problem as a linear program with four variables, determining a closed form solution for such a highly parameterized problem is not analytically tractable. Thus, we now consider the simple (yet meaningful) class of PDI privacy mechanisms. Under PDI privacy mechanisms, the Markov chain $Y \to X \to \hat{X}$ holds. As a result, $P(Y, \hat{X} = \hat{x})$ can be written as

\[
P(Y, \hat{X} = \hat{x}) = \sum_{X} P(Y, \hat{X} = \hat{x} \mid X) P(X) \tag{22}
\]
\[
= \sum_{X} P(Y \mid X) P(\hat{X} = \hat{x} \mid X) P(X) \tag{23}
\]
\[
= \sum_{X} P(Y, X) P(\hat{X} = \hat{x} \mid X), \tag{24}
\]

where the second equality is due to the conditional independence property of the Markov chain $Y \to X \to \hat{X}$. For the PDI mechanisms, the privacy mechanism $g(X, Y)$ can be represented by the conditional distribution $P(\hat{X} \mid X)$.
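The objective and distortion constraint of the PDD problem (20) are simple to evaluate for any candidate mechanism. The sketch below assumes the joint distribution and the mechanism are stored in dictionaries keyed by `(x, y)`, so that `s[(x, y)]` is the probability of releasing $\hat{X} = x$ given $(X, Y) = (x, y)$; this data layout and the function names are choices made for the example.

```python
def pdd_map_accuracy(p, s):
    """MAP adversary accuracy for a PDD mechanism, following (15)-(19).

    p[(x, y)] is the joint probability P(X = x, Y = y).
    s[(x, y)] is P(Xhat = x | X = x, Y = y), the probability of keeping x.
    """
    accuracy = 0.0
    for xhat in (0, 1):
        joint = []
        for y in (0, 1):
            # P(Y = y, Xhat = xhat) = sum_x P(x, y) * P(xhat | x, y)
            total = 0.0
            for x in (0, 1):
                keep = s[(x, y)]
                total += p[(x, y)] * (keep if xhat == x else 1.0 - keep)
            joint.append(total)
        # The MAP adversary picks the more likely value of Y for this xhat.
        accuracy += max(joint)
    return accuracy


def pdd_distortion(p, s):
    """Expected Hamming distortion P(Xhat != X)."""
    return sum(p[key] * (1.0 - s[key]) for key in p)


if __name__ == "__main__":
    # Joint distribution for Y = X xor N with P(X = 1) = 0.5, N ~ Bernoulli(0.25).
    p = {(1, 1): 0.375, (0, 1): 0.125, (1, 0): 0.125, (0, 0): 0.375}
    identity = {key: 1.0 for key in p}
    print(pdd_map_accuracy(p, identity), pdd_distortion(p, identity))
```

An LP solver applied to (21) would search over `s`; the functions above only score a given mechanism, which is useful for checking a solver's output.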
To make the problem more tractable, we focus on a slightly simpler setting in which $Y = X \oplus N$, where $N \in \{0, 1\}$ is a random variable independent of $X$ and follows a Bernoulli distribution with parameter $q$. In this setting, the joint distribution of $(X, Y)$ can be computed as

\[
P(X = 1, Y = 1) = P(Y = 1 \mid X = 1) P(X = 1) = p(1 - q), \tag{25}
\]
\[
P(X = 0, Y = 1) = P(Y = 1 \mid X = 0) P(X = 0) = (1 - p) q, \tag{26}
\]
\[
P(X = 1, Y = 0) = P(Y = 0 \mid X = 1) P(X = 1) = p q, \tag{27}
\]
\[
P(X = 0, Y = 0) = P(Y = 0 \mid X = 0) P(X = 0) = (1 - p)(1 - q). \tag{28}
\]

Let $s = \{s_0, s_1\}$ in which $s_0 = P(\hat{X} = 0 \mid X = 0)$ and $s_1 = P(\hat{X} = 1 \mid X = 1)$. The joint distribution of $(Y, \hat{X})$ is given by

\[
\begin{aligned}
P(Y = 1, \hat{X} = 0) &= p(1 - q)(1 - s_1) + (1 - p) q s_0, \\
P(Y = 0, \hat{X} = 0) &= p q (1 - s_1) + (1 - p)(1 - q) s_0, \\
P(Y = 1, \hat{X} = 1) &= p(1 - q) s_1 + (1 - p) q (1 - s_0), \\
P(Y = 0, \hat{X} = 1) &= p q s_1 + (1 - p)(1 - q)(1 - s_0).
\end{aligned}
\]

Using the above joint probabilities, for a fixed $s$, we can write the MAP adversary's inference accuracy as

\[
P_d^{(B)} = \max_{h(\cdot)} P(h(g(X, Y)) = Y) = \max\{P(Y = 1, \hat{X} = 0), P(Y = 0, \hat{X} = 0)\} + \max\{P(Y = 1, \hat{X} = 1), P(Y = 0, \hat{X} = 1)\}. \tag{29}
\]

Therefore, the optimal PDI privacy mechanism is given by the solution to

\[
\begin{aligned}
\min_{s} \quad & P_d^{(B)} \\
\text{s.t.} \quad & P(\hat{X} = 0, X = 1) + P(\hat{X} = 1, X = 0) \le D, \\
& s \in [0, 1]^2,
\end{aligned} \tag{30}
\]

where the distortion in (30) is given by $(1 - s_0)(1 - p) + (1 - s_1) p$. By (29), $P_d^{(B)}$ can be considered as a sum of two functions, where each function is a maximum of two linear functions. Therefore, it is convex in $s_0$ and $s_1$ for different values of $p$, $q$ and $D$.

Theorem 1. For fixed $p$, $q$ and $D$, there exist infinitely many PDI privacy mechanisms that achieve the optimal privacy-utility tradeoff. If $q = \frac{1}{2}$, any privacy mechanism that satisfies $\{s_0, s_1 \mid p s_1 + (1 - p) s_0 \ge 1 - D,\ s_0, s_1 \in [0, 1]\}$ is optimal. If $q \ne \frac{1}{2}$, the optimal PDI privacy mechanism is given as follows:

If $1 - D > \max\{p, 1 - p\}$, the optimal privacy mechanism is given by $\{s_0, s_1 \mid p s_1 + (1 - p) s_0 = 1 - D,\ s_0, s_1 \in [0, 1]\}$. The adversary's accuracy of correctly guessing the private variable is

\[
\begin{cases}
(1 - 2q)(1 - D) + q & \text{if } q < \frac{1}{2}, \\
(2q - 1)(1 - D) + 1 - q & \text{if } q > \frac{1}{2}.
\end{cases} \tag{31}
\]

Otherwise, the optimal privacy mechanism is given by $\{s_0, s_1 \mid \max\{\min\{p, 1 - p\}, 1 - D\} \le p s_1 + (1 - p) s_0 \le \max\{p, 1 - p\},\ s_0, s_1 \in [0, 1]\}$ and the adversary's accuracy of correctly guessing the private variable is

\[
\begin{cases}
p(1 - q) + (1 - p) q & \text{if } p \ge \frac{1}{2},\ q < \frac{1}{2} \text{ or } p \le \frac{1}{2},\ q > \frac{1}{2}, \\
p q + (1 - p)(1 - q) & \text{if } p \ge \frac{1}{2},\ q > \frac{1}{2} \text{ or } p \le \frac{1}{2},\ q < \frac{1}{2}.
\end{cases} \tag{32}
\]

Proof sketch: The proof of Theorem 1 is provided in Appendix A. We briefly sketch the proof details here. For the special case $q = \frac{1}{2}$, the solution is trivial since the private variable $Y$ is independent of the public variable $X$.
Thus, the optimal solution is given by any $s_0, s_1$ that satisfies the distortion constraint $\{s_0, s_1 \mid p s_1 + (1 - p) s_0 \ge 1 - D,\ s_0, s_1 \in [0, 1]\}$. For $q \ne \frac{1}{2}$, we separate the optimization problem in (30) into four subproblems based on the decision of the adversary. We then compute the optimal privacy mechanism of the privatizer in each subproblem. Summarizing the optimal solutions to the subproblems for different values of $p$, $q$ and $D$ yields Theorem 1.

Remark: Note that if $1 - D > \max\{p, 1 - p\}$, i.e., $D < \min\{p, 1 - p\}$, the privacy guarantee achieved by the optimal PDI mechanism (the MAP adversary's accuracy of correctly guessing the private variable) decreases linearly with $D$. For $D \ge \min\{p, 1 - p\}$, the optimal PDI mechanism achieves a constant privacy guarantee regardless of $D$. However, in this case, the privatizer can just use the optimal privacy mechanism with $D = \min\{p, 1 - p\}$ to optimize the privacy guarantee without further sacrificing utility.
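The closed-form accuracies in Theorem 1 can be compared against a brute-force search over PDI mechanisms. The following sketch assumes the setting $Y = X \oplus N$ with $P(X = 1) = p$ and $N \sim \mathrm{Bernoulli}(q)$; the grid resolution and function names are illustrative choices, and the grid search is only a sanity check, not the method used in the paper.

```python
def pdi_map_accuracy(p, q, s0, s1):
    """MAP adversary accuracy in (29) for the mechanism (s0, s1)."""
    y1_x0 = p * (1 - q) * (1 - s1) + (1 - p) * q * s0
    y0_x0 = p * q * (1 - s1) + (1 - p) * (1 - q) * s0
    y1_x1 = p * (1 - q) * s1 + (1 - p) * q * (1 - s0)
    y0_x1 = p * q * s1 + (1 - p) * (1 - q) * (1 - s0)
    return max(y1_x0, y0_x0) + max(y1_x1, y0_x1)


def theorem1_accuracy(p, q, D):
    """Optimal MAP accuracy stated in Theorem 1, equations (31) and (32)."""
    if 1 - D > max(p, 1 - p):
        if q < 0.5:
            return (1 - 2 * q) * (1 - D) + q
        return (2 * q - 1) * (1 - D) + 1 - q
    # Saturated regime: the two expressions in (32) are P(Y = 1) and P(Y = 0),
    # and the case split in (32) selects the larger of the two.
    return max(p * (1 - q) + (1 - p) * q, p * q + (1 - p) * (1 - q))


def grid_min_accuracy(p, q, D, steps=20):
    """Smallest MAP accuracy over a grid of feasible (s0, s1) pairs."""
    best = 1.0
    for i in range(steps + 1):
        for j in range(steps + 1):
            s0, s1 = i / steps, j / steps
            distortion = (1 - s0) * (1 - p) + (1 - s1) * p
            if distortion <= D + 1e-12:  # small tolerance for rounding
                best = min(best, pdi_map_accuracy(p, q, s0, s1))
    return best


if __name__ == "__main__":
    for p, q, D in [(0.5, 0.25, 0.25), (0.75, 0.25, 0.1), (0.75, 0.25, 0.4)]:
        print(p, q, D, theorem1_accuracy(p, q, D), grid_min_accuracy(p, q, D))
```

For the parameter pairs used later in Section 3.3, the two computations agree at the tested distortion levels, including one point in the saturated regime $D \ge \min\{p, 1 - p\}$.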

Figure 4: Neural network structure of the privatizer and adversary for binary data model

3.2 Data-driven Approach for Binary Data Model

In practice, the joint distribution of $(X, Y)$ is often unknown to the data holder. Instead, the data holder has access to a dataset $\mathcal{D}$, which is used to learn a good privatization mechanism in a generative adversarial fashion. In the training phase, the data holder learns the parameters of the conditional generative model (representing the privatization scheme) by competing against a computational adversary represented by a neural network. The details of both neural networks are provided later in this section. When convergence is reached, we evaluate the performance of the learned privatization scheme by computing the accuracy of inferring $Y$ under a strong MAP adversary that: (a) has access to the joint distribution of $(X, Y)$, (b) has knowledge of the learned privacy mechanism, and (c) can compute the MAP rule. Ultimately, the data holder's hope is to learn a privatization scheme that matches the one obtained under the game-theoretic framework, where both the adversary and privatizer are assumed to have access to $P(X, Y)$. To evaluate our data-driven approach, we compare the mechanisms learned in an adversarial fashion on $\mathcal{D}$ with the game-theoretically optimal ones.

Since the private variable $Y$ is binary, we use the empirical log-loss function for the adversary (see Equation (11)). For a fixed $\theta_p$, the adversary learns the optimal $\theta_a^*$ by maximizing $-L_{XE}(h(g(X, Y; \theta_p); \theta_a), Y)$ given in Equation (11). For a fixed $\theta_a$, the privatizer learns the optimal $\theta_p^*$ by minimizing $-L_{XE}(h(g(X, Y; \theta_p); \theta_a), Y)$ subject to the distortion constraint (see Equation (10)). Since both $X$ and $Y$ are binary variables, we can use the privatizer parameter $\theta_p$ to represent the privacy mechanism $s$ directly.
For the adversary, we define $\theta_a = (\theta_{a,0}, \theta_{a,1})$, where $\theta_{a,0} = P(Y = 0 \mid \hat{X} = 0)$ and $\theta_{a,1} = P(Y = 1 \mid \hat{X} = 1)$. Thus, given a privatized public variable input $g(x^{(i)}, y^{(i)}; \theta_p) \in \{0, 1\}$, the output belief of the adversary guessing $y^{(i)} = 1$ can be written as $(1 - \theta_{a,0})(1 - g(x^{(i)}, y^{(i)}; \theta_p)) + \theta_{a,1}\, g(x^{(i)}, y^{(i)}; \theta_p)$.

For PDD privacy mechanisms, we have $\theta_p = s = \{s_{0,0}, s_{0,1}, s_{1,0}, s_{1,1}\}$. Given the fact that both $x^{(i)}$ and $y^{(i)}$ are binary, we use two simple neural networks to model the privatizer and the adversary. As shown in Figure 4, the privatizer is modeled as a two-layer neural network parameterized by $s$, while the adversary is modeled as a two-layer neural network classifier. From the perspective of the privatizer, the belief of an adversary guessing $y^{(i)} = 1$ conditioned on the input $(x^{(i)}, y^{(i)})$ is given by

\[
h(g(x^{(i)}, y^{(i)}; s); \theta_a) = \theta_{a,1} P(\hat{x}^{(i)} = 1) + (1 - \theta_{a,0}) P(\hat{x}^{(i)} = 0), \tag{33}
\]

where

\[
P(\hat{x}^{(i)} = 1) = x^{(i)} y^{(i)} s_{1,1} + (1 - x^{(i)}) y^{(i)} (1 - s_{0,1}) + x^{(i)} (1 - y^{(i)}) s_{1,0} + (1 - x^{(i)})(1 - y^{(i)})(1 - s_{0,0}),
\]
\[
P(\hat{x}^{(i)} = 0) = x^{(i)} y^{(i)} (1 - s_{1,1}) + (1 - x^{(i)}) y^{(i)} s_{0,1} + x^{(i)} (1 - y^{(i)})(1 - s_{1,0}) + (1 - x^{(i)})(1 - y^{(i)}) s_{0,0}.
\]

Furthermore, the expected distortion is given by

\[
\mathbb{E}_{\mathcal{D}}[d(g(X, Y; s), X)] = \frac{1}{n} \sum_{i=1}^{n} \big[ x^{(i)} y^{(i)} (1 - s_{1,1}) + x^{(i)} (1 - y^{(i)})(1 - s_{1,0}) + (1 - x^{(i)}) y^{(i)} (1 - s_{0,1}) + (1 - x^{(i)})(1 - y^{(i)})(1 - s_{0,0}) \big]. \tag{34}
\]

Similar to the PDD case, we can also compute the belief of guessing $y^{(i)} = 1$ conditional on the input $(x^{(i)}, y^{(i)})$ for the PDI schemes. Observe that in the PDI case, $\theta_p = s = \{s_0, s_1\}$. Therefore, we have

\[
h(g(x^{(i)}, y^{(i)}; s); \theta_a) = \theta_{a,1} \big[x^{(i)} s_1 + (1 - x^{(i)})(1 - s_0)\big] + (1 - \theta_{a,0}) \big[(1 - x^{(i)}) s_0 + x^{(i)} (1 - s_1)\big]. \tag{35}
\]

Under PDI schemes, the expected distortion is given by

\[
\mathbb{E}_{\mathcal{D}}[d(g(X, Y; s), X)] = \frac{1}{n} \sum_{i=1}^{n} \big[x^{(i)} (1 - s_1) + (1 - x^{(i)})(1 - s_0)\big]. \tag{36}
\]

Thus, we can use Algorithm 1 proposed in Section 2.3 to learn the optimal PDD and PDI privacy mechanisms from the dataset.

3.3 Illustration of Results

We now evaluate our proposed GAP framework using synthetic datasets. We focus on the setting in which $Y = X \oplus N$, where $N \in \{0, 1\}$ is a random variable independent of $X$ and follows a Bernoulli distribution with parameter $q$. We generate two synthetic datasets with $(p, q)$ equal to $(0.75, 0.25)$ and $(0.5, 0.25)$, respectively. Each synthetic dataset used in this experiment contains 10,000 training samples and 2,000 test samples. We use TensorFlow [2] to train both the privatizer and the adversary using the Adam optimizer with a learning rate of 0.01 and a minibatch size of 200. The distortion constraint is enforced by the penalty method provided in (13).

Figure 5: Privacy-distortion tradeoff for binary data model. (a) Performance of privacy mechanisms against MAP adversary for p = 0.5 (accuracy vs. distortion). (b) Performance of privacy mechanisms against MAP adversary for p = 0.75 (accuracy vs. distortion). (c) Performance of privacy mechanisms under MI privacy metric for p = 0.5 (privacy loss in bits vs. distortion). (d) Performance of privacy mechanisms under MI privacy metric for p = 0.75 (privacy loss in bits vs. distortion).

Figure 5a illustrates the performance of both optimal PDD and PDI privacy mechanisms against a strong theoretical MAP adversary when $(p, q) = (0.5, 0.25)$.
It can be seen that the inference accuracy of the MAP adversary decreases as the distortion increases for both optimal PDD and PDI privacy mechanisms. As one would expect, the PDD privacy mechanism achieves a lower inference accuracy for the adversary, i.e., better privacy, than the PDI mechanism. Furthermore, when the distortion is higher than some threshold, the inference accuracy of the MAP adversary saturates regardless of the distortion. This is due to the fact that the correlation between the private variable and the privatized public variable cannot be further reduced once the distortion is larger than the saturation threshold. Therefore, increasing distortion will not further reduce the accuracy of the MAP adversary. We also observe that the privacy mechanism obtained via the data-driven approach performs very well when pitted against the MAP adversary (maximum accuracy difference around 3% compared to the theoretical approach). In other words, for the binary data model, the data-driven version of GAP can yield privacy mechanisms that perform as well as the mechanisms computed under the theoretical version of GAP, which assumes that the privatizer has access to the underlying distribution of the dataset.

Figure 5b shows the performance of both optimal PDD and PDI privacy mechanisms against the MAP adversary for $(p, q) = (0.75, 0.25)$. Similar to the equal prior case, we observe that both PDD and PDI privacy mechanisms reduce the accuracy of the MAP adversary as the distortion increases and saturate when the distortion goes above a certain threshold. It can be seen that the saturation thresholds for both PDD and PDI privacy mechanisms in Figure 5b are lower than in the equal prior case plotted in Figure 5a. The reason is that when $(p, q) = (0.75, 0.25)$, the correlation between $Y$ and $X$ is weaker than in the equal prior case. Therefore, it requires less distortion to achieve the same privacy. We also observe that the performance of the GAP mechanism obtained via the data-driven approach is comparable to the mechanism computed via the theoretical approach.

The performance of the GAP mechanism obtained using the log-loss function (i.e., MI privacy) is plotted in Figures 5c and 5d.
Similar to the MAP adversary case, as the distortion increases, the mutual information between the private variable and the privatized public variable achieved by the optimal PDD and PDI mechanisms decreases as long as the distortion is below some threshold. When the distortion goes above the threshold, the optimal privacy mechanism is able to make the private variable and the privatized public variable independent regardless of the distortion. Furthermore, the values of the saturation thresholds are very close to what we observe in Figures 5a and 5b.

4 Binary Gaussian Mixture Model

Thus far, we have studied a simple binary dataset model. In many real datasets, the sample space of variables often takes more than just two possible values. It is well known that the Gaussian distribution is a flexible approximation for many distributions [89]. Therefore, in this section, we study a setting where $Y \in \{0, 1\}$ and $X$ is a Gaussian random variable whose mean and variance are dependent on $Y$. Without loss of generality, let $\mathbb{E}[X \mid Y = 1] = -\mathbb{E}[X \mid Y = 0] = \mu$ and $P(Y = 1) = p$. Thus, $X \mid Y = 0 \sim \mathcal{N}(-\mu, \sigma_0^2)$ and $X \mid Y = 1 \sim \mathcal{N}(\mu, \sigma_1^2)$.

Similar to the binary data model, we study two privatization schemes: (a) private-data independent (PDI) schemes (where $\hat{X} = g(X)$), and (b) private-data dependent (PDD) schemes (where $\hat{X} = g(X, Y)$). In order to have a tractable model for the privatizer, we assume $g(X, Y)$ is realized by adding an affine function of an independently generated random noise to the public variable $X$. The affine function enables controlling both the mean and variance of the privatized data. In particular, we consider $g(X, Y) = X + (1 - Y)\beta_0 - Y\beta_1 + (1 - Y)\gamma_0 N + Y\gamma_1 N$, in which $N$ is a one dimensional random variable and $\beta_0, \beta_1, \gamma_0, \gamma_1$ are constant parameters. The goal of the privatizer is to sanitize the public data $X$ subject to the distortion constraint $\mathbb{E}_{\hat{X}, X} \|\hat{X} - X\|_2^2 \le D$.
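The noise-adding mechanism and its distortion can be simulated directly. The sketch below assumes standard Gaussian noise $N$ and compares the empirical mean squared distortion with the value $p(\beta_1^2 + \gamma_1^2) + (1 - p)(\beta_0^2 + \gamma_0^2)$ that follows from taking the expectation of $(\hat{X} - X)^2$ under each value of $Y$. The parameter values, sample size and function names are arbitrary choices for illustration.

```python
import random


def privatize(x, y, beta0, beta1, gamma0, gamma1, rng):
    """Apply g(X, Y) = X + (1-Y)*beta0 - Y*beta1 + (1-Y)*gamma0*N + Y*gamma1*N."""
    noise = rng.gauss(0.0, 1.0)  # assumption: N is standard Gaussian
    shift = (1 - y) * beta0 - y * beta1
    scale = (1 - y) * gamma0 + y * gamma1
    return x + shift + scale * noise


def sample_pair(p, mu, sigma0, sigma1, rng):
    """Draw (x, y) from the binary Gaussian mixture model."""
    y = 1 if rng.random() < p else 0
    x = rng.gauss(mu, sigma1) if y == 1 else rng.gauss(-mu, sigma0)
    return x, y


def empirical_distortion(params, p, mu, sigma0, sigma1, n_samples=100_000, seed=0):
    """Monte Carlo estimate of E[(Xhat - X)^2]."""
    beta0, beta1, gamma0, gamma1 = params
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x, y = sample_pair(p, mu, sigma0, sigma1, rng)
        xhat = privatize(x, y, beta0, beta1, gamma0, gamma1, rng)
        total += (xhat - x) ** 2
    return total / n_samples


def analytic_distortion(params, p):
    """Closed-form E[(Xhat - X)^2] for standard Gaussian noise."""
    beta0, beta1, gamma0, gamma1 = params
    return p * (beta1 ** 2 + gamma1 ** 2) + (1 - p) * (beta0 ** 2 + gamma0 ** 2)


if __name__ == "__main__":
    params = (1.0, 0.5, 0.5, 1.0)
    print(empirical_distortion(params, 0.75, 3.0, 1.0, 1.0))
    print(analytic_distortion(params, 0.75))
```

Note that the distortion depends only on the mechanism parameters and the prior $p$, not on $\mu$ or the variances of $X$, since $\hat{X} - X$ is the added perturbation itself.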
4.1 Theoretical Approach for Binary Gaussian Mixture Model

We now investigate the theoretical approach under which both the privatizer and the adversary have access to $P(X, Y)$. To make the problem more tractable, let us consider a slightly simpler setting in which $\sigma_0 = \sigma_1 = \sigma$. We will relax this assumption later when we take a data-driven approach. We further assume that $N$ is a standard Gaussian random variable. One might, rightfully, question our choice of focusing on adding (potentially $Y$-dependent) Gaussian noise. Though other distributions can be considered, our approach is motivated by the following two reasons:

(a) Even though it is known that adding Gaussian noise is not the worst case noise adding mechanism for non-Gaussian $X$ [74], identifying the optimal noise distribution is mathematically intractable. Thus, for tractability and ease of analysis, we choose Gaussian noise.

(b) Adding Gaussian noise to each data entry preserves the conditional Gaussianity of the released dataset.

In what follows, we will analyze a variety of PDI and PDD mechanisms.

PDI Gaussian Noise Adding Privacy Mechanism

We consider a PDI noise adding privatization scheme which adds an affine function of the standard Gaussian noise to the public variable. Since the privacy mechanism is PDI, we have $g(X, Y) = X + \beta + \gamma N$, where $\beta$ and $\gamma$ are constant parameters and $N \sim \mathcal{N}(0, 1)$. Using the classical Gaussian hypothesis testing analysis [83], it is straightforward to verify that the optimal inference accuracy (i.e., probability of detection) of the MAP adversary is given by

\[
P_d^{(G)} = p\, Q\Big(-\frac{\alpha}{2} + \frac{1}{\alpha} \ln\Big(\frac{1 - p}{p}\Big)\Big) + (1 - p)\, Q\Big(-\frac{\alpha}{2} - \frac{1}{\alpha} \ln\Big(\frac{1 - p}{p}\Big)\Big), \tag{37}
\]

where $\alpha = \frac{2\mu}{\sqrt{\gamma^2 + \sigma^2}}$ and $Q(x) = \int_x^{\infty} \frac{1}{\sqrt{2\pi}} \exp\big(-\frac{u^2}{2}\big)\, du$. Moreover, since $\mathbb{E}_{\hat{X}, X}[d(\hat{X}, X)] = \beta^2 + \gamma^2$, the distortion constraint is equivalent to $\beta^2 + \gamma^2 \le D$.

Theorem 2. For a PDI Gaussian noise adding privatization scheme given by $g(X, Y) = X + \beta + \gamma N$, with $\beta \in \mathbb{R}$ and $\gamma \ge 0$, the optimal parameters are given by

\[
\beta^* = 0, \quad \gamma^* = \sqrt{D}. \tag{38}
\]

Let $\alpha^* = \frac{2\mu}{\sqrt{D + \sigma^2}}$. For this optimal scheme, the accuracy of the MAP adversary is

\[
P_d^{(G)*} = p\, Q\Big(-\frac{\alpha^*}{2} + \frac{1}{\alpha^*} \ln\Big(\frac{1 - p}{p}\Big)\Big) + (1 - p)\, Q\Big(-\frac{\alpha^*}{2} - \frac{1}{\alpha^*} \ln\Big(\frac{1 - p}{p}\Big)\Big). \tag{39}
\]

The proof of Theorem 2 is provided in Appendix B. We observe that the PDI Gaussian noise adding privatization scheme which minimizes the inference accuracy of the MAP adversary with distortion upper-bounded by $D$ is to add a zero-mean Gaussian noise with variance $D$.

PDD Gaussian Noise Adding Privacy Mechanism

For PDD privatization schemes, we first consider a simple case in which $\gamma_0 = \gamma_1 = 0$. Without loss of generality, we assume that both $\beta_0$ and $\beta_1$ are non-negative. The privatized data is given by $\hat{X} = X + (1 - Y)\beta_0 - Y\beta_1$. This is a PDD mechanism since $\hat{X}$ depends on both $X$ and $Y$. Intuitively, this mechanism privatizes the data by shifting the two Gaussian distributions (under $Y = 0$ and $Y = 1$) closer to each other.
Under this mechanism, it is easy to show that the adversary's MAP probability of inferring the private variable $Y$ from $\hat{X}$ is given by $P_d^{(G)}$ in (37) with $\alpha = \frac{2\mu - (\beta_1 + \beta_0)}{\sigma}$. Observe that since $d(\hat{X}, X) = ((1 - Y)\beta_0 - Y\beta_1)^2$, we have $\mathbb{E}_{\hat{X}, X}[d(\hat{X}, X)] = (1 - p)\beta_0^2 + p\beta_1^2$. Thus, the distortion constraint implies $(1 - p)\beta_0^2 + p\beta_1^2 \le D$.

Theorem 3. For a PDD privatization scheme given by $g(X, Y) = X + (1 - Y)\beta_0 - Y\beta_1$, $\beta_0, \beta_1 \ge 0$, the optimal parameters are given by

\[
\beta_0^* = \sqrt{\frac{pD}{1 - p}}, \quad \beta_1^* = \sqrt{\frac{(1 - p)D}{p}}. \tag{40}
\]

For this optimal PDD privatization scheme, the accuracy of the MAP adversary is given by (37) with $\alpha = \frac{2\mu - \big(\sqrt{\frac{(1 - p)D}{p}} + \sqrt{\frac{pD}{1 - p}}\big)}{\sigma}$.

The proof of Theorem 3 is provided in Appendix C. When $P(Y = 1) = P(Y = 0) = \frac{1}{2}$, we have $\beta_0 = \beta_1 = \sqrt{D}$, which implies that the optimal privacy mechanism for this particular case is to shift the two Gaussian distributions closer to each other equally by $\sqrt{D}$ regardless of the variance $\sigma^2$. When $P(Y = 1) = p > \frac{1}{2}$, the Gaussian distribution with a lower prior probability, in this case, $X \mid Y = 0$, gets shifted $\frac{p}{1 - p}$ times more than $X \mid Y = 1$.

Next, we consider a slightly more complicated case in which $\gamma_0 = \gamma_1 = \gamma \ge 0$. Thus, the privacy mechanism is given by $g(X, Y) = X + (1 - Y)\beta_0 - Y\beta_1 + \gamma N$, where $N \sim \mathcal{N}(0, 1)$. Intuitively, this mechanism privatizes the data by shifting the two Gaussian distributions (under $Y = 0$ and $Y = 1$) closer to each other and adding another Gaussian noise $N \sim \mathcal{N}(0, 1)$ scaled by a constant $\gamma$. In this case, the MAP probability of inferring the private variable $Y$ from $\hat{X}$ is given by (37) with $\alpha = \frac{2\mu - (\beta_1 + \beta_0)}{\sqrt{\gamma^2 + \sigma^2}}$. Furthermore, the distortion constraint is equivalent to $(1 - p)\beta_0^2 + p\beta_1^2 + \gamma^2 \le D$.

Theorem 4. For a PDD privatization scheme given by $g(X, Y) = X + (1 - Y)\beta_0 - Y\beta_1 + \gamma N$ with $\beta_0, \beta_1, \gamma \ge 0$, the optimal parameters $\beta_0^*, \beta_1^*, \gamma^*$ are given by the solution to

\[
\begin{aligned}
\min_{\beta_0, \beta_1, \gamma} \quad & \frac{2\mu - \beta_0 - \beta_1}{\sqrt{\gamma^2 + \sigma^2}} \\
\text{s.t.} \quad & (1 - p)\beta_0^2 + p\beta_1^2 + \gamma^2 \le D, \\
& \beta_0, \beta_1, \gamma \ge 0.
\end{aligned} \tag{41}
\]

Using this optimal scheme, the accuracy of the MAP adversary is given by (37) with $\alpha = \frac{2\mu - \beta_0^* - \beta_1^*}{\sqrt{(\gamma^*)^2 + \sigma^2}}$.

Proof. Similar to the proofs of Theorems 2 and 3, we can compute the derivative of $P_d^{(G)}$ w.r.t. $\alpha$. It is easy to verify that $P_d^{(G)}$ is monotonically increasing with $\alpha$. Therefore, the optimal mechanism is given by the solution to (41). Substituting the optimal parameters into (37) yields the MAP probability of inferring the private variable $Y$ from $\hat{X}$.

Remark: Note that the objective function in (41) only depends on $\beta_0 + \beta_1$ and $\gamma$. We define $\beta = \beta_0 + \beta_1$. Thus, the above objective function can be written as

\[
\min_{\beta, \gamma} \; \frac{2\mu - \beta}{\sqrt{\gamma^2 + \sigma^2}}. \tag{42}
\]

It is straightforward to verify that the determinant of the Hessian of (42) is always non-positive. Therefore, the above optimization problem is non-convex in $\beta$ and $\gamma$.

Finally, we consider the PDD Gaussian noise adding privatization scheme given by $g(X, Y) = X + (1 - Y)\beta_0 - Y\beta_1 + (1 - Y)\gamma_0 N + Y\gamma_1 N$, where $N \sim \mathcal{N}(0, 1)$. This PDD mechanism is the most general one in the Gaussian noise adding setting and includes the two previous mechanisms. The objective of the privatizer is to minimize the adversary's probability of correctly inferring $Y$ from $g(X, Y)$ subject to the distortion constraint given by $p((\beta_1)^2 + (\gamma_1)^2) + (1 - p)((\beta_0)^2 + (\gamma_0)^2) \le D$.
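For the equal-variance mechanisms analyzed so far, the MAP accuracy in (37) and the optimal parameters of Theorems 2 and 3 are easy to evaluate numerically. The sketch below uses the complementary error function from Python's standard library for $Q(\cdot)$; the function names are illustrative, and the code assumes $\alpha > 0$, i.e., the two conditional means have not been shifted past each other.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = P(N > x) for N ~ N(0, 1)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def map_accuracy(p, alpha):
    """MAP adversary accuracy in (37) for normalized separation alpha > 0."""
    if alpha <= 0:
        raise ValueError("this sketch assumes alpha > 0")
    t = math.log((1.0 - p) / p) / alpha
    return p * q_function(-alpha / 2.0 + t) + (1.0 - p) * q_function(-alpha / 2.0 - t)


def pdi_optimal_accuracy(p, mu, sigma, D):
    """Theorem 2: beta* = 0 and gamma* = sqrt(D)."""
    return map_accuracy(p, 2.0 * mu / math.sqrt(D + sigma ** 2))


def pdd_shift_parameters(p, D):
    """Theorem 3: optimal shifts (beta0*, beta1*) when gamma0 = gamma1 = 0."""
    return math.sqrt(p * D / (1.0 - p)), math.sqrt((1.0 - p) * D / p)


def pdd_shift_accuracy(p, mu, sigma, D):
    """MAP accuracy under the optimal shift-only PDD mechanism."""
    beta0, beta1 = pdd_shift_parameters(p, D)
    return map_accuracy(p, (2.0 * mu - (beta0 + beta1)) / sigma)


if __name__ == "__main__":
    p, mu, sigma, D = 0.5, 3.0, 1.0, 4.0
    print("PDI:", pdi_optimal_accuracy(p, mu, sigma, D))
    print("PDD (shift only):", pdd_shift_accuracy(p, mu, sigma, D))
```

With equal priors, $\mu = 3$, $\sigma = 1$ and $D = 4$, the shift-only PDD mechanism yields a lower adversary accuracy than the optimal PDI mechanism, in line with the expectation that access to $Y$ helps the privatizer.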
As we have discussed in the remark after Theorem 4, the problem becomes non-convex even for the simpler case in which $\gamma_0 = \gamma_1 = \gamma$. In order to obtain the optimal parameters for this case, we first show that the optimal privacy mechanism lies on the boundary of the distortion constraint.

Proposition 1. For the privacy mechanism given by $g(X, Y) = X + (1 - Y)\beta_0 - Y\beta_1 + (1 - Y)\gamma_0 N + Y\gamma_1 N$, the optimal parameters $\beta_0^*, \beta_1^*, \gamma_0^*, \gamma_1^*$ satisfy $p((\beta_1^*)^2 + (\gamma_1^*)^2) + (1 - p)((\beta_0^*)^2 + (\gamma_0^*)^2) = D$.

Proof. We prove the above statement by contradiction. Assume that the optimal parameters satisfy $p((\beta_1^*)^2 + (\gamma_1^*)^2) + (1 - p)((\beta_0^*)^2 + (\gamma_0^*)^2) < D$. Let $\tilde{\beta}_1 = \beta_1^* + c$, where $c > 0$ is chosen so that $p((\tilde{\beta}_1)^2 + (\gamma_1^*)^2) + (1 - p)((\beta_0^*)^2 + (\gamma_0^*)^2) = D$. Since the inference accuracy is monotonically decreasing with $\beta_1$, the resultant inference accuracy can only be lower when $\beta_1^*$ is replaced with $\tilde{\beta}_1$. This contradicts the assumption that $p((\beta_1^*)^2 + (\gamma_1^*)^2) + (1 - p)((\beta_0^*)^2 + (\gamma_0^*)^2) < D$. Using the same type of analysis, we can show that any parameter that deviates from $p((\beta_1^*)^2 + (\gamma_1^*)^2) + (1 - p)((\beta_0^*)^2 + (\gamma_0^*)^2) = D$ is suboptimal.

Let $e_0^2 = (\beta_0^*)^2 + (\gamma_0^*)^2$ and $e_1^2 = (\beta_1^*)^2 + (\gamma_1^*)^2$. Since the optimal parameters of the privatizer lie on the boundary of the distortion constraint, we have $p e_1^2 + (1 - p) e_0^2 = D$. This implies $(e_0, e_1)$ lies on the boundary of an ellipse parametrized by $p$ and $D$. Thus, we have $e_1 = \sqrt{\frac{D}{p}}\, \frac{1 - \epsilon^2}{1 + \epsilon^2}$ and $e_0 = \sqrt{\frac{D}{1 - p}}\, \frac{2\epsilon}{1 + \epsilon^2}$, where $\epsilon \in [0, 1]$. Therefore, the optimal parameters satisfy

\[
(\beta_0^*)^2 + (\gamma_0^*)^2 = \Big[\sqrt{\frac{D}{1 - p}}\, \frac{2\epsilon}{1 + \epsilon^2}\Big]^2, \quad (\beta_1^*)^2 + (\gamma_1^*)^2 = \Big[\sqrt{\frac{D}{p}}\, \frac{1 - \epsilon^2}{1 + \epsilon^2}\Big]^2. \tag{43}
\]

This implies $(\beta_i^*, \gamma_i^*)$, $i \in \{0, 1\}$ lie on the boundary of two circles parametrized by $D$, $p$ and $\epsilon$. Thus, we can write $\beta_0^*, \beta_1^*, \gamma_0^*, \gamma_1^*$ as

\[
\begin{aligned}
\beta_0^* &= \sqrt{\frac{D}{1 - p}}\, \frac{2\epsilon}{1 + \epsilon^2}\, \frac{1 - w_0^2}{1 + w_0^2}, &
\beta_1^* &= \sqrt{\frac{D}{p}}\, \frac{1 - \epsilon^2}{1 + \epsilon^2}\, \frac{1 - w_1^2}{1 + w_1^2}, \\
\gamma_0^* &= \sqrt{\frac{D}{1 - p}}\, \frac{4 \epsilon w_0}{(1 + \epsilon^2)(1 + w_0^2)}, &
\gamma_1^* &= \sqrt{\frac{D}{p}}\, \frac{2 (1 - \epsilon^2) w_1}{(1 + \epsilon^2)(1 + w_1^2)},
\end{aligned} \tag{44}
\]

where $\epsilon, w_0, w_1 \in [0, 1]$. The optimal parameters $\beta_0^*, \beta_1^*, \gamma_0^*, \gamma_1^*$ can be computed by a grid search in the cube parametrized by $\epsilon, w_0, w_1 \in [0, 1]$ that minimizes the accuracy of the MAP adversary. In the following section, we will use this general PDD Gaussian noise adding privatization scheme in our data-driven simulations and compare the performance of the privacy mechanisms obtained by both theoretical and data-driven approaches.

4.2 Data-driven Approach for Binary Gaussian Mixture Model

To illustrate our data-driven GAP approach, we assume the privatizer only has access to the dataset $\mathcal{D}$ but does not know the joint distribution of $(X, Y)$. Finding the optimal privacy mechanism becomes a learning problem. In the training phase, we use the empirical log-loss function $L_{XE}(h(g(X, Y; \theta_p); \theta_a), Y)$ provided in (11) for the adversary. Thus, for a fixed privatizer parameter $\theta_p$, the adversary learns the optimal parameter $\theta_a^*$ that maximizes $-L_{XE}(h(g(X, Y; \theta_p); \theta_a), Y)$. On the other hand, the optimal parameter for the privacy mechanism is obtained by solving (10). After convergence, we use the learned data-driven GAP mechanism to compute the accuracy of inferring the private variable under a strong MAP adversary. We evaluate our data-driven approach by comparing the mechanisms learned in an adversarial fashion on $\mathcal{D}$ with the game-theoretically optimal ones in which both the adversary and privatizer are assumed to have access to $P(X, Y)$.

We consider the PDD Gaussian noise adding privacy mechanism given by $g(X, Y) = X + (1 - Y)\beta_0 - Y\beta_1 + (1 - Y)\gamma_0 N + Y\gamma_1 N$.

Figure 6: Neural network structure of the privatizer and adversary for binary Gaussian mixture model
Similar to the binary setting, we use two neural networks to model the privatizer and the adversary. As shown in Figure 6, the privatizer is modeled by a two-layer neural network with parameters $\beta_0, \beta_1, \gamma_0, \gamma_1 \in \mathbb{R}$. The adversary, whose goal is to infer $Y$ from the privatized data $\hat{X}$, is modeled by a three-layer neural network classifier with leaky ReLU activations. The random noise is drawn from a standard Gaussian distribution, $N \sim \mathcal{N}(0, 1)$.

To enforce the distortion constraint, we use the augmented Lagrangian method to penalize the learning objective when the constraint is not satisfied. In the binary Gaussian mixture model setting, the augmented Lagrangian method uses two parameters, $\lambda_t$ and $\rho_t$, to approximate the constrained optimization problem by a series of unconstrained problems. Intuitively, a large value of $\rho_t$ forces the distortion constraint to be binding, whereas $\lambda_t$ is an estimate of the Lagrange multiplier. To obtain the optimal solution of the constrained optimization problem, we solve a series of unconstrained problems given by (14).
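The role of the multiplier estimate and the penalty weight can be illustrated on a one-dimensional toy problem. The sketch below is an assumption-laden illustration and does not reproduce the exact objective in (14). It uses the textbook augmented Lagrangian for an inequality constraint, keeps the penalty weight `rho` fixed for simplicity, and replaces the adversary's loss with the stand-in objective $-\beta$ (a larger shift is better for the privatizer) under the constraint $\beta^2 \le D$. All names are hypothetical.

```python
def augmented_lagrangian_demo(D=4.0, rho=1.0, outer_iters=20, inner_iters=500, lr=0.05):
    """Minimize -beta subject to beta**2 <= D with the method of multipliers.

    Each outer iteration solves the unconstrained surrogate
        -beta + (max(0, lam + rho * c)**2 - lam**2) / (2 * rho),  c = beta**2 - D,
    by plain gradient descent, then updates the multiplier estimate lam.
    """
    beta, lam = 0.0, 0.0
    for _ in range(outer_iters):
        for _ in range(inner_iters):
            c = beta**2 - D                      # constraint violation (<= 0 is feasible)
            m = max(0.0, lam + rho * c)          # active part of the penalty
            grad = -1.0 + m * 2.0 * beta         # derivative of the surrogate w.r.t. beta
            beta -= lr * grad
        lam = max(0.0, lam + rho * (beta**2 - D))  # multiplier update
    return beta, lam


beta_opt, lam_opt = augmented_lagrangian_demo()
print(beta_opt, lam_opt)  # approaches beta = sqrt(D) and lam = 1 / (2 * sqrt(D))
```

In this toy problem the solution sits on the boundary of the constraint, mirroring Proposition 1, and the multiplier estimate converges to the value required by the stationarity condition $-1 + 2\lambda\beta = 0$.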

Table 1: Synthetic datasets

Dataset   $P(Y = 1)$   $X \mid Y = 0$          $X \mid Y = 1$
1         0.5          $\mathcal{N}(-3, 1)$    $\mathcal{N}(3, 1)$
2         0.5          $\mathcal{N}(-3, 4)$    $\mathcal{N}(3, 1)$
3         0.75         $\mathcal{N}(-3, 1)$    $\mathcal{N}(3, 1)$
4         0.75         $\mathcal{N}(-3, 4)$    $\mathcal{N}(3, 1)$

4.3 Illustration of Results

We use synthetic datasets to evaluate our proposed GAP framework. We consider the four synthetic datasets shown in Table 1. Each synthetic dataset used in this experiment contains 20,000 training samples and 2,000 test samples. We use TensorFlow to train both the privatizer and the adversary, using the Adam optimizer with a learning rate of 0.01 and a minibatch size of 200.

Figure 7: Privacy-distortion tradeoff for the binary Gaussian mixture model. (a) Performance of PDD mechanisms against the MAP adversary for $p = 0.5$. (b) Performance of PDD mechanisms against the MAP adversary for $p = 0.75$. (Both panels plot accuracy against distortion.)

Figures 7a and 7b illustrate the performance of the optimal PDD Gaussian noise adding mechanisms against the strong theoretical MAP adversary when $P(Y = 1) = 0.5$ and $P(Y = 1) = 0.75$, respectively. The optimal mechanisms obtained by both the theoretical and data-driven approaches reduce the inference accuracy of the MAP adversary as the distortion increases. Similar to the binary data model, we observe that the accuracy of the adversary saturates once the distortion crosses some threshold. Moreover, for the binary Gaussian mixture setting, the privacy mechanism obtained through the data-driven approach also performs very well when pitted against the MAP adversary (the maximum accuracy difference is around 6% compared with the theoretical approach). In other words, for the binary Gaussian mixture model, the data-driven approach for GAP can generate privacy mechanisms that are comparable in performance to those of the theoretical approach, which assumes that the privatizer has access to the underlying distribution of the data.
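A dataset of the kind described above can be generated with a few lines of NumPy. This is only a sketch: `make_dataset` and its arguments are hypothetical names, the seeds are arbitrary, and the second argument of $\mathcal{N}(\cdot, \cdot)$ is assumed to be a variance (so $\mathcal{N}(-3, 4)$ has standard deviation 2).

```python
import numpy as np


def make_dataset(p, mean0, var0, mean1, var1, n, seed=0):
    """Draw n samples (x, y) with P(Y = 1) = p and Gaussian X given Y."""
    rng = np.random.default_rng(seed)
    y = (rng.random(n) < p).astype(np.int64)          # private variable
    mean = np.where(y == 1, mean1, mean0)
    std = np.sqrt(np.where(y == 1, var1, var0))
    x = mean + std * rng.standard_normal(n)           # public variable
    return x, y


# Unequal-variance case with P(Y = 1) = 0.75: 20,000 training and 2,000 test samples.
x_train, y_train = make_dataset(0.75, -3.0, 4.0, 3.0, 1.0, n=20_000, seed=0)
x_test, y_test = make_dataset(0.75, -3.0, 4.0, 3.0, 1.0, n=2_000, seed=1)
```

Using different seeds for the training and test splits keeps the two sets independent.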
Figures 8 to 13 show the privatization schemes for the different datasets. The intuition behind this Gaussian noise adding mechanism is to shift the distributions of $X \mid Y = 0$ and $X \mid Y = 1$ closer together and to scale their variances in order to preserve privacy. When $P(Y = 0) = P(Y = 1)$ and $\sigma_0 = \sigma_1$, the privatizer shifts and scales the two distributions almost equally. Furthermore, the resulting $\hat{X} \mid Y = 0$ and $\hat{X} \mid Y = 1$ have very similar distributions. We also observe that if $P(Y = 0) \neq P(Y = 1)$, the public variable whose corresponding private variable has the lower prior probability gets shifted more. When $\sigma_0 \neq \sigma_1$, the public variable with the lower variance gets scaled more.

The optimal privacy mechanisms obtained via the data-driven approach for the different datasets are presented in Tables 2 to 5. In each table, $D$ is the maximum allowable distortion, and $\beta_0$, $\beta_1$, $\gamma_0$, and $\gamma_1$ are the parameters of the privatizer neural network. These learned parameters dictate the statistical model of the privatizer, which is used to sanitize the dataset. We use acc to denote the inference accuracy of the adversary on a test dataset and xent to denote the converged cross-entropy of the adversary. The column titled distance represents the average distortion $\mathbb{E}_{\mathcal{D}}\|X - \hat{X}\|^2$ that results from sanitizing the test dataset via the learned privatization scheme. $P_{\text{detect}}$ is the MAP adversary's inference accuracy under the learned privatization scheme, assuming that the adversary: (a) has access to the joint distribution of $(X, Y)$, (b) has knowledge of the learned privatization scheme, and (c) can compute the MAP rule. $P_{\text{detect-theory}}$ is the lowest inference accuracy we would get if the privatizer had access to the joint distribution of $(X, Y)$ and used this information to compute the parameters of the privatization scheme with the grid-search approach that precedes Section 4.2.

Figure 8: Raw test samples, equal variance.

Figure 9: Prior $P(Y = 1) = 0.5$, $X \mid Y = 1 \sim \mathcal{N}(3, 1)$, $X \mid Y = 0 \sim \mathcal{N}(-3, 1)$. Panels: (a) $D = 1$, (b) $D = 3$, (c) $D = 8$.

Figure 10: Prior $P(Y = 1) = 0.75$, $X \mid Y = 1 \sim \mathcal{N}(3, 1)$, $X \mid Y = 0 \sim \mathcal{N}(-3, 1)$. Panels: (a) $D = 1$, (b) $D = 3$, (c) $D = 8$.
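The quantity reported in the distance column can be reproduced for any fixed parameter setting by applying the mechanism to samples and averaging the squared perturbation. The sketch below is illustrative only: the parameter values are arbitrary rather than learned, the function names are hypothetical, and the mechanism is taken to be $\hat{X} = X + (1 - Y)\beta_0 - Y\beta_1 + (1 - Y)\gamma_0 N + Y\gamma_1 N$ with $N \sim \mathcal{N}(0, 1)$.

```python
import numpy as np


def sanitize(x, y, beta0, beta1, gamma0, gamma1, rng):
    """Apply the PDD Gaussian noise adding mechanism sample by sample."""
    noise = rng.standard_normal(x.shape)  # N ~ N(0, 1), independent of (x, y)
    return (x + (1 - y) * beta0 - y * beta1
            + (1 - y) * gamma0 * noise + y * gamma1 * noise)


def empirical_distortion(x, x_hat):
    """Average squared distance between raw and sanitized samples."""
    return float(np.mean((x - x_hat) ** 2))


rng = np.random.default_rng(0)
n, p = 50_000, 0.75
y = (rng.random(n) < p).astype(np.int64)
x = np.where(y == 1, 3.0, -3.0) + rng.standard_normal(n)   # X | Y ~ N(+-3, 1)

beta0, beta1, gamma0, gamma1 = 1.0, 0.5, 0.8, 0.3           # arbitrary illustrative values
x_hat = sanitize(x, y, beta0, beta1, gamma0, gamma1, rng)

distance = empirical_distortion(x, x_hat)
expected = (1 - p) * (beta0**2 + gamma0**2) + p * (beta1**2 + gamma1**2)
print(distance, expected)  # the two values should be close for large n
```

The comparison with `expected` reflects the left-hand side of the distortion constraint, $p(\beta_1^2 + \gamma_1^2) + (1 - p)(\beta_0^2 + \gamma_0^2)$, which is what the privatizer must keep at or below $D$.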

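Figure 11: Raw test samples, unequal variance.

Figure 12: Prior $P(Y = 1) = 0.5$, $X \mid Y = 1 \sim \mathcal{N}(3, 1)$, $X \mid Y = 0 \sim \mathcal{N}(-3, 4)$. Panels: (a) $D = 1$, (b) $D = 3$, (c) $D = 8$.

Figure 13: Prior $P(Y = 1) = 0.75$, $X \mid Y = 1 \sim \mathcal{N}(3, 1)$, $X \mid Y = 0 \sim \mathcal{N}(-3, 4)$. Panels: (a) $D = 1$, (b) $D = 3$, (c) $D = 8$.

Table 2: Prior $P(Y = 1) = 0.5$, $X \mid Y = 1 \sim \mathcal{N}(3, 1)$, $X \mid Y = 0 \sim \mathcal{N}(-3, 1)$

$D$   $\beta_0$   $\beta_1$   $\gamma_0$   $\gamma_1$   acc   xent   distance   $P_{\text{detect}}$   $P_{\text{detect-theory}}$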

Lecture Introduction. 2 Examples of Measure Concentration. 3 The Johnson-Lindenstrauss Lemma. CS-621 Theory Gems November 28, 2012

Lecture Introduction. 2 Examples of Measure Concentration. 3 The Johnson-Lindenstrauss Lemma. CS-621 Theory Gems November 28, 2012 CS-6 Theory Gems November 8, 0 Lecture Lecturer: Alesaner Mąry Scribes: Alhussein Fawzi, Dorina Thanou Introuction Toay, we will briefly iscuss an important technique in probability theory measure concentration

More information

7.1 Support Vector Machine

7.1 Support Vector Machine 67577 Intro. to Machine Learning Fall semester, 006/7 Lecture 7: Support Vector Machines an Kernel Functions II Lecturer: Amnon Shashua Scribe: Amnon Shashua 7. Support Vector Machine We return now to

More information

Survey Sampling. 1 Design-based Inference. Kosuke Imai Department of Politics, Princeton University. February 19, 2013

Survey Sampling. 1 Design-based Inference. Kosuke Imai Department of Politics, Princeton University. February 19, 2013 Survey Sampling Kosuke Imai Department of Politics, Princeton University February 19, 2013 Survey sampling is one of the most commonly use ata collection methos for social scientists. We begin by escribing

More information

Least-Squares Regression on Sparse Spaces

Least-Squares Regression on Sparse Spaces Least-Squares Regression on Sparse Spaces Yuri Grinberg, Mahi Milani Far, Joelle Pineau School of Computer Science McGill University Montreal, Canaa {ygrinb,mmilan1,jpineau}@cs.mcgill.ca 1 Introuction

More information

A PAC-Bayesian Approach to Spectrally-Normalized Margin Bounds for Neural Networks

A PAC-Bayesian Approach to Spectrally-Normalized Margin Bounds for Neural Networks A PAC-Bayesian Approach to Spectrally-Normalize Margin Bouns for Neural Networks Behnam Neyshabur, Srinah Bhojanapalli, Davi McAllester, Nathan Srebro Toyota Technological Institute at Chicago {bneyshabur,

More information

An Optimal Algorithm for Bandit and Zero-Order Convex Optimization with Two-Point Feedback

An Optimal Algorithm for Bandit and Zero-Order Convex Optimization with Two-Point Feedback Journal of Machine Learning Research 8 07) - Submitte /6; Publishe 5/7 An Optimal Algorithm for Banit an Zero-Orer Convex Optimization with wo-point Feeback Oha Shamir Department of Computer Science an

More information

Lower Bounds for the Smoothed Number of Pareto optimal Solutions

Lower Bounds for the Smoothed Number of Pareto optimal Solutions Lower Bouns for the Smoothe Number of Pareto optimal Solutions Tobias Brunsch an Heiko Röglin Department of Computer Science, University of Bonn, Germany brunsch@cs.uni-bonn.e, heiko@roeglin.org Abstract.

More information

How to Minimize Maximum Regret in Repeated Decision-Making

How to Minimize Maximum Regret in Repeated Decision-Making How to Minimize Maximum Regret in Repeate Decision-Making Karl H. Schlag July 3 2003 Economics Department, European University Institute, Via ella Piazzuola 43, 033 Florence, Italy, Tel: 0039-0-4689, email:

More information

A Course in Machine Learning

A Course in Machine Learning A Course in Machine Learning Hal Daumé III 12 EFFICIENT LEARNING So far, our focus has been on moels of learning an basic algorithms for those moels. We have not place much emphasis on how to learn quickly.

More information

Relative Entropy and Score Function: New Information Estimation Relationships through Arbitrary Additive Perturbation

Relative Entropy and Score Function: New Information Estimation Relationships through Arbitrary Additive Perturbation Relative Entropy an Score Function: New Information Estimation Relationships through Arbitrary Aitive Perturbation Dongning Guo Department of Electrical Engineering & Computer Science Northwestern University

More information

Linear First-Order Equations

Linear First-Order Equations 5 Linear First-Orer Equations Linear first-orer ifferential equations make up another important class of ifferential equations that commonly arise in applications an are relatively easy to solve (in theory)

More information

Cascaded redundancy reduction

Cascaded redundancy reduction Network: Comput. Neural Syst. 9 (1998) 73 84. Printe in the UK PII: S0954-898X(98)88342-5 Cascae reunancy reuction Virginia R e Sa an Geoffrey E Hinton Department of Computer Science, University of Toronto,

More information

Expected Value of Partial Perfect Information

Expected Value of Partial Perfect Information Expecte Value of Partial Perfect Information Mike Giles 1, Takashi Goa 2, Howar Thom 3 Wei Fang 1, Zhenru Wang 1 1 Mathematical Institute, University of Oxfor 2 School of Engineering, University of Tokyo

More information

Influence of weight initialization on multilayer perceptron performance

Influence of weight initialization on multilayer perceptron performance Influence of weight initialization on multilayer perceptron performance M. Karouia (1,2) T. Denœux (1) R. Lengellé (1) (1) Université e Compiègne U.R.A. CNRS 817 Heuiasyc BP 649 - F-66 Compiègne ceex -

More information

6 General properties of an autonomous system of two first order ODE

6 General properties of an autonomous system of two first order ODE 6 General properties of an autonomous system of two first orer ODE Here we embark on stuying the autonomous system of two first orer ifferential equations of the form ẋ 1 = f 1 (, x 2 ), ẋ 2 = f 2 (, x

More information

Estimating Causal Direction and Confounding Of Two Discrete Variables

Estimating Causal Direction and Confounding Of Two Discrete Variables Estimating Causal Direction an Confouning Of Two Discrete Variables This inspire further work on the so calle aitive noise moels. Hoyer et al. (2009) extene Shimizu s ientifiaarxiv:1611.01504v1 [stat.ml]

More information

Calculus and optimization

Calculus and optimization Calculus an optimization These notes essentially correspon to mathematical appenix 2 in the text. 1 Functions of a single variable Now that we have e ne functions we turn our attention to calculus. A function

More information

Necessary and Sufficient Conditions for Sketched Subspace Clustering

Necessary and Sufficient Conditions for Sketched Subspace Clustering Necessary an Sufficient Conitions for Sketche Subspace Clustering Daniel Pimentel-Alarcón, Laura Balzano 2, Robert Nowak University of Wisconsin-Maison, 2 University of Michigan-Ann Arbor Abstract This

More information

The Exact Form and General Integrating Factors

The Exact Form and General Integrating Factors 7 The Exact Form an General Integrating Factors In the previous chapters, we ve seen how separable an linear ifferential equations can be solve using methos for converting them to forms that can be easily

More information

Computing Exact Confidence Coefficients of Simultaneous Confidence Intervals for Multinomial Proportions and their Functions

Computing Exact Confidence Coefficients of Simultaneous Confidence Intervals for Multinomial Proportions and their Functions Working Paper 2013:5 Department of Statistics Computing Exact Confience Coefficients of Simultaneous Confience Intervals for Multinomial Proportions an their Functions Shaobo Jin Working Paper 2013:5

More information

IPA Derivatives for Make-to-Stock Production-Inventory Systems With Backorders Under the (R,r) Policy

IPA Derivatives for Make-to-Stock Production-Inventory Systems With Backorders Under the (R,r) Policy IPA Derivatives for Make-to-Stock Prouction-Inventory Systems With Backorers Uner the (Rr) Policy Yihong Fan a Benamin Melame b Yao Zhao c Yorai Wari Abstract This paper aresses Infinitesimal Perturbation

More information

Proof of SPNs as Mixture of Trees

Proof of SPNs as Mixture of Trees A Proof of SPNs as Mixture of Trees Theorem 1. If T is an inuce SPN from a complete an ecomposable SPN S, then T is a tree that is complete an ecomposable. Proof. Argue by contraiction that T is not a

More information

Time-of-Arrival Estimation in Non-Line-Of-Sight Environments

Time-of-Arrival Estimation in Non-Line-Of-Sight Environments 2 Conference on Information Sciences an Systems, The Johns Hopkins University, March 2, 2 Time-of-Arrival Estimation in Non-Line-Of-Sight Environments Sinan Gezici, Hisashi Kobayashi an H. Vincent Poor

More information

The Principle of Least Action

The Principle of Least Action Chapter 7. The Principle of Least Action 7.1 Force Methos vs. Energy Methos We have so far stuie two istinct ways of analyzing physics problems: force methos, basically consisting of the application of

More information

u!i = a T u = 0. Then S satisfies

u!i = a T u = 0. Then S satisfies Deterministic Conitions for Subspace Ientifiability from Incomplete Sampling Daniel L Pimentel-Alarcón, Nigel Boston, Robert D Nowak University of Wisconsin-Maison Abstract Consier an r-imensional subspace

More information

Admin BACKPROPAGATION. Neural network. Neural network 11/3/16. Assignment 7. Assignment 8 Goals today. David Kauchak CS158 Fall 2016

Admin BACKPROPAGATION. Neural network. Neural network 11/3/16. Assignment 7. Assignment 8 Goals today. David Kauchak CS158 Fall 2016 Amin Assignment 7 Assignment 8 Goals toay BACKPROPAGATION Davi Kauchak CS58 Fall 206 Neural network Neural network inputs inputs some inputs are provie/ entere Iniviual perceptrons/ neurons Neural network

More information

Equilibrium in Queues Under Unknown Service Times and Service Value

Equilibrium in Queues Under Unknown Service Times and Service Value University of Pennsylvania ScholarlyCommons Finance Papers Wharton Faculty Research 1-2014 Equilibrium in Queues Uner Unknown Service Times an Service Value Laurens Debo Senthil K. Veeraraghavan University

More information

A Review of Multiple Try MCMC algorithms for Signal Processing

A Review of Multiple Try MCMC algorithms for Signal Processing A Review of Multiple Try MCMC algorithms for Signal Processing Luca Martino Image Processing Lab., Universitat e València (Spain) Universia Carlos III e Mari, Leganes (Spain) Abstract Many applications

More information

Robust Forward Algorithms via PAC-Bayes and Laplace Distributions. ω Q. Pr (y(ω x) < 0) = Pr A k

Robust Forward Algorithms via PAC-Bayes and Laplace Distributions. ω Q. Pr (y(ω x) < 0) = Pr A k A Proof of Lemma 2 B Proof of Lemma 3 Proof: Since the support of LL istributions is R, two such istributions are equivalent absolutely continuous with respect to each other an the ivergence is well-efine

More information

Improving Estimation Accuracy in Nonrandomized Response Questioning Methods by Multiple Answers

Improving Estimation Accuracy in Nonrandomized Response Questioning Methods by Multiple Answers International Journal of Statistics an Probability; Vol 6, No 5; September 207 ISSN 927-7032 E-ISSN 927-7040 Publishe by Canaian Center of Science an Eucation Improving Estimation Accuracy in Nonranomize

More information

Homework 2 Solutions EM, Mixture Models, PCA, Dualitys

Homework 2 Solutions EM, Mixture Models, PCA, Dualitys Homewor Solutions EM, Mixture Moels, PCA, Dualitys CMU 0-75: Machine Learning Fall 05 http://www.cs.cmu.eu/~bapoczos/classes/ml075_05fall/ OUT: Oct 5, 05 DUE: Oct 9, 05, 0:0 AM An EM algorithm for a Mixture

More information

Topic 7: Convergence of Random Variables

Topic 7: Convergence of Random Variables Topic 7: Convergence of Ranom Variables Course 003, 2016 Page 0 The Inference Problem So far, our starting point has been a given probability space (S, F, P). We now look at how to generate information

More information

LATTICE-BASED D-OPTIMUM DESIGN FOR FOURIER REGRESSION

LATTICE-BASED D-OPTIMUM DESIGN FOR FOURIER REGRESSION The Annals of Statistics 1997, Vol. 25, No. 6, 2313 2327 LATTICE-BASED D-OPTIMUM DESIGN FOR FOURIER REGRESSION By Eva Riccomagno, 1 Rainer Schwabe 2 an Henry P. Wynn 1 University of Warwick, Technische

More information

Online Appendix for Trade Policy under Monopolistic Competition with Firm Selection

Online Appendix for Trade Policy under Monopolistic Competition with Firm Selection Online Appenix for Trae Policy uner Monopolistic Competition with Firm Selection Kyle Bagwell Stanfor University an NBER Seung Hoon Lee Georgia Institute of Technology September 6, 2018 In this Online

More information

Modeling the effects of polydispersity on the viscosity of noncolloidal hard sphere suspensions. Paul M. Mwasame, Norman J. Wagner, Antony N.

Modeling the effects of polydispersity on the viscosity of noncolloidal hard sphere suspensions. Paul M. Mwasame, Norman J. Wagner, Antony N. Submitte to the Journal of Rheology Moeling the effects of polyispersity on the viscosity of noncolloial har sphere suspensions Paul M. Mwasame, Norman J. Wagner, Antony N. Beris a) epartment of Chemical

More information

'HVLJQ &RQVLGHUDWLRQ LQ 0DWHULDO 6HOHFWLRQ 'HVLJQ 6HQVLWLYLW\,1752'8&7,21

'HVLJQ &RQVLGHUDWLRQ LQ 0DWHULDO 6HOHFWLRQ 'HVLJQ 6HQVLWLYLW\,1752'8&7,21 Large amping in a structural material may be either esirable or unesirable, epening on the engineering application at han. For example, amping is a esirable property to the esigner concerne with limiting

More information

Calculus of Variations

Calculus of Variations 16.323 Lecture 5 Calculus of Variations Calculus of Variations Most books cover this material well, but Kirk Chapter 4 oes a particularly nice job. x(t) x* x*+ αδx (1) x*- αδx (1) αδx (1) αδx (1) t f t

More information

Optimization of Geometries by Energy Minimization

Optimization of Geometries by Energy Minimization Optimization of Geometries by Energy Minimization by Tracy P. Hamilton Department of Chemistry University of Alabama at Birmingham Birmingham, AL 3594-140 hamilton@uab.eu Copyright Tracy P. Hamilton, 1997.

More information

Level Construction of Decision Trees in a Partition-based Framework for Classification

Level Construction of Decision Trees in a Partition-based Framework for Classification Level Construction of Decision Trees in a Partition-base Framework for Classification Y.Y. Yao, Y. Zhao an J.T. Yao Department of Computer Science, University of Regina Regina, Saskatchewan, Canaa S4S

More information

Lecture 2 Lagrangian formulation of classical mechanics Mechanics

Lecture 2 Lagrangian formulation of classical mechanics Mechanics Lecture Lagrangian formulation of classical mechanics 70.00 Mechanics Principle of stationary action MATH-GA To specify a motion uniquely in classical mechanics, it suffices to give, at some time t 0,

More information

Lecture 2: Correlated Topic Model

Lecture 2: Correlated Topic Model Probabilistic Moels for Unsupervise Learning Spring 203 Lecture 2: Correlate Topic Moel Inference for Correlate Topic Moel Yuan Yuan First of all, let us make some claims about the parameters an variables

More information

Estimation of the Maximum Domination Value in Multi-Dimensional Data Sets

Estimation of the Maximum Domination Value in Multi-Dimensional Data Sets Proceeings of the 4th East-European Conference on Avances in Databases an Information Systems ADBIS) 200 Estimation of the Maximum Domination Value in Multi-Dimensional Data Sets Eleftherios Tiakas, Apostolos.

More information

A Novel Decoupled Iterative Method for Deep-Submicron MOSFET RF Circuit Simulation

A Novel Decoupled Iterative Method for Deep-Submicron MOSFET RF Circuit Simulation A Novel ecouple Iterative Metho for eep-submicron MOSFET RF Circuit Simulation CHUAN-SHENG WANG an YIMING LI epartment of Mathematics, National Tsing Hua University, National Nano evice Laboratories, an

More information

Analyzing Tensor Power Method Dynamics in Overcomplete Regime

Analyzing Tensor Power Method Dynamics in Overcomplete Regime Journal of Machine Learning Research 18 (2017) 1-40 Submitte 9/15; Revise 11/16; Publishe 4/17 Analyzing Tensor Power Metho Dynamics in Overcomplete Regime Animashree Ananumar Department of Electrical

More information

Capacity Analysis of MIMO Systems with Unknown Channel State Information

Capacity Analysis of MIMO Systems with Unknown Channel State Information Capacity Analysis of MIMO Systems with Unknown Channel State Information Jun Zheng an Bhaskar D. Rao Dept. of Electrical an Computer Engineering University of California at San Diego e-mail: juzheng@ucs.eu,

More information

A Sketch of Menshikov s Theorem

A Sketch of Menshikov s Theorem A Sketch of Menshikov s Theorem Thomas Bao March 14, 2010 Abstract Let Λ be an infinite, locally finite oriente multi-graph with C Λ finite an strongly connecte, an let p

More information

TEMPORAL AND TIME-FREQUENCY CORRELATION-BASED BLIND SOURCE SEPARATION METHODS. Yannick DEVILLE

TEMPORAL AND TIME-FREQUENCY CORRELATION-BASED BLIND SOURCE SEPARATION METHODS. Yannick DEVILLE TEMPORAL AND TIME-FREQUENCY CORRELATION-BASED BLIND SOURCE SEPARATION METHODS Yannick DEVILLE Université Paul Sabatier Laboratoire Acoustique, Métrologie, Instrumentation Bât. 3RB2, 8 Route e Narbonne,

More information

Chapter 6: Energy-Momentum Tensors

Chapter 6: Energy-Momentum Tensors 49 Chapter 6: Energy-Momentum Tensors This chapter outlines the general theory of energy an momentum conservation in terms of energy-momentum tensors, then applies these ieas to the case of Bohm's moel.

More information

Local Linear ICA for Mutual Information Estimation in Feature Selection

Local Linear ICA for Mutual Information Estimation in Feature Selection Local Linear ICA for Mutual Information Estimation in Feature Selection Tian Lan, Deniz Erogmus Department of Biomeical Engineering, OGI, Oregon Health & Science University, Portlan, Oregon, USA E-mail:

More information

On the Aloha throughput-fairness tradeoff

On the Aloha throughput-fairness tradeoff On the Aloha throughput-fairness traeoff 1 Nan Xie, Member, IEEE, an Steven Weber, Senior Member, IEEE Abstract arxiv:1605.01557v1 [cs.it] 5 May 2016 A well-known inner boun of the stability region of

More information

Parameter estimation: A new approach to weighting a priori information

Parameter estimation: A new approach to weighting a priori information Parameter estimation: A new approach to weighting a priori information J.L. Mea Department of Mathematics, Boise State University, Boise, ID 83725-555 E-mail: jmea@boisestate.eu Abstract. We propose a

More information

Similarity Measures for Categorical Data A Comparative Study. Technical Report

Similarity Measures for Categorical Data A Comparative Study. Technical Report Similarity Measures for Categorical Data A Comparative Stuy Technical Report Department of Computer Science an Engineering University of Minnesota 4-92 EECS Builing 200 Union Street SE Minneapolis, MN

More information

This module is part of the. Memobust Handbook. on Methodology of Modern Business Statistics

This module is part of the. Memobust Handbook. on Methodology of Modern Business Statistics This moule is part of the Memobust Hanbook on Methoology of Moern Business Statistics 26 March 2014 Metho: Balance Sampling for Multi-Way Stratification Contents General section... 3 1. Summary... 3 2.

More information

WUCHEN LI AND STANLEY OSHER

WUCHEN LI AND STANLEY OSHER CONSTRAINED DYNAMICAL OPTIMAL TRANSPORT AND ITS LAGRANGIAN FORMULATION WUCHEN LI AND STANLEY OSHER Abstract. We propose ynamical optimal transport (OT) problems constraine in a parameterize probability

More information

Lower bounds on Locality Sensitive Hashing

Lower bounds on Locality Sensitive Hashing Lower bouns on Locality Sensitive Hashing Rajeev Motwani Assaf Naor Rina Panigrahy Abstract Given a metric space (X, X ), c 1, r > 0, an p, q [0, 1], a istribution over mappings H : X N is calle a (r,

More information

Tutorial on Maximum Likelyhood Estimation: Parametric Density Estimation

Tutorial on Maximum Likelyhood Estimation: Parametric Density Estimation Tutorial on Maximum Likelyhoo Estimation: Parametric Density Estimation Suhir B Kylasa 03/13/2014 1 Motivation Suppose one wishes to etermine just how biase an unfair coin is. Call the probability of tossing

More information

SYNCHRONOUS SEQUENTIAL CIRCUITS

SYNCHRONOUS SEQUENTIAL CIRCUITS CHAPTER SYNCHRONOUS SEUENTIAL CIRCUITS Registers an counters, two very common synchronous sequential circuits, are introuce in this chapter. Register is a igital circuit for storing information. Contents

More information

BEYOND THE CONSTRUCTION OF OPTIMAL SWITCHING SURFACES FOR AUTONOMOUS HYBRID SYSTEMS. Mauro Boccadoro Magnus Egerstedt Paolo Valigi Yorai Wardi

BEYOND THE CONSTRUCTION OF OPTIMAL SWITCHING SURFACES FOR AUTONOMOUS HYBRID SYSTEMS. Mauro Boccadoro Magnus Egerstedt Paolo Valigi Yorai Wardi BEYOND THE CONSTRUCTION OF OPTIMAL SWITCHING SURFACES FOR AUTONOMOUS HYBRID SYSTEMS Mauro Boccaoro Magnus Egerstet Paolo Valigi Yorai Wari {boccaoro,valigi}@iei.unipg.it Dipartimento i Ingegneria Elettronica

More information

Optimal Signal Detection for False Track Discrimination

Optimal Signal Detection for False Track Discrimination Optimal Signal Detection for False Track Discrimination Thomas Hanselmann Darko Mušicki Dept. of Electrical an Electronic Eng. Dept. of Electrical an Electronic Eng. The University of Melbourne The University

More information

A Modification of the Jarque-Bera Test. for Normality

A Modification of the Jarque-Bera Test. for Normality Int. J. Contemp. Math. Sciences, Vol. 8, 01, no. 17, 84-85 HIKARI Lt, www.m-hikari.com http://x.oi.org/10.1988/ijcms.01.9106 A Moification of the Jarque-Bera Test for Normality Moawa El-Fallah Ab El-Salam

More information

arxiv: v5 [cs.lg] 28 Mar 2017

arxiv: v5 [cs.lg] 28 Mar 2017 Equilibrium Propagation: Briging the Gap Between Energy-Base Moels an Backpropagation Benjamin Scellier an Yoshua Bengio * Université e Montréal, Montreal Institute for Learning Algorithms March 3, 217

More information

Generalizing Kronecker Graphs in order to Model Searchable Networks

Generalizing Kronecker Graphs in order to Model Searchable Networks Generalizing Kronecker Graphs in orer to Moel Searchable Networks Elizabeth Boine, Babak Hassibi, Aam Wierman California Institute of Technology Pasaena, CA 925 Email: {eaboine, hassibi, aamw}@caltecheu

More information

Agmon Kolmogorov Inequalities on l 2 (Z d )

Agmon Kolmogorov Inequalities on l 2 (Z d ) Journal of Mathematics Research; Vol. 6, No. ; 04 ISSN 96-9795 E-ISSN 96-9809 Publishe by Canaian Center of Science an Eucation Agmon Kolmogorov Inequalities on l (Z ) Arman Sahovic Mathematics Department,

More information

19 Eigenvalues, Eigenvectors, Ordinary Differential Equations, and Control

19 Eigenvalues, Eigenvectors, Ordinary Differential Equations, and Control 19 Eigenvalues, Eigenvectors, Orinary Differential Equations, an Control This section introuces eigenvalues an eigenvectors of a matrix, an iscusses the role of the eigenvalues in etermining the behavior

More information

Leaving Randomness to Nature: d-dimensional Product Codes through the lens of Generalized-LDPC codes

Leaving Randomness to Nature: d-dimensional Product Codes through the lens of Generalized-LDPC codes Leaving Ranomness to Nature: -Dimensional Prouct Coes through the lens of Generalize-LDPC coes Tavor Baharav, Kannan Ramchanran Dept. of Electrical Engineering an Computer Sciences, U.C. Berkeley {tavorb,

More information

Slide10 Haykin Chapter 14: Neurodynamics (3rd Ed. Chapter 13)

Slide10 Haykin Chapter 14: Neurodynamics (3rd Ed. Chapter 13) Slie10 Haykin Chapter 14: Neuroynamics (3r E. Chapter 13) CPSC 636-600 Instructor: Yoonsuck Choe Spring 2012 Neural Networks with Temporal Behavior Inclusion of feeback gives temporal characteristics to

More information

Switching Time Optimization in Discretized Hybrid Dynamical Systems

Switching Time Optimization in Discretized Hybrid Dynamical Systems Switching Time Optimization in Discretize Hybri Dynamical Systems Kathrin Flaßkamp, To Murphey, an Sina Ober-Blöbaum Abstract Switching time optimization (STO) arises in systems that have a finite set

More information

Perfect Matchings in Õ(n1.5 ) Time in Regular Bipartite Graphs

Perfect Matchings in Õ(n1.5 ) Time in Regular Bipartite Graphs Perfect Matchings in Õ(n1.5 ) Time in Regular Bipartite Graphs Ashish Goel Michael Kapralov Sanjeev Khanna Abstract We consier the well-stuie problem of fining a perfect matching in -regular bipartite

More information

KNN Particle Filters for Dynamic Hybrid Bayesian Networks

KNN Particle Filters for Dynamic Hybrid Bayesian Networks KNN Particle Filters for Dynamic Hybri Bayesian Networs H. D. Chen an K. C. Chang Dept. of Systems Engineering an Operations Research George Mason University MS 4A6, 4400 University Dr. Fairfax, VA 22030

More information

Function Spaces. 1 Hilbert Spaces

Function Spaces. 1 Hilbert Spaces Function Spaces A function space is a set of functions F that has some structure. Often a nonparametric regression function or classifier is chosen to lie in some function space, where the assume structure

More information

Flexible High-Dimensional Classification Machines and Their Asymptotic Properties

Flexible High-Dimensional Classification Machines and Their Asymptotic Properties Journal of Machine Learning Research 16 (2015) 1547-1572 Submitte 1/14; Revise 9/14; Publishe 8/15 Flexible High-Dimensional Classification Machines an Their Asymptotic Properties Xingye Qiao Department

More information

Crowdsourced Judgement Elicitation with Endogenous Proficiency

Crowdsourced Judgement Elicitation with Endogenous Proficiency Crowsource Jugement Elicitation with Enogenous Proficiency Anirban Dasgupta Arpita Ghosh Abstract Crowsourcing is now wiely use to replace jugement or evaluation by an expert authority with an aggregate

More information

Situation awareness of power system based on static voltage security region

Situation awareness of power system based on static voltage security region The 6th International Conference on Renewable Power Generation (RPG) 19 20 October 2017 Situation awareness of power system base on static voltage security region Fei Xiao, Zi-Qing Jiang, Qian Ai, Ran

More information

arxiv: v2 [cs.ds] 11 May 2016

arxiv: v2 [cs.ds] 11 May 2016 Optimizing Star-Convex Functions Jasper C.H. Lee Paul Valiant arxiv:5.04466v2 [cs.ds] May 206 Department of Computer Science Brown University {jasperchlee,paul_valiant}@brown.eu May 3, 206 Abstract We

More information

Separation of Variables

Separation of Variables Physics 342 Lecture 1 Separation of Variables Lecture 1 Physics 342 Quantum Mechanics I Monay, January 25th, 2010 There are three basic mathematical tools we nee, an then we can begin working on the physical

More information
