arXiv:1901.02185v1 [cs.LG] 8 Jan 2019


Data Masking with Privacy Guarantees

Anh T. Pham (Oregon State University), Shalini Ghosh (Samsung Research), Vinod Yegneswaran (SRI International)

Abstract

We study the problem of data release with privacy, where data is made available with privacy guarantees while keeping the usability of the data as high as possible. This is important in health care and other domains with sensitive data. In particular, we propose a method of masking the private data with a privacy guarantee while ensuring that a classifier trained on the masked data is similar to the classifier trained on the original data, to maintain usability. We analyze the theoretical risks of the proposed method and of the traditional input perturbation method. Results show that the proposed method achieves lower risk than input perturbation, especially when the number of training samples gets large. We illustrate the effectiveness of the proposed method of data masking for privacy-sensitive learning on benchmark datasets.

Introduction

In domains like healthcare or finance, data can be sensitive and private. There are several scenarios where a dataset needs to be shared while protecting sensitive parts of the data. For example, consider a medical study where a group of patients with a particular medical condition is being studied. The identifying data of some patients (e.g., those with a rare disease) may need to be masked while sharing their records with a wider group of medical researchers. However, when the patient records are processed by clinical decision support tools, we want the machine learning (ML) models in the tools to have similar performance on the masked data as they would on the original data.

Several approaches have been proposed to preserve the privacy of data, e.g., by anonymization (Samarati and Sweeney 1998) or by generalization (Mohammed et al. 2011). Methods for differential privacy include adding Laplace noise (Sarwate and Chaudhuri 2013), modifying the objective (Chaudhuri and Monteleoni 2009), and posterior sampling (Dimitrakakis et al. 2014; Wang, Fienberg, and Smola 2015). Privacy-preserving data publishing transforms sensitive data to protect it against privacy attacks while supporting effective data mining tasks (Fung et al. 2010). Differentially private data release (Mohammed et al. 2011) presents an anonymization algorithm that satisfies the $\epsilon$-differential privacy model, while other methods of data release (Chen et al. 2011; Xiao, Xiong, and Yuan 2010) group the data and add noise to the partition counts. However, these techniques do not explicitly try to maintain the accuracy of a model.

Our approach masks training samples with less sensitive ones under a privacy guarantee, while ensuring that the classifier trained on the masked data reaches accuracy similar to that of the classifier trained on the original data. Moreover, compared to publishing a masked classifier, publishing masked data enables the user to train other types of classifiers. There are also query-based data masking methods for a classifier. These are sparse vector techniques that generate masked data using the query of whether the gradient on the masked data is zero (Dwork, Roth, and others 2014; Lyu, Su, and Li 2017; Lee and Clifton 2014; Blum, Ligett, and Roth 2008). However, when the gradient computation is complicated, designing a method that achieves a zero gradient can be tricky.

We have three main contributions in this paper. First, we propose a novel algorithm of data masking for privacy-sensitive learning. Second, we provide a theoretical guarantee explaining why the proposed method is more suitable for a large number of training samples than a traditional input perturbation method. Finally, we illustrate the efficacy of our method, taking logistic regression as an example classifier, on both synthetic and benchmark datasets.
Problem setting

Goal: Assume we train a model parameterized by $w \in \mathbb{R}^d$ on a dataset $D_{\text{train}} = \{x_i, y_i\}_{i=1}^N$, where $x_i \in \mathcal{X} = \mathbb{R}^d$, $y_i \in \mathcal{Y} = \{-1, 1\}$, and $d$ is the number of features. The goal of our data publishing algorithm $A$ is to generate a masked training dataset $D_{\text{masked}} = \{x'_i, y'_i\}_{i=1}^N$, where $x'_i \in \mathcal{X}$, such that: (a) $D_{\text{masked}}$ is as different as possible from $D_{\text{train}}$, but (b) the model trained on $D_{\text{masked}}$ gives parameters $w'$ that are close to the original parameters $w$ of the model trained on $D_{\text{train}}$. This paper outlines an approach for achieving this goal. Before that, we review several concepts of data publishing with privacy and the core formulation of logistic regression.

Data publishing with differential privacy (DPDP)

We begin with the concept of data publishing with differential privacy (DPDP). We consider two datasets of $N$ training samples, $D_{\text{train}} = \{x_i, y_i\}_{i=1}^N$ and $D'_{\text{train}} = \{x'_i, y'_i\}_{i=1}^N$, which differ in only one sample: without loss of generality, assume $x_i = x'_i$ and $y_i = y'_i$ for $i \in \{1, 2, \ldots, N-1\}$, and $x_N \neq x'_N$ and (or) $y_N \neq y'_N$. A data publishing algorithm $A$ is said to be $\epsilon$-private (Dwork 2008) if

$$\frac{p(A(D_{\text{train}}) = O)}{p(A(D'_{\text{train}}) = O)} < e^{\epsilon},$$

where $O = \{x''_i, y''_i\}_{i=1}^N$ is a particular output of the data publishing algorithm $A$. Intuitively, differential privacy guarantees that for small $\epsilon$, the output of $A$ is not sensitive to the existence of a single sample in the dataset. In this setting, the attacker has less chance to infer details about a particular training sample in the data. In this work we focus on differential privacy for masked data generation where the machine learning algorithm we consider is logistic regression (Walker and Duncan 1967).

Core formulation of logistic regression

Logistic regression: We are given a training dataset $D_{\text{train}}$. The goal of training a logistic regression classifier is to find a mapping between a sample in $\mathbb{R}^d$ and a label in $\mathcal{Y}$. Specifically, we model the relation between a sample $x_i$ and its label $y_i$ as

$$p(y_i \mid x_i, w) = \frac{e^{y_i w^T x_i}}{1 + e^{y_i w^T x_i}}.$$

Assuming samples are i.i.d., the log-likelihood for the training samples is

$$L_\lambda(w) = \frac{1}{N}\sum_{i=1}^{N}\left[y_i w^T x_i - \log\left(1 + e^{y_i w^T x_i}\right)\right] + \frac{\lambda}{2}\|w\|^2, \quad (1)$$

where $\lambda$ is the regularization parameter and $\|\cdot\|$ denotes the $\ell_2$-norm.

Training logistic regression: In logistic regression, training is done by finding the parameter $w$ that maximizes the log-likelihood in (1), i.e., the gradient of $L_\lambda$ at $w$ is 0:

$$\frac{\partial L_\lambda(w)}{\partial w} = \frac{1}{N}\sum_{i=1}^{N}\left[y_i - p(y_i = 1 \mid x_i, w)\right]x_i + \lambda w = 0. \quad (2)$$

For various logistic regression optimization techniques that drive the above gradient to 0, please refer to (Minka 2003).

DPDP by Masked Data Generation

In this section, we describe how to generate masked samples for logistic regression.
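As a concrete reading of the stationarity condition in (2), the following is a minimal sketch of training logistic regression until the gradient vanishes. It assumes labels recoded as 0/1, so that $y_i - p(y_i = 1 \mid x_i, w)$ is the usual residual, and it subtracts the regularizer so that the objective is concave. Both choices are assumptions of this sketch, and the names `sigmoid`, `gradient` and `train`, the toy data, and the step size are illustrative only.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def gradient(w, data, lam):
    """Gradient of (1/N) * sum_i log p(y_i | x_i, w) - (lam/2) * ||w||^2, 0/1 labels."""
    n = len(data)
    g = [-lam * wj for wj in w]
    for x, y in data:
        r = y - sigmoid(sum(wj * xj for wj, xj in zip(w, x)))  # y - p(y=1 | x, w)
        for j, xj in enumerate(x):
            g[j] += r * xj / n
    return g

def train(data, lam, lr=0.5, iters=5000):
    """Plain gradient ascent with a fixed step size and iteration count."""
    w = [0.0] * len(data[0][0])
    for _ in range(iters):
        g = gradient(w, data, lam)
        w = [wj + lr * gj for wj, gj in zip(w, g)]
    return w

# Tiny hypothetical dataset with ||x|| <= 1.
data = [([0.6, 0.2], 1), ([0.5, -0.1], 1), ([-0.4, 0.3], 0),
        ([-0.7, -0.2], 0), ([0.1, 0.5], 0)]
w = train(data, lam=0.1)
grad_norm = math.sqrt(sum(gj * gj for gj in gradient(w, data, 0.1)))
print(w, grad_norm)  # the gradient norm should be numerically zero at the optimum
```

The masking procedure described next relies on exactly this property: a dataset "explains" a parameter vector when the regularized gradient evaluated on that dataset is zero.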
Adding Laplace noise to the classifier

Unlike previous approaches that add noise to the data and then publish the noisy data, we consider a novel approach: we first train a classifier on the original data, and then add Laplace noise to the classifier. (Footnote: In this work we consider logistic regression as the classifier. The work flow of data publishing for other classifiers, e.g., SVM, is similar to that of the proposed method.)

Figure 1: The work flow of the proposed method and traditional data publishing methods. (a) The proposed masked data generation. (b) The input perturbation method.

The motivation for adding noise is that in differential privacy, the goal is to produce similar output for any two neighboring datasets $D$ and $D'$, so that an attacker cannot infer the existence of any single training sample. Since the classifiers trained on the two datasets $D$ and $D'$ are not equal, adding Laplace noise to the parameters of those classifiers accounts for that difference, and with some probability the two classifiers are equal after adding noise. Subsequently, we generate and publish a masked dataset such that the gradient of the log-likelihood for the noisy classifier is 0. The work flow of the proposed framework is illustrated in Fig. 1(a). In comparison, the work flow of traditional data publishing methods by perturbation is shown in Fig. 1(b).

Generating masked data

We generate masked data $O = \{x'_i, y'_i\}_{i=1}^N$ such that the gradient of the log-likelihood of $O$ for the aforementioned noisy classifier $w'$ is 0. The optimality condition for the masked data is the following:

$$\frac{1}{N}\sum_{i=1}^{N}\left[y'_i - p(y'_i = 1 \mid x'_i, w')\right]x'_i + \lambda w' = 0, \quad (3)$$

where the masked samples $\{x'_i, y'_i\}$ are unknown. To evaluate the optimality of a set $S$ of masked samples w.r.t. $w'$, we use the $\ell_2$-norm of the gradient:

$$\mathcal{N}(S) = \left\|\sum_{i \in S}\left(y'_i - p(y'_i = 1 \mid x'_i, w')\right)x'_i + N\lambda w'\right\|.$$

We start with an initial set of training samples $S$ and then iteratively add new samples to $S$. The criterion for evaluating a new sample is the $\ell_2$-norm of the gradient of $S$ after including the new sample. Algorithm 1 outlines our proposed Masked Data Generation algorithm. The algorithm terminates when the number of samples in $S$ reaches $N$.
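The iterative procedure just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: labels are coded as 0/1, the regularization term enters the running gradient with a minus sign (matching a concave penalized log-likelihood), the label of each new sample is chosen by trying both values, and every search starts from $x' = 0$ with a simple backtracking step. The names `next_sample` and `generate_masked` and all numeric values are hypothetical.

```python
import math

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def updated_gradient(x, y, w, g):
    """Running gradient after adding (x, y): (y - p(y=1 | x, w)) * x + g."""
    r = y - sigmoid(dot(w, x))
    return [r * xi + gi for xi, gi in zip(x, g)]

def next_sample(w, g, y, steps=200):
    """Backtracking gradient descent on f(x) = ||(y - p) x + g||^2, started at x = 0."""
    x = [0.0] * len(w)
    f = dot(g, g)
    for _ in range(steps):
        s = sigmoid(dot(w, x))
        r = y - s
        v = [r * xi + gi for xi, gi in zip(x, g)]
        xv = dot(x, v)
        grad = [2.0 * (r * vi - s * (1.0 - s) * xv * wi) for vi, wi in zip(v, w)]
        step = 1.0
        while step > 1e-12:
            cand = [xi - step * gi for xi, gi in zip(x, grad)]
            vc = updated_gradient(cand, y, w, g)
            fc = dot(vc, vc)
            if fc < f:  # accept only steps that reduce the objective
                x, f = cand, fc
                break
            step *= 0.5
        else:
            break  # no improving step found
    return x, f

def generate_masked(w_noisy, lam, n):
    """Greedily add n samples, each chosen to shrink the gradient norm the most."""
    g = [-n * lam * wi for wi in w_noisy]  # regularization part (sign is an assumption)
    initial_norm = math.sqrt(dot(g, g))
    S = []
    while len(S) < n:
        (x, _), y = min(((next_sample(w_noisy, g, y), y) for y in (0, 1)),
                        key=lambda item: item[0][1])
        S.append((x, y))
        g = updated_gradient(x, y, w_noisy, g)
    return S, initial_norm, math.sqrt(dot(g, g))

S, norm_before, norm_after = generate_masked([0.8, -0.5], lam=0.1, n=20)
print(len(S), norm_before, norm_after)
```

Because each search starts at $x' = 0$, where the objective equals the current squared gradient norm, and only improving steps are accepted, the gradient norm never increases as samples are added.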

Algorithm 1 Masked Data Generation
Input: $N$ training samples $D_{\text{train}} = \{x_i\}_{i=1}^N$, $\epsilon$
Output: $N$ masked training samples $O = \{x'_i\}_{i=1}^N$.
Step 1: Train the logistic regression classifier $w$, as in (2).
Step 2: Add Laplace noise to the classifier: $w' = w + \eta$, where $\eta \sim c\, e^{-\frac{\lambda N \epsilon}{2}\|\eta\|}$ and $c$ is a normalizing constant.
Step 3: $S = \{\}$. Incrementally generate masked samples.
  while cardinality($S$) $< N$ do
    Find a sample $\{x'\}$ that reduces the $\ell_2$-norm of the gradient of $S$ the most, using gradient descent (5)
    Add the new sample: $S = S \cup \{x'\}$
  end while
Return $O = S$

Iteratively generating masked samples

In this section, we present the gradient descent method used to iteratively generate masked samples. In particular, given the current set of masked samples $S$, we need to find the next masked sample $\{x'\}$ such that the $\ell_2$-norm of the gradient of the set $S \cup \{x'\}$ is as close to 0 as possible. For simplicity of notation, denote

$$g = \sum_{x'_i \in S}\left(y'_i - p(y'_i = 1 \mid x'_i, w')\right)x'_i + N\lambda w'$$

as the gradient of the current masked samples. Consequently, we need to find the next masked sample $\{x'\}$ minimizing the following objective:

$$\mathcal{N}(x') = \left\|\left(y' - \frac{e^{w'^T x'}}{1 + e^{w'^T x'}}\right)x' + g\right\|^2. \quad (4)$$

To minimize (4), we use backtracking gradient descent. The gradient is computed as

$$\frac{\partial \mathcal{N}(x')}{\partial x'} = 2\left(\left(y' - \frac{e^{w'^T x'}}{1 + e^{w'^T x'}}\right)x' + g\right)^T\left(\left(y' - \frac{e^{w'^T x'}}{1 + e^{w'^T x'}}\right)I - \frac{x' w'^T e^{w'^T x'}}{\left(1 + e^{w'^T x'}\right)^2}\right), \quad (5)$$

where $I$ is the identity matrix in $\mathbb{R}^{d \times d}$. Note that we can generalize our algorithm to $C$ classes, with $C > 2$, as follows:

$$\frac{\partial \mathcal{N}}{\partial x'} = 2\sum_{c=1}^{C}\left(\left[I(y' = c) - p(y' = c \mid x', w')\right]x' + g_c\right)^T\left(\left[I(y' = c) - p(y' = c \mid x', w')\right]I - x'\sum_{l=1}^{C}p(y' = c \mid x', w')\,p(y' = l \mid x', w')\,(w'_c - w'_l)^T\right).$$

Computational complexity: The computational complexity of the proposed algorithm is linear in the number of added samples.

Intuition: Most differential privacy algorithms for data publishing modify the data by adding uniform noise, e.g., as in Fig. 1(b), which may move the original data manifold closer to a uniform manifold and may not be optimized for any particular machine learning model.

Comparison to classifier publishing: The proposed approach has an advantage over other traditional approaches. In particular, assuming a non-empty initial set of training samples $S$ in Step 3 of Algorithm 1, the proposed method adds fake samples with a completely different manifold to the dataset. For example, assume we want to preserve the privacy of a dataset consisting of non-diabetes patients and sensitive patients with one type of diabetes. We can initially add nonsensitive data samples from the other type of diabetes, thereby preserving the privacy of the sensitive patients. Moreover, by iteratively adding masked samples, the classifier trained on the new masked data will be quite close to the classifier trained on the original data. Compared to publishing the noisy classifier as in (Chaudhuri and Monteleoni 2009), the proposed data masking method allows users to benefit from real data, i.e., in this case the non-diabetes data and the nonsensitive diabetes data, and to train other types of classifiers on them.

Privacy guarantee of Masked Data Generation

There are two aspects of a data publishing algorithm. First, we need to guarantee that the algorithm is $\epsilon$-private. In particular, is the algorithm sensitive to the existence of a single sample in two datasets that differ only in that sample? Second, we would like to assess how the utility of the published dataset changes with $\epsilon$. The following proposition answers the first question.

Proposition 1. If $\|x_i\| \le 1$, $\forall i$, then Algorithm 1 is $\epsilon$-private.

Utility of Masked Data Generation with changing $\epsilon$

We next consider the utility of the masked dataset $O$ for different values of $\epsilon$. We take the utility of the published data to be how close the classifier trained on the published data is to the classifier trained on the original data. Suppose that training logistic regression on the original dataset $D_{\text{train}}$ and on the masked dataset $O$ gives parameters $w$ and $w'$, respectively. We are interested in comparing the 0/1 risk (Vapnik and Vapnik 1998) of the classifier trained on masked data ($w'$) to the 0/1 risk of the classifier trained on original data ($w$). Note that logistic regression is classification calibrated (Bartlett, Jordan, and McAuliffe 2006), which means that minimizing the negative log-likelihood leads to minimizing the 0/1 risk. Thus, it is sufficient to compare the log-likelihood $L_\lambda$ of $w'$ to that of $w$.

Proposition 2. With probability $1 - \delta$,

$$L_\lambda(w') - L_\lambda(w) \le \frac{1}{2}\left(\frac{d\log(d/\delta)}{\lambda N \epsilon}\right)^2(\lambda + 1).$$

From Proposition 2, the classifier trained on masked data improves when $N$ is larger.
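To make the dependence on $N$ concrete, the sketch below evaluates a bound of the form $\frac{1}{2}\left(\frac{d\log(d/\delta)}{\lambda N\epsilon}\right)^2(\lambda+1)$ for a few training set sizes. The constant factors should be treated as an assumption of this sketch; only the $1/N^2$ scaling is the point. The function name `masked_bound` and all parameter values are hypothetical.

```python
import math

def masked_bound(d, delta, lam, n, eps):
    """0.5 * (d * log(d / delta) / (lam * n * eps))**2 * (lam + 1)."""
    radius = d * math.log(d / delta) / (lam * n * eps)  # bound on ||w' - w||
    return 0.5 * radius ** 2 * (lam + 1.0)

# Hypothetical settings: 10 features, failure probability 0.05, lambda = 1, epsilon = 1.
for n in (100, 200, 400):
    print(n, masked_bound(d=10, delta=0.05, lam=1.0, n=n, eps=1.0))

# Doubling N divides the bound by four.
ratio = masked_bound(10, 0.05, 1.0, 100, 1.0) / masked_bound(10, 0.05, 1.0, 200, 1.0)
```

The quadratic decay comes from the noise scale in Step 2 of Algorithm 1, which shrinks as $1/(\lambda N \epsilon)$.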

DPDP by Input Perturbation

In this section, we consider a classical and natural algorithm to publish data (Sarwate and Chaudhuri 2013; Mivule 2012). The algorithm is quite simple: it directly adds noise $\eta \sim e^{-\frac{\epsilon}{2}\|\eta\|}$ to each input sample. The detailed algorithm is shown in Algorithm 2. As for Algorithm 1, in the rest of this section we consider the privacy and the utility of the input perturbation algorithm as $\epsilon$ changes.

Privacy guarantee of Input Perturbation

We first show that Algorithm 2 is $\epsilon$-private.

Proposition 3. If $\|x_i\| \le 1$, $\forall i$, then Algorithm 2 is $\epsilon$-private.

Algorithm 2 Input Perturbation
Input: $N$ training samples $D_{\text{train}} = \{x_k\}_{k=1}^N$, $\epsilon$
Output: $N$ masked training samples $O = \{x'_k\}_{k=1}^N$
  while $k < N$ do
    $\eta \sim e^{-\frac{\epsilon}{2}\|\eta\|}$
    $x'_k = x_k + \eta$
    $k = k + 1$
  end while
Return $O = \{x'_k\}_{k=1}^N$

Utility of Input Perturbation with changing $\epsilon$

As in the previous section, we consider the log-likelihood of the classifier $w'$ trained on perturbed data. We are going to bound the log-likelihood w.r.t. the original data, $L_\lambda(w') - L_\lambda(w)$. We begin with the following lemma.

Lemma 4 (Chaudhuri and Monteleoni 2009). Let $G(w)$ be a convex function and $g(w)$ be a function with $\|\nabla g(w)\| \le g_1$ and $\min_v \min_w v^T \nabla^2 (G + g)(w) v \ge G_2$. Let $w_1 = \arg\min G(w)$ and $w_2 = \arg\min G(w) + g(w)$. Then $\|w_1 - w_2\| \le \frac{g_1}{G_2}$.

Proposition 5. With probability $1 - \delta$,

$$L_\lambda(w') - L_\lambda(w) \le \frac{1}{2}\left(\frac{d\log(d/\delta)}{\lambda\epsilon}\right)^2(\lambda + 1).$$

From Proposition 5, the classifier trained on perturbed data does not improve when $N$ is larger, in contrast to what we see in Proposition 2.

Experiments

We compare the performance of our Masked Data Generation method in Algorithm 1 to the Input Perturbation method in Algorithm 2, on both synthetic and real datasets.

Results on toy data

Datasets: In this section, the effectiveness of the proposed method is illustrated on a 2-D toy dataset. We draw training samples from three normal distributions with distinct means and isotropic covariances, one distribution per class, as shown in Fig. 2(a). Assume that samples from the 3rd class are sensitive.
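For reference, the input perturbation baseline of Algorithm 2 used in these experiments can be sketched as follows. The sketch assumes the noise density is proportional to $e^{-\frac{\epsilon}{2}\|\eta\|}$ and samples it by drawing a uniformly random direction and a Gamma-distributed length, a standard way to sample from such a density. The helper names `sample_noise` and `input_perturbation` and the example values are hypothetical.

```python
import math
import random

def sample_noise(d, rate, rng):
    """Draw eta in R^d with density proportional to exp(-rate * ||eta||)."""
    direction = [rng.gauss(0.0, 1.0) for _ in range(d)]
    length = math.sqrt(sum(c * c for c in direction))
    # The norm of eta follows a Gamma distribution with shape d and scale 1/rate.
    radius = rng.gammavariate(d, 1.0 / rate)
    return [radius * c / length for c in direction]

def input_perturbation(samples, eps, rng):
    """Add independent noise to every input sample (labels are left untouched)."""
    d = len(samples[0])
    masked = []
    for x in samples:
        eta = sample_noise(d, eps / 2.0, rng)
        masked.append([xi + ni for xi, ni in zip(x, eta)])
    return masked

rng = random.Random(0)
samples = [[0.1, 0.2, 0.3], [-0.4, 0.0, 0.5], [0.2, -0.6, 0.1]]
masked = input_perturbation(samples, eps=2.0, rng=rng)
print(masked)
```

The same sampler with rate $\lambda N \epsilon / 2$ would produce the parameter noise of Step 2 in Algorithm 1. The difference in rate is what makes the parameter noise shrink with $N$ while the input noise does not.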
Setting: We initialize the samples in the masked dataset from a class with a different manifold for the 3rd class. In particular, we first add to the published dataset a fake class 3 with a totally different distribution manifold from the original class 3, e.g., a normal distribution with a different mean, as shown in Fig. 2(b). We then run the masked data generation method with a non-empty training sample set $S$ as in Algorithm 1.

Results: The samples generated by the proposed method are shown in Fig. 2(c). From Fig. 2(c), to accommodate the shift in the distribution manifold of class 3, many other fake samples of class 3 are added at the bottom of Fig. 2(c). From Fig. 2(c), we also observe the usefulness of regularization, since fewer masked samples lie on the boundary. From Fig. 2(c) and Fig. 2(a), the generated samples from class 3 are significantly different from the original true samples from class 3, which implies that the data is private. However, the resulting classifier, i.e., the boundary learned from the three classes, is almost the same for the original data and the published data. As a result, users are still able to access the original real data from classes 1 and 2, and at the same time obtain the classifier for class 3, which is now private.

Results on MNIST digits data

In this section, we consider the effectiveness of the proposed algorithm on the MNIST handwritten digit dataset.

Datasets: We use PCA to reduce the dimensionality of the data to 5. Similar to the toy example, we select samples from three digits, e.g., digits 0 and 3 plus one further digit as in Fig. 3(a), and digits 0 and 4 plus one further digit as in Fig. 4(a). The classifier learned from the three digits of Fig. 3(a) is shown in Fig. 3(d), and the classifier learned from the three digits of Fig. 4(a) is shown in Fig. 4(d). In those figures, e.g., in Fig. 3(d), the visualized classifier represents the three corresponding digits.

Setting: We first explain how to generate the non-empty initially masked training samples $S$ in Algorithm 1. In particular, the first two digits of the initially masked training samples are the same as the first two digits of the original training samples. For example, we still use samples from digit 0 and the second digit for the initially masked training samples, as in Fig. 3(b). However, for the last digit of the initially masked training samples, we use a totally different digit from the last digit of the true training samples. For example, we use digit 6 instead of digit 3 as the last digit, as in Fig. 3(b). The corresponding classifier learned from the initially masked training samples $S$ is visualized in Fig. 4(e). For the visualization of a classifier, e.g., in Fig. 3(d-f), we project the classifier of each class back to the two-dimensional space.

Figure 2: (a) Original training samples, (b) initially masked training samples, (c) final masked samples using Algorithm 1.

Figure 3: (a) True training samples, (b) initially masked samples in $S$, (c) final masked samples using Algorithm 1, and (d-f) the visualization of their corresponding $w$ on the MNIST dataset. The original digits include 0 and 3, and we would like to replace 3 with 6 using some fake samples.

Figure 4: (a) True training samples, (b) initially masked samples in $S$, (c) final masked samples using Algorithm 1, and (d-f) the visualization of their corresponding $w$ on the MNIST dataset. The original digits include 0 and 4, and we would like to replace 4 with 8 using some fake samples.

Result: We then iteratively add masked training samples into $S$ using the masked data generation method in Algorithm 1. The masked samples added to $S$ by Algorithm 1 are shown in Fig. 3(c). Note that several samples among them remove the effect of digit 6, e.g., the 4th sample from the left in the first row of Fig. 3(c). On the other hand, several among them add the effect of digit 3 back to the classifier, e.g., the image at the bottom right of Fig. 3(c). Moreover, because of the added masked samples, the classifier learned from the masked training samples is similar to the original classifier learned from the original training samples. For example, the classifier in Fig. 3(f) is similar to the classifier in Fig. 3(d). A similar visualization example is shown in Fig. 4, where the original training samples include digits 0 and 4 as in Fig. 4(a), the initially masked training samples in $S$ use digit 8 in place of digit 4 as in Fig. 4(b), and after generating masked samples, the classifier of the masked data in Fig. 4(f) is similar to the classifier of the original data in Fig. 4(d).

Results on UCI datasets

Datasets: We demonstrate the effectiveness of the proposed method on several UCI datasets in sensitive domains.

Evaluation Measure: For all datasets, we uniformly select a validation set $V = \{x_i\}_{i=1}^{n_{\text{val}}}$ of samples from two classes. We denote the ground truth labels for these samples as $L_{\text{true}}$. Using $w'$, the classifier trained on masked data, we predict the labels for the validation set, namely $V_{\text{masked}}$. Then, we compute the accuracy of $w'$ as the fraction of cases where $V_{\text{masked}}$ matches $L_{\text{true}}$.

Setting: We fix the regularization parameter $\lambda$. Moreover, to evaluate the effectiveness of the proposed method and the input perturbation method as the number of training samples increases, we consider two values of $N$, one larger than the other. We vary $\epsilon$ over a set of values ranging up to 50, spaced roughly on a log scale.
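The evaluation measure can be sketched in a few lines. The sketch assumes 0/1 labels and a linear decision rule that predicts 1 when $w'^T x > 0$. The names `predict`, `accuracy` and `summarize`, the validation points, and the extra accuracy values are hypothetical placeholders, not results from the paper.

```python
import statistics

def predict(w, x):
    """Predict label 1 if w^T x > 0, else 0 (0/1 coding is an assumption)."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) > 0 else 0

def accuracy(w, validation, labels):
    """Fraction of validation samples whose predicted label matches the true label."""
    hits = sum(1 for x, y in zip(validation, labels) if predict(w, x) == y)
    return hits / len(validation)

def summarize(accuracies):
    """Mean and sample standard deviation over repeated runs."""
    return statistics.mean(accuracies), statistics.stdev(accuracies)

V = [[0.5, 0.1], [-0.3, 0.2], [0.2, -0.4], [-0.6, -0.1]]
L_true = [1, 0, 1, 0]
acc = accuracy([1.0, 0.0], V, L_true)

# Placeholder accuracies standing in for repeated runs at one value of epsilon.
mean_acc, std_acc = summarize([acc, 0.75, 0.5])
print(acc, mean_acc, std_acc)
```

In an actual experiment the list passed to `summarize` would hold one accuracy per generated training dataset for a given $\epsilon$.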
For each value of $\epsilon$, we generate 50 training datasets, run the proposed masked data generation Algorithm 1 and the input perturbation Algorithm 2 on each dataset, and then report the mean and standard deviation of the accuracy of both algorithms. We also evaluate the accuracy of the classifier after adding Laplace noise, i.e., after Step 2 of Algorithm 1, which we refer to as output perturbation.

Analysis: As shown in Fig. 5, first, as $\epsilon$ increases, the accuracy of both methods increases. Additionally, for a particular value of $\epsilon$, the proposed method works better than the input perturbation algorithm. Moreover, as $N$ increases, the proposed method achieves higher accuracy for the same value of $\epsilon$. In contrast, the accuracy of the input perturbation method does not change much as $N$ increases. Furthermore, note that the input perturbation method updates the data independently of the machine learning model. In contrast, the data generated by the proposed method is directly tied to the model, e.g., logistic regression with a particular value of $\lambda$, which may lead to higher accuracy. Moreover, the performance of the classifier trained on masked samples is comparable to that of the classifier trained on the original training samples with Laplace noise added, i.e., after Step 2 of Algorithm 1. The results indicate that the proposed masked data generation Algorithm 1 is able to create masked samples whose corresponding classifier is close to the perturbed classifier.

Conclusions

In this paper, we proposed a data masking technique for privacy-sensitive learning. The main idea is to iteratively find masked data such that the gradient of the likelihood of the classifier with respect to the masked data is zero. Our theoretical analysis showed that the proposed technique achieves higher utility than a traditional input perturbation technique. Experiments on multiple real-world datasets also demonstrated the effectiveness of the proposed method.

Appendices

Proof for Proposition 1.
Assume there are two training datasets $D_1 = \{x_i\}_{i=1}^N$ and $D_2 = \{x'_i\}_{i=1}^N$, which differ in only one sample. Without loss of generality, we assume $x_i = x'_i$ for $i \in \{1, 2, \ldots, N-1\}$, and $x_N \neq x'_N$. Assume the output of Algorithm 1 is $O = \{x''_1, x''_2, \ldots, x''_N\}$. Consider the ratio $\frac{p(O \mid D_1)}{p(O \mid D_2)}$. We assume that in Step 3 we can find the output $O$ such that the gradient of the logistic regression objective w.r.t. $w'$ is exactly 0. For the classifier in Step 2, we consider $w' = a_1$ for the first dataset $D_1$ and $w' = a_2$ for the second dataset $D_2$. Because the regularized logistic regression objective has a unique optimum, and $a_1$ and $a_2$ are both optimal classifiers of the published data $O$, we have $a_1 = a_2 = a$. Then, the ratio is computed as:

$$\frac{p(O \mid D_1)}{p(O \mid D_2)} = \frac{p(O \mid w' = a)\,p(w' = a \mid D_1)}{p(O \mid w' = a)\,p(w' = a \mid D_2)} = \frac{p(w' = a \mid D_1)}{p(w' = a \mid D_2)}.$$

Assume $w = b_1$ and $w = b_2$ are the optimal classifiers for $D_1$ and $D_2$ after Step 1. Because of the Laplace noise in Step 2, $b_1 + \eta_1 = b_2 + \eta_2 = a$. Therefore,

$$\frac{p(w' = a \mid D_1)}{p(w' = a \mid D_2)} = \frac{p(\eta = \eta_1)}{p(\eta = \eta_2)} = e^{-\frac{\lambda N \epsilon}{2}\left(\|\eta_1\| - \|\eta_2\|\right)}.$$

Consequently,

$$\frac{p(O \mid D_1)}{p(O \mid D_2)} \le e^{\frac{\lambda N \epsilon}{2}\|b_1 - b_2\|}.$$

The sensitivity of logistic regression with $N$ samples and regularization parameter $\lambda$ is at most $\frac{2}{\lambda N}$ (Chaudhuri and Monteleoni 2009), so $\frac{p(O \mid D_1)}{p(O \mid D_2)} \le e^{\epsilon}$, which completes the proof.

Proof for Proposition 2.

Since $w'$ is obtained from $w$ by adding Laplace noise, $\|w' - w\|$ is bounded. So, $L_\lambda(w') - L_\lambda(w)$ is bounded using a Taylor series. The rest of the proof follows from the corresponding lemma in (Chaudhuri and Monteleoni 2009).

Proof for Proposition 3.

Assume there are two training datasets $D_1 = \{x_i\}_{i=1}^N$ and $D_2 = \{x'_i\}_{i=1}^N$, which differ in only one sample, e.g., without loss of generality, we assume $x_i = x'_i$ for $i \in \{1, 2, \ldots, N-1\}$, and $x_N \neq x'_N$. Assume the output of Algorithm 2 is $O = \{x''_1, x''_2, \ldots, x''_{N-1}, x''_N\}$. Consider the ratio

$$\frac{p(O \mid D_1)}{p(O \mid D_2)} = \frac{p(x''_1 \mid x_1)\cdots p(x''_{N-1} \mid x_{N-1})\,p(x''_N \mid x_N)}{p(x''_1 \mid x'_1)\cdots p(x''_{N-1} \mid x'_{N-1})\,p(x''_N \mid x'_N)} = \frac{e^{-\frac{\epsilon}{2}\|x''_N - x_N\|}}{e^{-\frac{\epsilon}{2}\|x''_N - x'_N\|}} \le e^{\frac{\epsilon}{2}\|x_N - x'_N\|} \le e^{\epsilon},$$

where the last inequality follows from the fact that $\|x_i\| \le 1$, $\forall i$. Thus, the input perturbation algorithm is $\epsilon$-private.

Figure 5: The accuracy and privacy ($\epsilon$ in log scale) trade-off for benchmark datasets. Panels: Adult income, German credit, Age of Abalone, Waveform, AUS credit, Breast cancer, Blood transfusion, Heart disease, Mammographic, SpliceDNA, Diabetics, Image.

Proof for Proposition 5.

The proof is similar to (Chaudhuri and Monteleoni 2009). For the sake of completeness, following Lemma 4, define $G(w) = H_1(w) + \frac{\lambda}{2}\|w\|^2$ and $g = H_2(w) - H_1(w)$, where

$$H_1(w) = \frac{1}{N}\sum_{i=1}^{N}\left[w_{y_i}^T x_i - \log\left(\sum_{l=1}^{C} e^{w_l^T x_i}\right)\right] \quad \text{and} \quad H_2(w) = \frac{1}{N}\sum_{i=1}^{N}\left[w_{y_i}^T x'_i - \log\left(\sum_{l=1}^{C} e^{w_l^T x'_i}\right)\right].$$

Then,

$$\|\nabla g\| = \left\|\frac{1}{N}\left[\sum_{i=1}^{N}\left(y_i - p(y_i = 1 \mid x'_i, w)\right)x'_i - \sum_{i=1}^{N}\left(y_i - p(y_i = 1 \mid x_i, w)\right)x_i\right]\right\| \le \frac{1}{N}\sum_{i=1}^{N}\|x'_i - x_i\| \le \frac{N d\log(d/\delta)}{N\epsilon} = \frac{d\log(d/\delta)}{\epsilon},$$

where the last inequality comes from the fact that $x'_i - x_i \sim e^{-\frac{\epsilon}{2}\|\eta\|}$ and $x_i, x'_i \in \mathbb{R}^d$. Note that even though $\|x_i\|$ is upper bounded by 1 for all $i$, $\|x'_i\|$ is not upper bounded by 1, since $x'_i = x_i + \eta$ where $\eta \sim e^{-\frac{\epsilon}{2}\|\eta\|}$. Hence, $\|\nabla g\|$ cannot be trivially upper bounded by a constant. Moreover, $v^T \nabla^2 (G + g) v$ is lower bounded by $\lambda$. Thus, $\|w' - w\| \le \frac{d\log(d/\delta)}{\lambda\epsilon}$. By Taylor expansion,

$$L_\lambda(w') = L_\lambda(w) + \nabla L_\lambda(w)^T(w' - w) + \frac{1}{2}(w' - w)^T \nabla^2 L_\lambda(w)(w' - w) \le L_\lambda(w) + \frac{1}{2}\|w' - w\|^2(\lambda + 1),$$

where the first-order term vanishes because $\nabla L_\lambda(w) = 0$ at the optimum $w$. This completes the proof.

Acknowledgments

This material is based upon work supported by the National Science Foundation under Grant CNS

References

[Bartlett, Jordan, and McAuliffe 2006] Bartlett, P. L.; Jordan, M. I.; and McAuliffe, J. D. 2006. Convexity, classification, and risk bounds. Journal of the American Statistical Association 101(473).
[Blum, Ligett, and Roth 2008] Blum, A.; Ligett, K.; and Roth, A. 2008. A learning theory approach to noninteractive database privacy. In Proceedings of the fortieth annual ACM symposium on Theory of computing.
[Chaudhuri and Monteleoni 2009] Chaudhuri, K., and Monteleoni, C. 2009. Privacy-preserving logistic regression. In Advances in neural information processing systems.
[Chen et al. 2011] Chen, R.; Mohammed, N.; Fung, B. C.; Desai, B. C.; and Xiong, L. 2011. Publishing set-valued data via differential privacy. In Proceedings of the International Conference on Very Large Data Bases.
[Dimitrakakis et al. 2014] Dimitrakakis, C.; Nelson, B.; Mitrokotsa, A.; and Rubinstein, B. 2014. Robust and private Bayesian inference. In Proceedings of the International Conference on Algorithmic Learning Theory.
[Dwork, Roth, and others 2014] Dwork, C.; Roth, A.; et al. 2014. The algorithmic foundations of differential privacy. Foundations and Trends in Theoretical Computer Science 9(3-4):211-407.
[Dwork 2008] Dwork, C. 2008. Differential privacy: A survey of results. In International Conference on Theory and Applications of Models of Computation, 1-19. Springer.
[Fung et al. 2010] Fung, B.; Wang, K.; Chen, R.; and Yu, P. S. 2010. Privacy-preserving data publishing: A survey of recent developments. ACM Computing Surveys (CSUR) 42(4):14.
[Lee and Clifton 2014] Lee, J., and Clifton, C. W. 2014. Top-k frequent itemsets via differentially private FP-trees. In Proceedings of the International Conference on Knowledge Discovery and Data Mining.
[Lyu, Su, and Li 2017] Lyu, M.; Su, D.; and Li, N. 2017. Understanding the sparse vector technique for differential privacy. In Proceedings of the International Conference on Very Large Data Bases.
[Minka 2003] Minka, T. P. 2003. A comparison of numerical optimizers for logistic regression.
[Mivule 2012] Mivule, K. 2012. Utilizing noise addition for data privacy, an overview. In Proceedings of the International Conference on Information and Knowledge Engineering (IKE).
[Mohammed et al. 2011] Mohammed, N.; Chen, R.; Fung, B.; and Yu, P. S. 2011. Differentially private data release for data mining. In Proceedings of the International Conference on Knowledge Discovery and Data Mining.
[Samarati and Sweeney 1998] Samarati, P., and Sweeney, L. 1998. Generalizing data to provide anonymity when disclosing information. In PODS, 188.
[Sarwate and Chaudhuri 2013] Sarwate, A. D., and Chaudhuri, K. 2013. Signal processing and machine learning with differential privacy: Algorithms and challenges for continuous data. IEEE Signal Processing Magazine 30(5).
[Vapnik and Vapnik 1998] Vapnik, V. N., and Vapnik, V. 1998. Statistical learning theory, volume 1. Wiley New York.
[Walker and Duncan 1967] Walker, S. H., and Duncan, D. B. 1967. Estimation of the probability of an event as a function of several independent variables. Biometrika 54.
[Wang, Fienberg, and Smola 2015] Wang, Y.-X.; Fienberg, S.; and Smola, A. 2015. Privacy for free: Posterior sampling and stochastic gradient Monte Carlo. In Proceedings of the International Conference on Machine Learning.
[Xiao, Xiong, and Yuan 2010] Xiao, Y.; Xiong, L.; and Yuan, C. 2010. Differentially private data release through multidimensional partitioning. Secure Data Management 6358:150-168.


More information

Computational and Statistical Learning Theory

Computational and Statistical Learning Theory Coputational and Statistical Learning Theory Proble sets 5 and 6 Due: Noveber th Please send your solutions to learning-subissions@ttic.edu Notations/Definitions Recall the definition of saple based Radeacher

More information

Model Fitting. CURM Background Material, Fall 2014 Dr. Doreen De Leon

Model Fitting. CURM Background Material, Fall 2014 Dr. Doreen De Leon Model Fitting CURM Background Material, Fall 014 Dr. Doreen De Leon 1 Introduction Given a set of data points, we often want to fit a selected odel or type to the data (e.g., we suspect an exponential

More information

Support Vector Machines. Maximizing the Margin

Support Vector Machines. Maximizing the Margin Support Vector Machines Support vector achines (SVMs) learn a hypothesis: h(x) = b + Σ i= y i α i k(x, x i ) (x, y ),..., (x, y ) are the training exs., y i {, } b is the bias weight. α,..., α are the

More information

Stochastic Subgradient Methods

Stochastic Subgradient Methods Stochastic Subgradient Methods Lingjie Weng Yutian Chen Bren School of Inforation and Coputer Science University of California, Irvine {wengl, yutianc}@ics.uci.edu Abstract Stochastic subgradient ethods

More information

Support Vector Machines. Goals for the lecture

Support Vector Machines. Goals for the lecture Support Vector Machines Mark Craven and David Page Coputer Sciences 760 Spring 2018 www.biostat.wisc.edu/~craven/cs760/ Soe of the slides in these lectures have been adapted/borrowed fro aterials developed

More information

Support Vector Machines MIT Course Notes Cynthia Rudin

Support Vector Machines MIT Course Notes Cynthia Rudin Support Vector Machines MIT 5.097 Course Notes Cynthia Rudin Credit: Ng, Hastie, Tibshirani, Friedan Thanks: Şeyda Ertekin Let s start with soe intuition about argins. The argin of an exaple x i = distance

More information

Ensemble Based on Data Envelopment Analysis

Ensemble Based on Data Envelopment Analysis Enseble Based on Data Envelopent Analysis So Young Sohn & Hong Choi Departent of Coputer Science & Industrial Systes Engineering, Yonsei University, Seoul, Korea Tel) 82-2-223-404, Fax) 82-2- 364-7807

More information

Tracking using CONDENSATION: Conditional Density Propagation

Tracking using CONDENSATION: Conditional Density Propagation Tracking using CONDENSATION: Conditional Density Propagation Goal Model-based visual tracking in dense clutter at near video frae rates M. Isard and A. Blake, CONDENSATION Conditional density propagation

More information

PAC-Bayes Analysis Of Maximum Entropy Learning

PAC-Bayes Analysis Of Maximum Entropy Learning PAC-Bayes Analysis Of Maxiu Entropy Learning John Shawe-Taylor and David R. Hardoon Centre for Coputational Statistics and Machine Learning Departent of Coputer Science University College London, UK, WC1E

More information

Soft Computing Techniques Help Assign Weights to Different Factors in Vulnerability Analysis

Soft Computing Techniques Help Assign Weights to Different Factors in Vulnerability Analysis Soft Coputing Techniques Help Assign Weights to Different Factors in Vulnerability Analysis Beverly Rivera 1,2, Irbis Gallegos 1, and Vladik Kreinovich 2 1 Regional Cyber and Energy Security Center RCES

More information

Combining Classifiers

Combining Classifiers Cobining Classifiers Generic ethods of generating and cobining ultiple classifiers Bagging Boosting References: Duda, Hart & Stork, pg 475-480. Hastie, Tibsharini, Friedan, pg 246-256 and Chapter 10. http://www.boosting.org/

More information

1 Bounding the Margin

1 Bounding the Margin COS 511: Theoretical Machine Learning Lecturer: Rob Schapire Lecture #12 Scribe: Jian Min Si March 14, 2013 1 Bounding the Margin We are continuing the proof of a bound on the generalization error of AdaBoost

More information

Non-Parametric Non-Line-of-Sight Identification 1

Non-Parametric Non-Line-of-Sight Identification 1 Non-Paraetric Non-Line-of-Sight Identification Sinan Gezici, Hisashi Kobayashi and H. Vincent Poor Departent of Electrical Engineering School of Engineering and Applied Science Princeton University, Princeton,

More information

A note on the multiplication of sparse matrices

A note on the multiplication of sparse matrices Cent. Eur. J. Cop. Sci. 41) 2014 1-11 DOI: 10.2478/s13537-014-0201-x Central European Journal of Coputer Science A note on the ultiplication of sparse atrices Research Article Keivan Borna 12, Sohrab Aboozarkhani

More information

Experimental Design For Model Discrimination And Precise Parameter Estimation In WDS Analysis

Experimental Design For Model Discrimination And Precise Parameter Estimation In WDS Analysis City University of New York (CUNY) CUNY Acadeic Works International Conference on Hydroinforatics 8-1-2014 Experiental Design For Model Discriination And Precise Paraeter Estiation In WDS Analysis Giovanna

More information

A Simple Regression Problem

A Simple Regression Problem A Siple Regression Proble R. M. Castro March 23, 2 In this brief note a siple regression proble will be introduced, illustrating clearly the bias-variance tradeoff. Let Y i f(x i ) + W i, i,..., n, where

More information

Recovering Data from Underdetermined Quadratic Measurements (CS 229a Project: Final Writeup)

Recovering Data from Underdetermined Quadratic Measurements (CS 229a Project: Final Writeup) Recovering Data fro Underdeterined Quadratic Measureents (CS 229a Project: Final Writeup) Mahdi Soltanolkotabi Deceber 16, 2011 1 Introduction Data that arises fro engineering applications often contains

More information

Soft-margin SVM can address linearly separable problems with outliers

Soft-margin SVM can address linearly separable problems with outliers Non-linear Support Vector Machines Non-linearly separable probles Hard-argin SVM can address linearly separable probles Soft-argin SVM can address linearly separable probles with outliers Non-linearly

More information

Feature Extraction Techniques

Feature Extraction Techniques Feature Extraction Techniques Unsupervised Learning II Feature Extraction Unsupervised ethods can also be used to find features which can be useful for categorization. There are unsupervised ethods that

More information

Inspection; structural health monitoring; reliability; Bayesian analysis; updating; decision analysis; value of information

Inspection; structural health monitoring; reliability; Bayesian analysis; updating; decision analysis; value of information Cite as: Straub D. (2014). Value of inforation analysis with structural reliability ethods. Structural Safety, 49: 75-86. Value of Inforation Analysis with Structural Reliability Methods Daniel Straub

More information

Multi-view Discriminative Manifold Embedding for Pattern Classification

Multi-view Discriminative Manifold Embedding for Pattern Classification Multi-view Discriinative Manifold Ebedding for Pattern Classification X. Wang Departen of Inforation Zhenghzou 450053, China Y. Guo Departent of Digestive Zhengzhou 450053, China Z. Wang Henan University

More information

Consistent Multiclass Algorithms for Complex Performance Measures. Supplementary Material

Consistent Multiclass Algorithms for Complex Performance Measures. Supplementary Material Consistent Multiclass Algoriths for Coplex Perforance Measures Suppleentary Material Notations. Let λ be the base easure over n given by the unifor rando variable (say U over n. Hence, for all easurable

More information

Supplementary to Learning Discriminative Bayesian Networks from High-dimensional Continuous Neuroimaging Data

Supplementary to Learning Discriminative Bayesian Networks from High-dimensional Continuous Neuroimaging Data Suppleentary to Learning Discriinative Bayesian Networks fro High-diensional Continuous Neuroiaging Data Luping Zhou, Lei Wang, Lingqiao Liu, Philip Ogunbona, and Dinggang Shen Proposition. Given a sparse

More information

Fixed-to-Variable Length Distribution Matching

Fixed-to-Variable Length Distribution Matching Fixed-to-Variable Length Distribution Matching Rana Ali Ajad and Georg Böcherer Institute for Counications Engineering Technische Universität München, Gerany Eail: raa2463@gail.co,georg.boecherer@tu.de

More information

Lecture 21. Interior Point Methods Setup and Algorithm

Lecture 21. Interior Point Methods Setup and Algorithm Lecture 21 Interior Point Methods In 1984, Kararkar introduced a new weakly polynoial tie algorith for solving LPs [Kar84a], [Kar84b]. His algorith was theoretically faster than the ellipsoid ethod and

More information

e-companion ONLY AVAILABLE IN ELECTRONIC FORM

e-companion ONLY AVAILABLE IN ELECTRONIC FORM OPERATIONS RESEARCH doi 10.1287/opre.1070.0427ec pp. ec1 ec5 e-copanion ONLY AVAILABLE IN ELECTRONIC FORM infors 07 INFORMS Electronic Copanion A Learning Approach for Interactive Marketing to a Custoer

More information

RANDOM GRADIENT EXTRAPOLATION FOR DISTRIBUTED AND STOCHASTIC OPTIMIZATION

RANDOM GRADIENT EXTRAPOLATION FOR DISTRIBUTED AND STOCHASTIC OPTIMIZATION RANDOM GRADIENT EXTRAPOLATION FOR DISTRIBUTED AND STOCHASTIC OPTIMIZATION GUANGHUI LAN AND YI ZHOU Abstract. In this paper, we consider a class of finite-su convex optiization probles defined over a distributed

More information

Using EM To Estimate A Probablity Density With A Mixture Of Gaussians

Using EM To Estimate A Probablity Density With A Mixture Of Gaussians Using EM To Estiate A Probablity Density With A Mixture Of Gaussians Aaron A. D Souza adsouza@usc.edu Introduction The proble we are trying to address in this note is siple. Given a set of data points

More information

E. Alpaydın AERFAISS

E. Alpaydın AERFAISS E. Alpaydın AERFAISS 00 Introduction Questions: Is the error rate of y classifier less than %? Is k-nn ore accurate than MLP? Does having PCA before iprove accuracy? Which kernel leads to highest accuracy

More information

Bayesian Learning. Chapter 6: Bayesian Learning. Bayes Theorem. Roles for Bayesian Methods. CS 536: Machine Learning Littman (Wu, TA)

Bayesian Learning. Chapter 6: Bayesian Learning. Bayes Theorem. Roles for Bayesian Methods. CS 536: Machine Learning Littman (Wu, TA) Bayesian Learning Chapter 6: Bayesian Learning CS 536: Machine Learning Littan (Wu, TA) [Read Ch. 6, except 6.3] [Suggested exercises: 6.1, 6.2, 6.6] Bayes Theore MAP, ML hypotheses MAP learners Miniu

More information

Domain-Adversarial Neural Networks

Domain-Adversarial Neural Networks Doain-Adversarial Neural Networks Hana Ajakan, Pascal Gerain 2, Hugo Larochelle 3, François Laviolette 2, Mario Marchand 2,2 Départeent d inforatique et de génie logiciel, Université Laval, Québec, Canada

More information

Detection and Estimation Theory

Detection and Estimation Theory ESE 54 Detection and Estiation Theory Joseph A. O Sullivan Sauel C. Sachs Professor Electronic Systes and Signals Research Laboratory Electrical and Systes Engineering Washington University 11 Urbauer

More information

UNIVERSITY OF TRENTO ON THE USE OF SVM FOR ELECTROMAGNETIC SUBSURFACE SENSING. A. Boni, M. Conci, A. Massa, and S. Piffer.

UNIVERSITY OF TRENTO ON THE USE OF SVM FOR ELECTROMAGNETIC SUBSURFACE SENSING. A. Boni, M. Conci, A. Massa, and S. Piffer. UIVRSITY OF TRTO DIPARTITO DI IGGRIA SCIZA DLL IFORAZIO 3823 Povo Trento (Italy) Via Soarive 4 http://www.disi.unitn.it O TH US OF SV FOR LCTROAGTIC SUBSURFAC SSIG A. Boni. Conci A. assa and S. Piffer

More information

Grafting: Fast, Incremental Feature Selection by Gradient Descent in Function Space

Grafting: Fast, Incremental Feature Selection by Gradient Descent in Function Space Journal of Machine Learning Research 3 (2003) 1333-1356 Subitted 5/02; Published 3/03 Grafting: Fast, Increental Feature Selection by Gradient Descent in Function Space Sion Perkins Space and Reote Sensing

More information

A Note on the Applied Use of MDL Approximations

A Note on the Applied Use of MDL Approximations A Note on the Applied Use of MDL Approxiations Daniel J. Navarro Departent of Psychology Ohio State University Abstract An applied proble is discussed in which two nested psychological odels of retention

More information

Pattern Recognition and Machine Learning. Artificial Neural networks

Pattern Recognition and Machine Learning. Artificial Neural networks Pattern Recognition and Machine Learning Jaes L. Crowley ENSIMAG 3 - MMIS Fall Seester 2016 Lessons 7 14 Dec 2016 Outline Artificial Neural networks Notation...2 1. Introduction...3... 3 The Artificial

More information

HIGH RESOLUTION NEAR-FIELD MULTIPLE TARGET DETECTION AND LOCALIZATION USING SUPPORT VECTOR MACHINES

HIGH RESOLUTION NEAR-FIELD MULTIPLE TARGET DETECTION AND LOCALIZATION USING SUPPORT VECTOR MACHINES ICONIC 2007 St. Louis, O, USA June 27-29, 2007 HIGH RESOLUTION NEAR-FIELD ULTIPLE TARGET DETECTION AND LOCALIZATION USING SUPPORT VECTOR ACHINES A. Randazzo,. A. Abou-Khousa 2,.Pastorino, and R. Zoughi

More information

Distributed Subgradient Methods for Multi-agent Optimization

Distributed Subgradient Methods for Multi-agent Optimization 1 Distributed Subgradient Methods for Multi-agent Optiization Angelia Nedić and Asuan Ozdaglar October 29, 2007 Abstract We study a distributed coputation odel for optiizing a su of convex objective functions

More information

Ch 12: Variations on Backpropagation

Ch 12: Variations on Backpropagation Ch 2: Variations on Backpropagation The basic backpropagation algorith is too slow for ost practical applications. It ay take days or weeks of coputer tie. We deonstrate why the backpropagation algorith

More information

Asynchronous Gossip Algorithms for Stochastic Optimization

Asynchronous Gossip Algorithms for Stochastic Optimization Asynchronous Gossip Algoriths for Stochastic Optiization S. Sundhar Ra ECE Dept. University of Illinois Urbana, IL 680 ssrini@illinois.edu A. Nedić IESE Dept. University of Illinois Urbana, IL 680 angelia@illinois.edu

More information

This model assumes that the probability of a gap has size i is proportional to 1/i. i.e., i log m e. j=1. E[gap size] = i P r(i) = N f t.

This model assumes that the probability of a gap has size i is proportional to 1/i. i.e., i log m e. j=1. E[gap size] = i P r(i) = N f t. CS 493: Algoriths for Massive Data Sets Feb 2, 2002 Local Models, Bloo Filter Scribe: Qin Lv Local Models In global odels, every inverted file entry is copressed with the sae odel. This work wells when

More information

Multiple Instance Learning with Query Bags

Multiple Instance Learning with Query Bags Multiple Instance Learning with Query Bags Boris Babenko UC San Diego bbabenko@cs.ucsd.edu Piotr Dollár California Institute of Technology pdollar@caltech.edu Serge Belongie UC San Diego sjb@cs.ucsd.edu

More information

DERIVING PROPER UNIFORM PRIORS FOR REGRESSION COEFFICIENTS

DERIVING PROPER UNIFORM PRIORS FOR REGRESSION COEFFICIENTS DERIVING PROPER UNIFORM PRIORS FOR REGRESSION COEFFICIENTS N. van Erp and P. van Gelder Structural Hydraulic and Probabilistic Design, TU Delft Delft, The Netherlands Abstract. In probles of odel coparison

More information

Block designs and statistics

Block designs and statistics Bloc designs and statistics Notes for Math 447 May 3, 2011 The ain paraeters of a bloc design are nuber of varieties v, bloc size, nuber of blocs b. A design is built on a set of v eleents. Each eleent

More information

Quantum algorithms (CO 781, Winter 2008) Prof. Andrew Childs, University of Waterloo LECTURE 15: Unstructured search and spatial search

Quantum algorithms (CO 781, Winter 2008) Prof. Andrew Childs, University of Waterloo LECTURE 15: Unstructured search and spatial search Quantu algoriths (CO 781, Winter 2008) Prof Andrew Childs, University of Waterloo LECTURE 15: Unstructured search and spatial search ow we begin to discuss applications of quantu walks to search algoriths

More information

Machine Learning Basics: Estimators, Bias and Variance

Machine Learning Basics: Estimators, Bias and Variance Machine Learning Basics: Estiators, Bias and Variance Sargur N. srihari@cedar.buffalo.edu This is part of lecture slides on Deep Learning: http://www.cedar.buffalo.edu/~srihari/cse676 1 Topics in Basics

More information

Foundations of Machine Learning Boosting. Mehryar Mohri Courant Institute and Google Research

Foundations of Machine Learning Boosting. Mehryar Mohri Courant Institute and Google Research Foundations of Machine Learning Boosting Mehryar Mohri Courant Institute and Google Research ohri@cis.nyu.edu Weak Learning Definition: concept class C is weakly PAC-learnable if there exists a (weak)

More information

COS 424: Interacting with Data. Written Exercises

COS 424: Interacting with Data. Written Exercises COS 424: Interacting with Data Hoework #4 Spring 2007 Regression Due: Wednesday, April 18 Written Exercises See the course website for iportant inforation about collaboration and late policies, as well

More information

Introduction to Machine Learning. Recitation 11

Introduction to Machine Learning. Recitation 11 Introduction to Machine Learning Lecturer: Regev Schweiger Recitation Fall Seester Scribe: Regev Schweiger. Kernel Ridge Regression We now take on the task of kernel-izing ridge regression. Let x,...,

More information

Convex Programming for Scheduling Unrelated Parallel Machines

Convex Programming for Scheduling Unrelated Parallel Machines Convex Prograing for Scheduling Unrelated Parallel Machines Yossi Azar Air Epstein Abstract We consider the classical proble of scheduling parallel unrelated achines. Each job is to be processed by exactly

More information

1 Generalization bounds based on Rademacher complexity

1 Generalization bounds based on Rademacher complexity COS 5: Theoretical Machine Learning Lecturer: Rob Schapire Lecture #0 Scribe: Suqi Liu March 07, 08 Last tie we started proving this very general result about how quickly the epirical average converges

More information

A Theoretical Analysis of a Warm Start Technique

A Theoretical Analysis of a Warm Start Technique A Theoretical Analysis of a War Start Technique Martin A. Zinkevich Yahoo! Labs 701 First Avenue Sunnyvale, CA Abstract Batch gradient descent looks at every data point for every step, which is wasteful

More information

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 1

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 1 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE Two-Diensional Multi-Label Active Learning with An Efficient Online Adaptation Model for Iage Classification Guo-Jun Qi, Xian-Sheng Hua, Meber,

More information

Statistical Logic Cell Delay Analysis Using a Current-based Model

Statistical Logic Cell Delay Analysis Using a Current-based Model Statistical Logic Cell Delay Analysis Using a Current-based Model Hanif Fatei Shahin Nazarian Massoud Pedra Dept. of EE-Systes, University of Southern California, Los Angeles, CA 90089 {fatei, shahin,

More information

Fundamental Limits of Database Alignment

Fundamental Limits of Database Alignment Fundaental Liits of Database Alignent Daniel Cullina Dept of Electrical Engineering Princeton University dcullina@princetonedu Prateek Mittal Dept of Electrical Engineering Princeton University pittal@princetonedu

More information

A Smoothed Boosting Algorithm Using Probabilistic Output Codes

A Smoothed Boosting Algorithm Using Probabilistic Output Codes A Soothed Boosting Algorith Using Probabilistic Output Codes Rong Jin rongjin@cse.su.edu Dept. of Coputer Science and Engineering, Michigan State University, MI 48824, USA Jian Zhang jian.zhang@cs.cu.edu

More information

arxiv: v3 [cs.lg] 7 Jan 2016

arxiv: v3 [cs.lg] 7 Jan 2016 Efficient and Parsionious Agnostic Active Learning Tzu-Kuo Huang Alekh Agarwal Daniel J. Hsu tkhuang@icrosoft.co alekha@icrosoft.co djhsu@cs.colubia.edu John Langford Robert E. Schapire jcl@icrosoft.co

More information

Learnability and Stability in the General Learning Setting

Learnability and Stability in the General Learning Setting Learnability and Stability in the General Learning Setting Shai Shalev-Shwartz TTI-Chicago shai@tti-c.org Ohad Shair The Hebrew University ohadsh@cs.huji.ac.il Nathan Srebro TTI-Chicago nati@uchicago.edu

More information

MSEC MODELING OF DEGRADATION PROCESSES TO OBTAIN AN OPTIMAL SOLUTION FOR MAINTENANCE AND PERFORMANCE

MSEC MODELING OF DEGRADATION PROCESSES TO OBTAIN AN OPTIMAL SOLUTION FOR MAINTENANCE AND PERFORMANCE Proceeding of the ASME 9 International Manufacturing Science and Engineering Conference MSEC9 October 4-7, 9, West Lafayette, Indiana, USA MSEC9-8466 MODELING OF DEGRADATION PROCESSES TO OBTAIN AN OPTIMAL

More information

Weighted- 1 minimization with multiple weighting sets

Weighted- 1 minimization with multiple weighting sets Weighted- 1 iniization with ultiple weighting sets Hassan Mansour a,b and Özgür Yılaza a Matheatics Departent, University of British Colubia, Vancouver - BC, Canada; b Coputer Science Departent, University

More information

Pattern Recognition and Machine Learning. Artificial Neural networks

Pattern Recognition and Machine Learning. Artificial Neural networks Pattern Recognition and Machine Learning Jaes L. Crowley ENSIMAG 3 - MMIS Fall Seester 2016/2017 Lessons 9 11 Jan 2017 Outline Artificial Neural networks Notation...2 Convolutional Neural Networks...3

More information

Efficient Learning with Partially Observed Attributes

Efficient Learning with Partially Observed Attributes Nicolò Cesa-Bianchi DSI, Università degli Studi di Milano, Italy Shai Shalev-Shwartz The Hebrew University, Jerusale, Israel Ohad Shair The Hebrew University, Jerusale, Israel Abstract We describe and

More information

arxiv: v1 [cs.ds] 3 Feb 2014

arxiv: v1 [cs.ds] 3 Feb 2014 arxiv:40.043v [cs.ds] 3 Feb 04 A Bound on the Expected Optiality of Rando Feasible Solutions to Cobinatorial Optiization Probles Evan A. Sultani The Johns Hopins University APL evan@sultani.co http://www.sultani.co/

More information

A Simplified Analytical Approach for Efficiency Evaluation of the Weaving Machines with Automatic Filling Repair

A Simplified Analytical Approach for Efficiency Evaluation of the Weaving Machines with Automatic Filling Repair Proceedings of the 6th SEAS International Conference on Siulation, Modelling and Optiization, Lisbon, Portugal, Septeber -4, 006 0 A Siplified Analytical Approach for Efficiency Evaluation of the eaving

More information

On Constant Power Water-filling

On Constant Power Water-filling On Constant Power Water-filling Wei Yu and John M. Cioffi Electrical Engineering Departent Stanford University, Stanford, CA94305, U.S.A. eails: {weiyu,cioffi}@stanford.edu Abstract This paper derives

More information

Synthetic Generation of Local Minima and Saddle Points for Neural Networks

Synthetic Generation of Local Minima and Saddle Points for Neural Networks uigi Malagò 1 Diitri Marinelli 1 Abstract In this work-in-progress paper, we study the landscape of the epirical loss function of a feed-forward neural network, fro the perspective of the existence and

More information

Intelligent Systems: Reasoning and Recognition. Artificial Neural Networks

Intelligent Systems: Reasoning and Recognition. Artificial Neural Networks Intelligent Systes: Reasoning and Recognition Jaes L. Crowley MOSIG M1 Winter Seester 2018 Lesson 7 1 March 2018 Outline Artificial Neural Networks Notation...2 Introduction...3 Key Equations... 3 Artificial

More information

Pattern Classification using Simplified Neural Networks with Pruning Algorithm

Pattern Classification using Simplified Neural Networks with Pruning Algorithm Pattern Classification using Siplified Neural Networks with Pruning Algorith S. M. Karuzzaan 1 Ahed Ryadh Hasan 2 Abstract: In recent years, any neural network odels have been proposed for pattern classification,

More information

SPECTRUM sensing is a core concept of cognitive radio

SPECTRUM sensing is a core concept of cognitive radio World Acadey of Science, Engineering and Technology International Journal of Electronics and Counication Engineering Vol:6, o:2, 202 Efficient Detection Using Sequential Probability Ratio Test in Mobile

More information

Pattern Recognition and Machine Learning. Artificial Neural networks

Pattern Recognition and Machine Learning. Artificial Neural networks Pattern Recognition and Machine Learning Jaes L. Crowley ENSIMAG 3 - MMIS Fall Seester 2017 Lessons 7 20 Dec 2017 Outline Artificial Neural networks Notation...2 Introduction...3 Key Equations... 3 Artificial

More information

W-BASED VS LATENT VARIABLES SPATIAL AUTOREGRESSIVE MODELS: EVIDENCE FROM MONTE CARLO SIMULATIONS

W-BASED VS LATENT VARIABLES SPATIAL AUTOREGRESSIVE MODELS: EVIDENCE FROM MONTE CARLO SIMULATIONS W-BASED VS LATENT VARIABLES SPATIAL AUTOREGRESSIVE MODELS: EVIDENCE FROM MONTE CARLO SIMULATIONS. Introduction When it coes to applying econoetric odels to analyze georeferenced data, researchers are well

More information

Computational and Statistical Learning Theory

Computational and Statistical Learning Theory Coputational and Statistical Learning Theory TTIC 31120 Prof. Nati Srebro Lecture 2: PAC Learning and VC Theory I Fro Adversarial Online to Statistical Three reasons to ove fro worst-case deterinistic

More information

Fairness via priority scheduling

Fairness via priority scheduling Fairness via priority scheduling Veeraruna Kavitha, N Heachandra and Debayan Das IEOR, IIT Bobay, Mubai, 400076, India vavitha,nh,debayan}@iitbacin Abstract In the context of ulti-agent resource allocation

More information

A Low-Complexity Congestion Control and Scheduling Algorithm for Multihop Wireless Networks with Order-Optimal Per-Flow Delay

A Low-Complexity Congestion Control and Scheduling Algorithm for Multihop Wireless Networks with Order-Optimal Per-Flow Delay A Low-Coplexity Congestion Control and Scheduling Algorith for Multihop Wireless Networks with Order-Optial Per-Flow Delay Po-Kai Huang, Xiaojun Lin, and Chih-Chun Wang School of Electrical and Coputer

More information

Analyzing Simulation Results

Analyzing Simulation Results Analyzing Siulation Results Dr. John Mellor-Cruey Departent of Coputer Science Rice University johnc@cs.rice.edu COMP 528 Lecture 20 31 March 2005 Topics for Today Model verification Model validation Transient

More information

Constrained Consensus and Optimization in Multi-Agent Networks arxiv: v2 [math.oc] 17 Dec 2008

Constrained Consensus and Optimization in Multi-Agent Networks arxiv: v2 [math.oc] 17 Dec 2008 LIDS Report 2779 1 Constrained Consensus and Optiization in Multi-Agent Networks arxiv:0802.3922v2 [ath.oc] 17 Dec 2008 Angelia Nedić, Asuan Ozdaglar, and Pablo A. Parrilo February 15, 2013 Abstract We

More information

A remark on a success rate model for DPA and CPA

A remark on a success rate model for DPA and CPA A reark on a success rate odel for DPA and CPA A. Wieers, BSI Version 0.5 andreas.wieers@bsi.bund.de Septeber 5, 2018 Abstract The success rate is the ost coon evaluation etric for easuring the perforance

More information

PULSE-TRAIN BASED TIME-DELAY ESTIMATION IMPROVES RESILIENCY TO NOISE

PULSE-TRAIN BASED TIME-DELAY ESTIMATION IMPROVES RESILIENCY TO NOISE PULSE-TRAIN BASED TIME-DELAY ESTIMATION IMPROVES RESILIENCY TO NOISE 1 Nicola Neretti, 1 Nathan Intrator and 1,2 Leon N Cooper 1 Institute for Brain and Neural Systes, Brown University, Providence RI 02912.

More information

Randomized Recovery for Boolean Compressed Sensing

Randomized Recovery for Boolean Compressed Sensing Randoized Recovery for Boolean Copressed Sensing Mitra Fatei and Martin Vetterli Laboratory of Audiovisual Counication École Polytechnique Fédéral de Lausanne (EPFL) Eail: {itra.fatei, artin.vetterli}@epfl.ch

More information

Algorithms for parallel processor scheduling with distinct due windows and unit-time jobs

Algorithms for parallel processor scheduling with distinct due windows and unit-time jobs BULLETIN OF THE POLISH ACADEMY OF SCIENCES TECHNICAL SCIENCES Vol. 57, No. 3, 2009 Algoriths for parallel processor scheduling with distinct due windows and unit-tie obs A. JANIAK 1, W.A. JANIAK 2, and

More information

Compression and Predictive Distributions for Large Alphabet i.i.d and Markov models

Compression and Predictive Distributions for Large Alphabet i.i.d and Markov models 2014 IEEE International Syposiu on Inforation Theory Copression and Predictive Distributions for Large Alphabet i.i.d and Markov odels Xiao Yang Departent of Statistics Yale University New Haven, CT, 06511

More information

Machine Learning: Fisher s Linear Discriminant. Lecture 05

Machine Learning: Fisher s Linear Discriminant. Lecture 05 Machine Learning: Fisher s Linear Discriinant Lecture 05 Razvan C. Bunescu chool of Electrical Engineering and Coputer cience bunescu@ohio.edu Lecture 05 upervised Learning ask learn an (unkon) function

More information

13.2 Fully Polynomial Randomized Approximation Scheme for Permanent of Random 0-1 Matrices

13.2 Fully Polynomial Randomized Approximation Scheme for Permanent of Random 0-1 Matrices CS71 Randoness & Coputation Spring 018 Instructor: Alistair Sinclair Lecture 13: February 7 Disclaier: These notes have not been subjected to the usual scrutiny accorded to foral publications. They ay

More information

OPTIMIZATION in multi-agent networks has attracted

OPTIMIZATION in multi-agent networks has attracted Distributed constrained optiization and consensus in uncertain networks via proxial iniization Kostas Margellos, Alessandro Falsone, Sione Garatti and Maria Prandini arxiv:603.039v3 [ath.oc] 3 May 07 Abstract

More information

Lecture 12: Ensemble Methods. Introduction. Weighted Majority. Mixture of Experts/Committee. Σ k α k =1. Isabelle Guyon

Lecture 12: Ensemble Methods. Introduction. Weighted Majority. Mixture of Experts/Committee. Σ k α k =1. Isabelle Guyon Lecture 2: Enseble Methods Isabelle Guyon guyoni@inf.ethz.ch Introduction Book Chapter 7 Weighted Majority Mixture of Experts/Coittee Assue K experts f, f 2, f K (base learners) x f (x) Each expert akes

More information

Extension of CSRSM for the Parametric Study of the Face Stability of Pressurized Tunnels

Extension of CSRSM for the Parametric Study of the Face Stability of Pressurized Tunnels Extension of CSRSM for the Paraetric Study of the Face Stability of Pressurized Tunnels Guilhe Mollon 1, Daniel Dias 2, and Abdul-Haid Soubra 3, M.ASCE 1 LGCIE, INSA Lyon, Université de Lyon, Doaine scientifique

More information