
This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the author's institution and sharing with colleagues. Other uses, including reproduction and distribution, or selling or licensing copies, or posting to personal, institutional or third party websites are prohibited. In most cases authors are permitted to post their version of the article (e.g. in Word or Tex form) to their personal website or institutional repository. Authors requiring further information regarding Elsevier's archiving and manuscript policies are encouraged to visit:

Computer Methods and Programs in Biomedicine
Journal homepage: h.com/journals/cmpb

Privacy-preserving Kruskal-Wallis test

Suxin Guo (a), Sheng Zhong (b), Aidong Zhang (a)
(a) Department of Computer Science and Engineering, SUNY at Buffalo, United States
(b) State Key Laboratory for Novel Software Technology, Nanjing University, China

Article info
Article history: Received 4 January 2013; received in revised form 17 May 2013; accepted 28 May 2013
Keywords: Data security; Statistical test; Kruskal-Wallis test

Abstract
Statistical tests are powerful tools for data analysis. The Kruskal-Wallis test is a non-parametric statistical test that evaluates whether two or more samples are drawn from the same distribution. It is commonly used in various areas. But sometimes, the use of the method is impeded by privacy issues raised in fields such as biomedical research and clinical data analysis because of the confidential information contained in the data. In this work, we give a privacy-preserving solution for the Kruskal-Wallis test which enables two or more parties to jointly perform the test on the union of their data without compromising their data privacy. To the best of our knowledge, this is the first work that solves the privacy issues in the use of the Kruskal-Wallis test on distributed data.
© 2013 Elsevier Ireland Ltd. All rights reserved.

1. Introduction

Statistical hypothesis tests are very widely used for data analysis. Some popular statistical tests include the t-test [1], ANOVA [2], the Kruskal-Wallis test [3], and the Wilcoxon rank sum test [4]. Although these four tests differ, they serve the same goal, which is to find out whether the samples come from the same population. The t-test and ANOVA are parametric tests and assume that the data are normally distributed.
The non-parametric equivalents of these two tests are the Wilcoxon rank sum test, which is also known as the Mann-Whitney U test [5], and the Kruskal-Wallis test, respectively. They do not assume the data to be normally distributed. The t-test can only compare two samples, and ANOVA extends it to multiple samples. Similarly, the Kruskal-Wallis test generalizes the Wilcoxon rank sum test from two samples to multiple samples. As stated above, the four tests do similar things under different assumptions. The non-parametric tests perform better when the data is not normally distributed, and are especially suitable when the data size is small (<25 per sample group) [6].

Although the Kruskal-Wallis test is a helpful tool in many areas, its use is sometimes impeded by privacy concerns due to the confidential information in the data, especially in clinical and biomedical research. For example, suppose some hospitals conducted a study and measured the INR (International Normalized Ratio) values of their patients, so that each hospital holds a set of INR values. The hospitals want to perform the Kruskal-Wallis test to check whether their values follow the same trend. In this case, the set of INR values of each hospital is treated as a sample. To conduct the Kruskal-Wallis test, all samples must be known, which means the hospitals have to share their data with each other. The problem is that it might be improper for the hospitals to share their samples because the data contains private information of patients. Currently there is no method that enables the Kruskal-Wallis test to be conducted on such distributed data with privacy concerns.

To solve this problem, we propose a privacy-preserving algorithm that allows the Kruskal-Wallis test to be applied to samples distributed among different parties without revealing each party's private information to others. Due to the similarity among non-parametric tests, our method can also help the design of privacy-preserving solutions for other non-parametric tests. For example, the Wilcoxon rank sum test and the Kruskal-Wallis test are used for two samples and for two or more samples, respectively, and are essentially the same in the two-sample case [3]. So our algorithm also solves the privacy issue of the Wilcoxon rank sum test to some extent.

The rest of this paper is organized as follows: In Section 2, we present the related work. Section 3 provides the technical preliminaries including the background knowledge about the Kruskal-Wallis test and the cryptographic tools we need. We propose the basic algorithm and the complete algorithm in Sections 4 and 5, respectively. The basic algorithm shows the procedure of conducting the Kruskal-Wallis test securely when there is no tie in the data. The complete algorithm follows the basic algorithm and takes the existence of ties into consideration. In Section 6, we present the experimental results and finally, Section 7 concludes the paper.

2. Related work

In recent years, due to the increasing awareness of privacy problems, many data analysis methods have been enhanced to be privacy-preserving, including many popular data mining and machine learning algorithms. Most of these approaches can be divided into two categories. Approaches in the first category protect data privacy with data perturbation techniques, such as randomization [7,8], rotation [9] and resampling [10]. Since the original data is changed, these approaches usually lose some accuracy.
The methods in the second category are generally based on Secure Multiparty Computation (SMC) and apply cryptographic techniques to protect data during the computations [11,12]. Such methods usually cause no accuracy loss but have higher computational cost. In our case, since the Kruskal-Wallis test is often used on small data sets, we choose the second way, which is to protect privacy with cryptographic tools. It enables us to achieve higher accuracy with an affordable computational cost.

In the cryptographic category, some SMC tools are very commonly used, such as secure sum [13], secure comparison [14,15], secure division [16], secure scalar product [13,16,17], secure matrix multiplication [18-20], and secure set operations [13]. Many data mining and machine learning algorithms have been extended with privacy solutions, such as decision tree classification [11,21], k-means clustering [22,23], and gradient descent methods [24], but only a few works have studied the privacy issues in statistical tests. Ref. [25] gives a privacy-preserving algorithm to compare survival curves with the logrank test. Ref. [26] presents a privacy-preserving solution to perform the permutation test securely on distributed data. No existing work studies the privacy issues of the Kruskal-Wallis test on distributed data. To the best of our knowledge, our work is the first one.

3. Technical preliminaries

3.1. The Kruskal-Wallis test

We first review the Kruskal-Wallis test in this section. The test as proposed by Kruskal and Wallis [3] evaluates whether two or more samples are from the same distribution. The null hypothesis is that all the samples come from the same distribution. Suppose we have k samples, each containing a set of values. To perform the Kruskal-Wallis test, we first rank all the values together without considering which sample the values belong to, then compute the sum of the ranks of the values within every sample, so that each sample has its sum of ranks.
If there is no tie among the values, the test statistic is:

$$H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1), \quad (1)$$

where $N$ is the total number of values in all samples, $n_i$ is the number of values contained in the $i$th sample, and $R_i$ is the sum of ranks in the $i$th sample. After the calculation of $H$, we compare it to a value $\chi^2_{\alpha:k-1}$ which can be found in a table of the chi-squared probability distribution with $k-1$ as the degrees of freedom and $\alpha$ as the desired significance. If $H \geq \chi^2_{\alpha:k-1}$, the hypothesis is rejected. Otherwise, the hypothesis is not rejected.

If there are ties in the values, the calculation of the test statistic should be changed slightly. First, when ranking all the values, the ranks of each group of tied values are given as the average of the ranks that these tied values would have received without ties. For example, suppose we have values {1, 3, 3, 5} with one tie of two 3's. Without considering the tie, their ranks should be 1, 2, 3, 4, respectively. After we change the ranks of the tied values to their average, the ranks become 1, 2.5, 2.5, 4. Then we can compute $H$ with these new ranks. Besides the adjustment of ranks, we also need to divide $H$ by:

$$C = 1 - \frac{\sum_{i=1}^{g} (t_i^3 - t_i)}{N^3 - N}, \quad (2)$$

where $g$ is the number of groups of tied values, and $t_i$ is the number of tied values in the $i$th group. For the above example {1, 3, 3, 5}, we have only 1 group of 2 tied values, so $g = 1$ and $t_1 = 2$. To sum up, when ties exist, we need to adjust the ranks of tied values, and the test statistic is:

$$H_c = \frac{H}{C}. \quad (3)$$

Actually Eq. (3) is the general solution that holds whether or not there are ties. If there is no tie, $C = 1$ and thus $H_c = H$.
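As a plaintext reference for Eqs. (1)-(3), the following minimal sketch computes $H$, $C$ and $H_c$ on pooled data with no privacy protection at all. The function names `average_ranks` and `kruskal_wallis` are illustrative choices, not names from the paper.

```python
from collections import Counter

def average_ranks(values):
    """Map each distinct value to its rank; tied values share the average rank."""
    ordered = sorted(values)
    counts = Counter(ordered)
    first_pos = {}
    for pos, v in enumerate(ordered, start=1):
        first_pos.setdefault(v, pos)  # smallest rank of the tie group of v
    # Ranks of a tie group run from first_pos to first_pos + t - 1.
    return {v: first_pos[v] + (counts[v] - 1) / 2 for v in counts}, counts

def kruskal_wallis(samples):
    """Return (H, C, H_c) following Eqs. (1), (2) and (3)."""
    pooled = [v for s in samples for v in s]
    N = len(pooled)
    rank_of, counts = average_ranks(pooled)
    rank_sums = [sum(rank_of[v] for v in s) for s in samples]
    H = 12 / (N * (N + 1)) * sum(
        R * R / len(s) for R, s in zip(rank_sums, samples)
    ) - 3 * (N + 1)
    C = 1 - sum(t**3 - t for t in counts.values()) / (N**3 - N)
    return H, C, H / C
```

For the running example {1, 3, 3, 5}, split for illustration into the two samples {1, 3} and {3, 5}, the sketch gives $C = 0.9$, matching $g = 1$ and $t_1 = 2$ in Eq. (2).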

3.2. Privacy protection of the Kruskal-Wallis test

As in the hospital example mentioned in the introduction, we assume that each party has a sample and that the parties hope to conduct a Kruskal-Wallis test jointly to find out whether their samples follow the same distribution without revealing their data to others. Our solution is based on the semi-honest model, which is widely used in the cryptographic category of privacy-preserving methods [27,11,28,29,24,13,16,30]. In this model, all parties strictly follow the protocol, but can attempt to derive the private information of other parties from the intermediate results they get during the execution of protocols.

3.3. Cryptographic tools

3.3.1. Homomorphic cryptographic scheme

An additive homomorphic asymmetric cryptographic system is used to encrypt and decrypt the data in our work. A cryptographic scheme that encrypts integer $x$ as $E(x)$ is additive homomorphic if there are operators $\oplus$ and $\otimes$ such that for any two integers $x_1$, $x_2$ and a constant $a$, we have

$$E(x_1 + x_2) = E(x_1) \oplus E(x_2), \qquad E(a \times x_1) = a \otimes E(x_1).$$

This means that, with an additive homomorphic cryptographic system, we can compute the encrypted sum of integers directly from their encryptions. We do not need to decrypt the integers and compute the sum. In an asymmetric cryptographic system, we have a pair of keys: a public key for encryption and a private key for decryption.

3.3.2. ElGamal cryptographic system

There are several additive homomorphic cryptographic schemes [30,31]. In this work, we apply a variant of the ElGamal scheme [32], which is semantically secure under the Diffie-Hellman Assumption [33]. The ElGamal cryptographic system is a multiplicative homomorphic asymmetric cryptographic system. With this system, the encryption of a cleartext $m$ is the pair

$$E(m) = (m \times y^r, \; g^r),$$

where $g$ is a generator, $x$ is the private key, $y$ is the public key with $y = g^x$, and $r$ is a random integer.
We call the first part of the pair $c_1$ and the second part $c_2$, so $c_1 = m \times y^r$ and $c_2 = g^r$. To decrypt $E(m)$, we compute $s = c_2^x = g^{rx} = g^{xr} = y^r$. Then we compute $c_1 \times s^{-1} = m \times y^r \times y^{-r}$ and get the cleartext $m$.

In the variant of the ElGamal scheme we use, the cleartext $m$ is encrypted as

$$E(m) = (g^m \times y^r, \; g^r).$$

The only difference between the original ElGamal scheme and this variant is that $m$ in the first part is changed to $g^m$. With this change, the variant is an additive homomorphic cryptosystem such that:

$$E(x_1 + x_2) = E(x_1) \times E(x_2), \qquad E(a \times x_1) = E(x_1)^a.$$

To decrypt $E(m)$, we follow the same procedure as in the original ElGamal algorithm. But this time, after the above calculations, we obtain $g^m$ instead of $m$. To get $m$ from $g^m$, we need to perform an exhaustive search, which is to try every possible $m$ and look for the one that matches $g^m$. Please note that this exhaustive search is limited to a small range of possible plaintexts only, so the time needed is reasonable.

In our work, the private key is shared by all the parties and no party knows the complete private key. The parties need to coordinate with each other to do the decryptions, and the ciphertexts can be exposed to any party, because no party can decrypt them without the help of others. The private key is shared in this way: suppose there are two parties, A and B. A holds one part of the private key, $x_1$, and B holds the other part, $x_2$, such that $x_1 + x_2 = x$, where $x$ is the complete private key. In the decryption, we need to compute $s = c_2^x = c_2^{x_1 + x_2} = c_2^{x_1} \times c_2^{x_2}$. Party A computes $s_1 = c_2^{x_1}$ and party B computes $s_2 = c_2^{x_2}$, so that $s = s_1 \times s_2$. We need to compute $c_1 \times s^{-1} = c_1 \times (s_1 \times s_2)^{-1} = c_1 \times s_1^{-1} \times s_2^{-1}$. Party A computes $c_1 \times s_1^{-1}$ and sends it to party B. Then party B computes $c_1 \times s_1^{-1} \times s_2^{-1} = c_1 \times s^{-1} = g^m$ and sends it to A. In this way both parties can get the decrypted result. Since party B performs its decryption step later, it gets the final result earlier.
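The ElGamal variant and the split-key decryption described above can be illustrated with a toy instance. This is a minimal sketch under stated assumptions: the modulus $2^{31} - 1$ and generator 7 are illustrative choices that are far too small for real use, and the names `keygen_shared`, `encrypt`, `add`, `scale` and `shared_decrypt` are hypothetical, not taken from the paper.

```python
import secrets

P = 2**31 - 1      # toy prime modulus (assumption; insecure at this size)
G = 7              # 7 is a primitive root modulo 2**31 - 1
ORDER = P - 1      # exponents are reduced modulo the group order

def keygen_shared():
    """Each party picks a key share; the public key is y = g^(x1 + x2)."""
    x1 = secrets.randbelow(ORDER)
    x2 = secrets.randbelow(ORDER)
    return x1, x2, pow(G, (x1 + x2) % ORDER, P)

def encrypt(m, y):
    """E(m) = (g^m * y^r, g^r) for a fresh random r."""
    r = secrets.randbelow(ORDER)
    return (pow(G, m, P) * pow(y, r, P) % P, pow(G, r, P))

def add(ca, cb):
    """E(x1 + x2) = E(x1) * E(x2), multiplied component-wise."""
    return (ca[0] * cb[0] % P, ca[1] * cb[1] % P)

def scale(c, a):
    """E(a * x) = E(x)^a, raised component-wise."""
    return (pow(c[0], a, P), pow(c[1], a, P))

def shared_decrypt(c, x1, x2, max_m=10_000):
    """Party A strips s1, party B strips s2, then g^m is searched exhaustively."""
    c1, c2 = c
    partial = c1 * pow(pow(c2, x1, P), -1, P) % P   # computed by A, sent to B
    gm = partial * pow(pow(c2, x2, P), -1, P) % P   # computed by B
    acc = 1
    for m in range(max_m + 1):                      # small plaintext range only
        if acc == gm:
            return m
        acc = acc * G % P
    raise ValueError("plaintext outside the search range")
```

The three-argument `pow` with exponent -1 requires Python 3.8 or later. The bound `max_m` stands in for the "small range of possible plaintexts" mentioned in the text.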
If it does not send the result to A, the decrypted result is known only to party B. The order of the parties can be changed, so if we need the result to be known to only one party, that party should perform its decryption step last.

3.3.3. Secure comparison

We apply the secure comparison protocol proposed in [15] to compare two values from different parties securely. The inputs of this algorithm are two integers $a$ and $b$ which are held by different parties. The output is an encryption of 1 if $a > b$, or an encryption of 0 otherwise.

The basic idea of the secure comparison algorithm is as follows. Let the binary representations of $a$ and $b$ be $a_l, \ldots, a_1$ and $b_l, \ldots, b_1$, where $a_1$ and $b_1$ are the least significant bits. If $a > b$, there is a pivot bit $i$ such that $b_i - a_i + 1 = 0$ and $a_j \text{ XOR } b_j = a_j + b_j - 2 a_j b_j = 0$ for every $i < j \leq l$. The method applies homomorphic encryption to check whether the pivot bit exists.

This method can find out whether $a > b$, but it cannot find out whether $a \geq b$ directly. So when we want to know whether $a \geq b$, we compare $2a + 1$ and $2b$ instead of $a$ and $b$. If $2a + 1 > 2b$, since both $a$ and $b$ are integers, we can derive that $a \geq b$.

4. The basic algorithm of privacy-preserving Kruskal-Wallis test

In this part, we present the basic algorithm for computing the $H$ statistic of the Kruskal-Wallis test securely without considering the existence of ties. The complete algorithm that also deals with ties will be discussed in the next section. To make the presentation clear, we first give the algorithm for performing the test between two parties, then extend it to the multiparty case.

Suppose there are two parties, A and B. Party A has sample $S_1$ which contains $n_1$ values, and party B has sample $S_2$ which contains $n_2$ values. The total number of values is $N = n_1 + n_2$. The basic structure of the algorithm is as follows:

1. For each value in each party, count how many values in its own party (including itself) are smaller than or equal to it. Encrypt these counts.
2. For each value in each party, compare it with all the values in the other party using the secure comparison algorithm. Then by adding the comparison results up, count how many values in the other party are smaller than or equal to it. Since the results of the secure comparison algorithm are in cipher text, these counts are also in cipher text.
3. For each value in each party, add the above two counts securely so we can get the total number of values in both parties that are smaller than or equal to it, which is its rank in cipher text. Then for each party, add all the encrypted ranks of its values; this is the encrypted rank sum of this party. Call the rank sums of the two parties $R_1$ and $R_2$, respectively.
4. With the encrypted rank sums of both parties, compute the $H$ statistic with Eq. (1). Here comes a problem: to calculate $H$, we need the squared rank sums of both parties, $R_1^2$ and $R_2^2$. Since we only have the encrypted rank sums $E(R_1)$ and $E(R_2)$, we have to compute $E(R_1^2)$ and $E(R_2^2)$ from $E(R_1)$ and $E(R_2)$. This is not easy because we are using an additive homomorphic system, which does not support the direct multiplication of two encrypted integers. So we need to develop an algorithm to solve it.

Let us explain each step in detail.

4.1. Secure computation of the rank sums

To compute the rank of one value, we just need to count how many values in both parties are smaller than or equal to it.
For example, with values {5, 6, 7}, the rank of value 5 is 1, because only 1 value is smaller than or equal to it, which is itself ($5 \leq 5$). The rank of 6 is 2 since there are 2 values smaller than or equal to it ($5 \leq 6$ and $6 \leq 6$). Similarly, the rank of 7 is 3.

For each value in each party, counting how many values in its own party are smaller than or equal to it is simple: the party compares it with all of its own values. Counting the number of smaller or equal values in the other party is not that straightforward. We also need to compare the value with all values in the other party, and these comparisons must be conducted with the secure comparison algorithm.

Suppose the values in party A are $a_1, a_2, \ldots, a_{n_1}$, and the values in party B are $b_1, b_2, \ldots, b_{n_2}$. For each value $a_i$ ($i = 1, 2, \ldots, n_1$), we need to compare it with every value in party B with the secure comparison protocol. After these $n_2$ secure comparisons, we have $n_2$ results, and each of them is an encryption of 0 or 1 ($E(0)$ or $E(1)$). For each value $b_j$ ($j = 1, 2, \ldots, n_2$), the comparison result between $a_i$ and $b_j$ is $E(1)$ if $a_i \geq b_j$ and $E(0)$ otherwise. Since the results are in cipher text, no party knows what they are. The sum of the $n_2$ results is the encrypted number of values in party B that are smaller than or equal to $a_i$. We call it $E(R_B(a_i))$. The number of values in party A that are smaller than or equal to $a_i$ can be easily computed. It is named $R_A(a_i)$. We encrypt it and get $E(R_A(a_i))$. The sum of $R_A(a_i)$ and $R_B(a_i)$ is the rank of $a_i$, which is $R(a_i)$. The encryption of this rank, $E(R(a_i))$, can be computed from $E(R_A(a_i))$ and $E(R_B(a_i))$ with the additive homomorphic system that we utilize. In this way, we can get the encryptions of the ranks of all values from both parties: $E(R(a_1)), E(R(a_2)), \ldots, E(R(a_{n_1})), E(R(b_1)), E(R(b_2)), \ldots, E(R(b_{n_2}))$.
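The counting rule can be checked in cleartext. The sketch below is only a plaintext reference for the quantities $R_A(a_i)$, $R_B(a_i)$ and $R(a_i)$; in the protocol the cross-party count is obtained from secure comparisons and stays encrypted. The function name `ranks_by_counting` is an illustrative choice.

```python
def ranks_by_counting(own, other):
    """Rank of each value in `own`: how many values in both samples are <= it."""
    ranks = []
    for v in own:
        local = sum(1 for u in own if u <= v)      # R_A(a_i), computed locally
        remote = sum(1 for u in other if u <= v)   # R_B(a_i), secure comparisons in the protocol
        ranks.append(local + remote)               # R(a_i) = R_A(a_i) + R_B(a_i)
    return ranks
```

With the values {5, 6, 7} split so that one party holds {5, 7} and the other holds {6}, the two calls return the ranks 1, 3 and 2, and the rank sums add up to $N(N+1)/2 = 6$, as they must when there are no ties.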
Then $E(R_1)$ and $E(R_2)$, which are the encryptions of the rank sums of party A and B, respectively, can be computed from them because $R_1 = R(a_1) + R(a_2) + \cdots + R(a_{n_1})$ and $R_2 = R(b_1) + R(b_2) + \cdots + R(b_{n_2})$.

4.2. Secure computation of the squared rank sums

We need to compute $E(R_1^2)$ and $E(R_2^2)$ from $E(R_1)$ and $E(R_2)$. Since the additive homomorphic cryptosystem does not support the direct multiplication of two encrypted integers, here we present an algorithm to solve it.

To compute $E(ab)$ from $E(a)$ and $E(b)$ that are known to both parties, first we need to make one of the integers additively shared by the two parties. For example, we make $a$ additively shared such that party A holds an integer $a_A$ and party B holds an integer $a_B$ with $a_A + a_B = a$. The shares $a_A$ and $a_B$ can be obtained from $E(a)$ in this way: party A randomly generates an integer $a_A$ and computes $E(a_A)$. Then $E(a - a_A) = E(a_B)$ can be computed from $E(a)$ and $E(a_A)$ by party A. A sends it to party B and the two parties coordinate with each other to decrypt $E(a_B)$. During the decryption, we make sure that the decryption result $a_B$ is known only to party B. This can be achieved with the cryptographic system that we use, as explained in Section 3.

After A gets $a_A$ and B gets $a_B$, the two parties can compute $E(a_A b)$ and $E(a_B b)$, respectively. This can be done with the additive homomorphic system from $a_A$, $a_B$ and $E(b)$ because $a_A$ and $a_B$ are both integers in plaintext. What we want is $E(ab) = E((a_A + a_B) b) = E(a_A b + a_B b)$. Since $E(a_A b)$ is held by party A and $E(a_B b)$ is held by party B, the two parties should exchange their values so that both of them can compute the final result $E(ab)$. But exchanging the values directly may cause privacy loss. For example, if party A gives $E(a_A b)$ to party B, since $E(a_A b) = E(b)^{a_A}$ with the variant of the ElGamal system we use, and $E(b)$ is known to party B, party B can derive some information about $a_A$ from $E(a_A b)$.
So before the two parties calculate $E(a_A b)$ and $E(a_B b)$ and exchange their values, they rerandomize their copies of $E(b)$. Rerandomization changes the random number $r$ used in the encryption, so the resulting ciphertext differs from the original one. To make the presentation clear, we call the rerandomized $E(b)$ in party A and party B $E'(b)$ and $E''(b)$, respectively. Then parties A and B calculate $E'(a_A b) = E'(b)^{a_A}$ and $E''(a_B b) = E''(b)^{a_B}$, respectively, and exchange these values. Since the encryptions are changed, the parties cannot derive information from the value they get from each other. For example, although party B gets $E'(a_A b)$ from A, $E'(a_A b) = E'(b)^{a_A}$ and party B does not know $E'(b)$ because it is the rerandomization done by party A. So B cannot derive $a_A$. After the exchange, both parties hold $E'(a_A b)$ and $E''(a_B b)$, and each of them can compute $E(ab) = E(a_A b + a_B b)$ by itself. The rerandomizations do not affect the calculations of the encrypted sums. In this way, both parties can get $E(ab)$ from $E(a)$ and $E(b)$. Algorithm 1 shows the main procedure of this encrypted multiplication.

Algorithm 1. Encrypted multiplication of two integers
Input. Encryptions of integers $a$ and $b$, $E(a)$ and $E(b)$, that are known to both parties;
Output. The encryption of $a \times b$, $E(ab)$;
1: Party A generates a random integer $a_A$ and computes $E(a_A)$;
2: Party A computes $E(a - a_A)$ and sends it to party B;
3: The two parties jointly decrypt $E(a - a_A)$ and only party B gets the result $a - a_A = a_B$;
4: Parties A and B rerandomize $E(b)$ and get $E'(b)$ and $E''(b)$, respectively;
5: Parties A and B calculate $E'(a_A b)$ and $E''(a_B b)$, respectively, and exchange the two values;
6: Parties A and B compute $E(ab) = E(a_A b + a_B b)$ by themselves;

4.3. Secure computation of H

With Algorithm 1 we can get $E(R_1^2)$ and $E(R_2^2)$ from $E(R_1)$ and $E(R_2)$. Because we assume there are two parties, the $H$ statistic is calculated as:

$$H = \frac{12}{N(N+1)} \left( \frac{R_1^2}{n_1} + \frac{R_2^2}{n_2} \right) - 3(N+1),$$

where $N$, $n_1$ and $n_2$ are constants known to both parties. From $E(R_1^2)$ and $E(R_2^2)$, both parties can compute $E(R_1^2 n_2 + R_2^2 n_1)$. They then jointly decrypt it and get $R_1^2 n_2 + R_2^2 n_1$. The final result is calculated as:

$$H = \frac{12}{N(N+1)\, n_1 n_2} \left( R_1^2 n_2 + R_2^2 n_1 \right) - 3(N+1).$$

The reason why we compute $R_1^2 n_2 + R_2^2 n_1$ and then divide it by $n_1 n_2$, instead of computing $R_1^2/n_1 + R_2^2/n_2$ directly, is that the cryptographic system we use only supports operations on non-negative integers.
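Algorithm 1 can be sketched end to end on a toy instance of the additive ElGamal variant. The code below is an illustration under stated assumptions, not the authors' implementation: the group parameters ($2^{31} - 1$ with generator 7) are insecure toy values, all function names are hypothetical, and the random share $a_A$ is drawn from a small range so that the exhaustive-search decryption of $a - a_A$ stays feasible (this section does not state how the paper bounds that range). Both parties are simulated in one process.

```python
import secrets

P, G = 2**31 - 1, 7          # toy group (assumption; insecure at this size)
ORDER = P - 1

def encrypt(m, y):
    """E(m) = (g^m * y^r, g^r); m is reduced modulo the group order."""
    r = secrets.randbelow(ORDER)
    return (pow(G, m % ORDER, P) * pow(y, r, P) % P, pow(G, r, P))

def add(ca, cb):
    return (ca[0] * cb[0] % P, ca[1] * cb[1] % P)

def scale(c, a):
    a %= ORDER               # a negative scalar becomes its residue mod ORDER
    return (pow(c[0], a, P), pow(c[1], a, P))

def rerandomize(c, y):
    """Multiply by a fresh E(0): same plaintext, new random number r."""
    return add(c, encrypt(0, y))

def decrypt(c, x1, x2, bound):
    """Joint decryption with key shares x1, x2; searches [-bound, bound]."""
    s = pow(c[1], x1, P) * pow(c[1], x2, P) % P
    gm = c[0] * pow(s, -1, P) % P
    for m in range(-bound, bound + 1):
        if pow(G, m % ORDER, P) == gm:
            return m
    raise ValueError("plaintext outside the search range")

def encrypted_multiply(Ea, Eb, y, x1, x2, bound):
    a_A = secrets.randbelow(bound)                  # step 1: A's random share
    E_aB = add(Ea, scale(encrypt(a_A, y), -1))      # step 2: E(a - a_A)
    a_B = decrypt(E_aB, x1, x2, bound)              # step 3: only B learns a_B
    Eb_A = rerandomize(Eb, y)                       # step 4: E'(b) at party A
    Eb_B = rerandomize(Eb, y)                       #         E''(b) at party B
    part_A = scale(Eb_A, a_A)                       # step 5: E'(a_A b)
    part_B = scale(Eb_B, a_B)                       #         E''(a_B b)
    return add(part_A, part_B)                      # step 6: E(ab)
```

Calling `encrypted_multiply(ER, ER, ...)` with the same ciphertext for both arguments yields an encryption of the squared rank sum, which is the use made of Algorithm 1 in this section.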
To avoid decimal fractions in the encryptions, we compute $R_1^2 n_2 + R_2^2 n_1$ and apply the division after the decryption.

4.4. The summarized algorithm

The main steps of the algorithm are summarized in Algorithm 2.

Algorithm 2. The basic algorithm of privacy-preserving Kruskal-Wallis test
Input. Party A has sample $S_1$ which contains $n_1$ values, and party B has sample $S_2$ which contains $n_2$ values. The total number of values is $N = n_1 + n_2$;
Output. The statistic $H$;
1: for each value $a_i$ in party A do
2: Calculate its encrypted rank $E(R(a_i))$;
3: end for
4: for each value $b_j$ in party B do
5: Calculate its encrypted rank $E(R(b_j))$;
6: end for
7: Compute the encrypted rank sum of each party, $E(R_1)$ and $E(R_2)$, where $R_1 = \sum_{i=1}^{n_1} R(a_i)$ and $R_2 = \sum_{j=1}^{n_2} R(b_j)$;
8: Calculate $E(R_1^2)$ and $E(R_2^2)$ from $E(R_1)$ and $E(R_2)$ with Algorithm 1;
9: Calculate $E(R_1^2 n_2 + R_2^2 n_1)$ and decrypt it;
10: Compute $H$ from $R_1^2 n_2 + R_2^2 n_1$;

4.5. Extension to multiparty

The extension of the algorithm from two parties to multiple parties is straightforward. For each value in each party, to get its rank in the two-party case, we count the number of values that are smaller than or equal to it in its own party and in the other party. To count the number in the other party, we need the secure comparison protocol. Similarly, in the multiparty case, we count the number of values that are smaller than or equal to it in its own party and in every other party with the help of the secure comparison protocol.

After the computation of encrypted ranks for every value in every party, the encrypted rank sums are calculated, just as in the two-party case. Then the encrypted squared rank sums $E(R_1^2), E(R_2^2), \ldots, E(R_k^2)$ can be computed with Algorithm 1. They are known to all the parties. Just as we compute $E(R_1^2 n_2 + R_2^2 n_1)$ when there are two parties, for $k$ parties, $E(R_1^2 n_2 n_3 \cdots n_k + R_2^2 n_1 n_3 \cdots n_k + \cdots + R_k^2 n_1 n_2 \cdots n_{k-1})$ is computed.
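The cross-multiplied quantity that the parties decrypt, and the recovery of $H$ from it, can be written out in cleartext for any number of parties. This is a minimal sketch with hypothetical function names; it uses exact fractions so that the integer-only aggregate and the final division are kept visibly separate.

```python
from fractions import Fraction
from math import prod

def integer_aggregate(rank_sums, sizes):
    """sum_i R_i^2 * prod_{j != i} n_j, the integer the parties would decrypt."""
    total = prod(sizes)
    # total // n is exact because every n divides the product of all sizes.
    return sum(R * R * (total // n) for R, n in zip(rank_sums, sizes))

def h_from_aggregate(aggregate, sizes):
    """H = 12 * aggregate / (N (N + 1) n_1 ... n_k) - 3 (N + 1)."""
    N = sum(sizes)
    return Fraction(12 * aggregate, N * (N + 1) * prod(sizes)) - 3 * (N + 1)
```

For two parties with rank sums 3 and 7 and sizes 2 and 2, the aggregate is $R_1^2 n_2 + R_2^2 n_1 = 116$ and the recovered statistic is $H = 12/5$, the same value Eq. (1) gives directly.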
We decrypt it and divide the decrypted result by $n_1 n_2 \cdots n_k$ instead of by $n_1 n_2$ as in the two-party case. Then the final result $H$ is calculated.

5. The complete algorithm of privacy-preserving Kruskal-Wallis test

In this section, we present the privacy-preserving Kruskal-Wallis test that takes ties into account.

5.1. Modifying the data to eliminate ties

Before we explain the complete algorithm, we give a simpler method to deal with the tied values: modify the values slightly to eliminate the ties and then apply the basic algorithm to the modified data. Since the data is modified a little, this method causes a slight accuracy loss.

To eliminate ties between parties, we do the following. If there are two parties, for every value in the first party, multiply it by 10 and then add 0 to it. For every value in the second party, multiply it by 10 and then add 1 to it. For example, suppose $a_i$ belongs to the first party and $b_j$ belongs to the second party. We set $a_i' = 10 a_i + 0$ and $b_j' = 10 b_j + 1$. In this way, the ties between the two parties are eliminated and the ranks of other values are not affected. If there are more than two parties, the data is modified similarly depending on the number of parties. For example, if there are ten parties, we still multiply every value in every party by 10 and add 0 to the values in the first party, 1 to the values in the second party, and so on up to 9 for the tenth party. If there are 100 parties, we multiply every value by 100 and add 0 to 99 to the values of the first to the 100th party, respectively.

To deal with the ties within parties, we do not need to modify the data. We can ignore these ties when calculating the ranks. For example, suppose one party has three values, {1, 1, 1}. With our algorithm, the ranks are calculated by counting the number of smaller or equal values. For these three values, the numbers of smaller or equal values in their own party are 3, 3 and 3. We change them to 1, 2 and 3, respectively. This can be done easily because every party has the information about ties within it. After changing the local counts, the counts of smaller or equal values from other parties are added to get the rank. The ranks do not contain any tie because both the ties within the local party and the ties between parties are disregarded. After the modifications, we can apply the basic algorithm that deals with data without ties.

5.2. The complete algorithm

Here we present the complete algorithm that works for data containing ties.
Similar to the previous section, the algorithm is first presented under the assumption that there are only two parties and then extended to the multiparty case. As mentioned in Section 3, when there are ties in the data, the calculation of the statistic changes in two respects: the ranks of the tied values should be adjusted when computing $H$, and $H$ should be divided by $C$. Both are discussed in detail below.

5.2.1. Adjustment of the ranks of tied values

The ranks of each group of tied values should be changed to the average of the ranks that these tied values would have received without ties. We use an example to show the basic idea of this adjustment.

Suppose there are values {1, 2, 3, 4, 4, 4, 4, 4} that are distributed in two samples held by two parties, respectively. Party A has sample $S_1$ which contains values {1, 2, 4, 4} and party B has sample $S_2$ which contains values {3, 4, 4, 4}. Without considering the tie, the ranks of the values {1, 2, 3, 4, 4, 4, 4, 4} are 1, 2, 3, 4, 5, 6, 7, 8, respectively. The five 4's are tied and their ranks are 4, 5, 6, 7, 8. The largest rank in this tie is 8 and the smallest rank is 4. The average of the ranks is 6 and it can be calculated by taking the average of the largest rank 8 and the smallest rank 4. This is because the ranks of the values in a tie form an arithmetic sequence, so the average of all values in the sequence is the same as the average of the smallest and the largest values. After changing the ranks of the tied values to their average, the ranks should be 1, 2, 3, 6, 6, 6, 6, 6.

In our algorithm, since we calculate the rank of each value by counting the values that are smaller than or equal to it, the ranks are 1, 2, 3, 8, 8, 8, 8, 8 because for each 4, there are 8 values smaller than or equal to it. So with our algorithm, the rank of each group of tied values is actually the largest rank in the tie.
We need to add some steps to our algorithm to change the ranks from 1, 2, 3, 8, 8, 8, 8, 8 to 1, 2, 3, 6, 6, 6, 6, 6.

The basic idea is this: since the rank of the values in each tie is the largest rank in the tie, we only need to get the smallest rank in the tie and take the average of the largest rank and the smallest rank. To get the smallest rank from the largest rank, we need to know the number of values in the tie. With the largest rank named $l$, the smallest rank named $s$, and the number of values in the tie named $t$, we have $s = l - t + 1$. In our example, the tie contains 5 values with the largest rank 8 and the smallest rank 4, and indeed $8 - 5 + 1 = 4$. So, to change the ranks from 1, 2, 3, 8, 8, 8, 8, 8 to 1, 2, 3, 6, 6, 6, 6, 6, we need to get the number of values in each tie, then compute the smallest rank in each tie, and take the average of the largest and the smallest ranks.

We assume that each value is in a tie and calculate the number of values in each value's tie. In our example, value 1 is in a tie that contains only 1 value, and so are values 2 and 3. Each value 4 is in a tie that contains 5 values. So for values {1, 2, 3, 4, 4, 4, 4, 4}, we have 1, 1, 1, 5, 5, 5, 5, 5 as the number of values in each value's tie. Then for each value, we compute the smallest rank in its tie with $s = l - t + 1$. For value 1, the smallest rank is $1 - 1 + 1 = 1$. For value 2, the smallest rank is $2 - 1 + 1 = 2$. For value 3, the smallest rank is $3 - 1 + 1 = 3$. For each value 4, the smallest rank is $8 - 5 + 1 = 4$. So the smallest ranks for the eight values are 1, 2, 3, 4, 4, 4, 4, 4. With the largest ranks 1, 2, 3, 8, 8, 8, 8, 8, we get the averaged ranks 1, 2, 3, 6, 6, 6, 6, 6. We can see that for values 1, 2 and 3, which are not tied, assuming that they are in ties containing 1 value does not affect the calculation of their ranks.
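The adjustment rule $s = l - t + 1$ can be checked in cleartext on the pooled values. This sketch is a plaintext illustration only; in the protocol $l$ and $t$ are not available to any single party. The function name `adjusted_ranks` is an illustrative choice.

```python
def adjusted_ranks(values):
    """Average rank of every value, treating each value as a member of a tie."""
    ranks = []
    for v in values:
        l = sum(1 for u in values if u <= v)   # largest rank in the tie of v
        t = sum(1 for u in values if u == v)   # number of values in the tie
        s = l - t + 1                          # smallest rank in the tie
        ranks.append((l + s) / 2)              # average of the two extremes
    return ranks
```

For {1, 2, 3, 4, 4, 4, 4, 4} this reproduces the averaged ranks 1, 2, 3, 6, 6, 6, 6, 6 from the worked example, and for {1, 3, 3, 5} it gives 1, 2.5, 2.5, 4 as in Section 3.1.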
The reason we make this assumption is that, although we show all the values, ranks and tie sizes together in cleartext to make the idea easier to understand, in the real setting they are encrypted or distributed and no party has complete information about them. So no party knows whether a value is in a tie or not. For example, party A has one value 1 and this value is not in a tie within party A. But A does not know whether party B has the value 1, so A does not know whether the value 1 is in a tie globally. Therefore all values are assumed to be in a tie.

Having explained the basic idea of the rank adjustment, let us show the steps by which the two parties perform the adjustment securely. We follow the basic algorithm in Section 4 to get the rank of each value in each party. Here the ranks are the numbers of smaller or equal values, which are the largest ranks of each tie. To count the smaller or equal values for a value $a_i$ in party A, it is compared with the values in both party A and party B. When comparing $a_i$ with the values in party A, we also count the number of values in party A that are equal to $a_i$ and name it $T_A(a_i)$. As mentioned in Section 4, when $a_i$ is compared securely with every value in party B, each comparison result is an encryption of 0 or 1: if $b_j \le a_i$, the comparison result between $b_j$ and $a_i$ is $E(1)$, and otherwise it is $E(0)$. The sum of these results is the encrypted number of values in party B that are smaller than or equal to $a_i$. Here we keep the comparison results for every pair of $a_i$ and $b_j$ in an $n_1 \times n_2$ table, such that the element in the row of $a_i$ and the column of $b_j$ is the comparison result between $a_i$ and $b_j$, which is $E(1)$ if $b_j \le a_i$ and $E(0)$ otherwise. Table 1 is an example of such an $n_1 \times n_2$ table.

Table 1. An example table.

          b_1    ...    b_j     ...    b_n2
a_1
...
a_i                     E(1)
...
a_n1

Similarly, to count the smaller or equal values for a value $b_j$ in party B, we compare it with the values in both party A and party B. When comparing $b_j$ with the values in party B, we also count the number of values in party B that are equal to $b_j$ and name it $T_B(b_j)$. When $b_j$ is compared securely with the values in party A, the comparison result is defined differently from the previous case: the comparison result between $b_j$ and $a_i$ is $E(1)$ if $a_i \le b_j$ and $E(0)$ otherwise. We also keep these comparison results in an $n_1 \times n_2$ table.

The two tables storing the comparison results are not the same. In the first table, the value in the row of $a_i$ and the column of $b_j$ is $E(1)$ if $b_j \le a_i$ and $E(0)$ otherwise, while in the second table the value in the row of $a_i$ and the column of $b_j$ is $E(1)$ if $b_j \ge a_i$ and $E(0)$ otherwise. We now introduce a third $n_1 \times n_2$ table in which each element is the secure sum of the two corresponding elements in the first and second tables. For example, if the value in the row of $a_i$ and the column of $b_j$ is $E(1)$ in the first table and $E(0)$ in the second table, then the corresponding value in the third table is $E(1 + 0)$. The values in the third table are either $E(1)$ or $E(2)$. If $a_i < b_j$, the value in the second table is $E(1)$ and the value in the first table is $E(0)$. Thus, the value in the third table is $E(1)$. If $a_i > b_j$, the value in the first table is $E(1)$ and the value in the second table is $E(0)$.
Thus, the value in the third table is also $E(1)$. If $a_i = b_j$, both values in the first and second tables are $E(1)$ and the value in the third table is $E(2)$. To sum up, the value in the row of $a_i$ and the column of $b_j$ in the third table is $E(1)$ if $a_i \ne b_j$ and $E(2)$ if $a_i = b_j$. We securely subtract 1 from every element in the third table. Then the value in the row of $a_i$ and the column of $b_j$ in the new table is $E(0)$ if $a_i \ne b_j$ and $E(1)$ if $a_i = b_j$. This new table contains the information about equal values between the two parties. The sum of all the values in the row of $a_i$ is the encrypted number of values in party B that are equal to $a_i$, which we name $E(T_B(a_i))$. The sum of all the values in the column of $b_j$ is the encrypted number of values in party A that are equal to $b_j$, which we name $E(T_A(b_j))$. Since parties A and B have computed $T_A(a_i)$ and $T_B(b_j)$, respectively, these two numbers can be encrypted and added to $E(T_B(a_i))$ and $E(T_A(b_j))$, respectively, to get $E(T(a_i)) = E(T_A(a_i) + T_B(a_i))$ and $E(T(b_j)) = E(T_A(b_j) + T_B(b_j))$.

For each value $a_i$ ($i = 1, 2, \ldots, n_1$) in party A, we have $E(R(a_i))$, which is the encrypted largest rank in $a_i$'s tie, and $E(T(a_i))$, which is the encrypted number of values in $a_i$'s tie, that is, the number of values equal to $a_i$ in both parties. For each value $b_j$ ($j = 1, 2, \ldots, n_2$) in party B, we have the corresponding numbers $E(R(b_j))$ and $E(T(b_j))$. To get the averaged rank for each value, we need to know the smallest rank in each value's tie. The smallest ranks can be calculated from the largest ranks and the numbers of values in the ties. For each value $a_i$ ($i = 1, 2, \ldots, n_1$) in party A, the encrypted smallest rank $E(S(a_i))$ in $a_i$'s tie is $E(R(a_i) - T(a_i) + 1)$, and the encrypted adjusted rank of $a_i$ is $E((S(a_i) + R(a_i))/2)$, which is the average of the largest and the smallest rank.
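The table arithmetic described above (add the two comparison tables, subtract 1, then sum rows and columns) can be sketched with any additively homomorphic scheme. This section does not name the cryptosystem, so the sketch below swaps in a toy Paillier instance purely as an assumption; the tiny primes are insecure, all function names are hypothetical, and the comparison bits are computed in cleartext as stand-ins for the secure comparison protocol of Section 4.

```python
import math
import random

# Toy Paillier parameters (assumption for illustration; NOT secure).
P, Q = 1789, 1861
N = P * Q
N2 = N * N
G = N + 1
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)  # valid shortcut because G = N + 1


def enc(m):
    """Encrypt integer m (reduced mod N) with fresh randomness."""
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (pow(G, m % N, N2) * pow(r, N, N2)) % N2


def dec(c):
    """Decrypt a ciphertext back to an integer in [0, N)."""
    return ((pow(c, LAM, N2) - 1) // N * MU) % N


def add(c1, c2):
    """Homomorphic addition: E(x) * E(y) mod N^2 = E(x + y)."""
    return (c1 * c2) % N2


def total(ciphertexts):
    """Homomorphic sum of a sequence of ciphertexts."""
    acc = enc(0)
    for c in ciphertexts:
        acc = add(acc, c)
    return acc


def equality_table(first, second):
    """Entrywise E(first + second - 1): E(1) where a_i = b_j, else E(0)."""
    return [[add(add(x, y), enc(-1)) for x, y in zip(row1, row2)]
            for row1, row2 in zip(first, second)]


a, b = [1, 2, 4, 4], [3, 4, 4, 4]
# Stand-ins for the secure comparison outputs (computed in cleartext here).
first = [[enc(int(bj <= ai)) for bj in b] for ai in a]   # E(1) if b_j <= a_i
second = [[enc(int(ai <= bj)) for bj in b] for ai in a]  # E(1) if a_i <= b_j
eq = equality_table(first, second)

t_b_of_a = [dec(total(row)) for row in eq]        # T_B(a_i): row sums
t_a_of_b = [dec(total(col)) for col in zip(*eq)]  # T_A(b_j): column sums
print(t_b_of_a, t_a_of_b)  # [0, 0, 3, 3] [0, 2, 2, 2]
```

The decryption at the end is only there to make the result visible; in the protocol these sums would stay encrypted and be combined with the locally known counts.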
To avoid decimal fractions in the ciphertext, we only calculate $E(S(a_i) + R(a_i))$, and the division by 2 is applied after the final decryption. For each value $b_j$ ($j = 1, 2, \ldots, n_2$) in party B, the encrypted smallest rank $E(S(b_j))$ in $b_j$'s tie is $E(R(b_j) - T(b_j) + 1)$ and the encrypted adjusted rank of $b_j$ is $E((S(b_j) + R(b_j))/2)$. Again we calculate $E(S(b_j) + R(b_j))$ and apply the division by 2 after the final decryption. In this way, we can adjust the rank of every value, and the rank sums are calculated from these new ranks. Note that if a value is not tied with any other value, the adjustment does not change its rank. The complete algorithm for calculating $H$ is summarized in Algorithm 3.

Algorithm 3. The complete algorithm of the privacy-preserving Kruskal-Wallis test
Input: Party A has sample $S_1$, which contains $n_1$ values, and party B has sample $S_2$, which contains $n_2$ values. The total number of values is $N = n_1 + n_2$.
Output: The statistic $H$.
for each value $a_i$ in party A do
    Calculate its encrypted rank $E(R(a_i))$ and record the secure comparison results;
end for
for each value $b_j$ in party B do
    Calculate its encrypted rank $E(R(b_j))$ and record the secure comparison results;
end for
From the secure comparison results, get the information about equal values between the two parties;
for each value $a_i$ in party A do
    Calculate the encrypted number of values equal to it, $E(T(a_i))$;
    Calculate the encrypted smallest rank in its tie, $E(S(a_i))$, from $E(T(a_i))$ and $E(R(a_i))$;
    Calculate its encrypted averaged rank;
end for
for each value $b_j$ in party B do
    Calculate the encrypted number of values equal to it, $E(T(b_j))$;
    Calculate the encrypted smallest rank in its tie, $E(S(b_j))$, from $E(T(b_j))$ and $E(R(b_j))$;
    Calculate its encrypted averaged rank;
end for
Do the remaining calculations to compute $H$ as in Algorithm 2 with the encrypted averaged ranks;
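A cleartext reference for the data flow of Algorithm 3 may help when checking an implementation. The sketch below is not the secure protocol: it works on plaintext values, the function names are hypothetical, and the formula for $H$ is the standard textbook Kruskal-Wallis statistic, which is assumed here to match the one used by Algorithm 2 (not reproduced in this section). It keeps the doubled rank $S + R$ as an integer and defers the division by 2, mirroring the step described above.

```python
from fractions import Fraction


def doubled_adjusted_ranks(sample, pooled):
    """Return S(v) + R(v) for each v in sample (twice the averaged rank)."""
    out = []
    for v in sample:
        r = sum(1 for x in pooled if x <= v)  # R(v): largest rank in v's tie
        t = sum(1 for x in pooled if x == v)  # T(v): size of v's tie
        s = r - t + 1                         # S(v): smallest rank in v's tie
        out.append(s + r)                     # stays an integer
    return out


def kruskal_wallis_h(samples):
    """Uncorrected H from tie-averaged ranks (division by C not applied)."""
    pooled = [v for sample in samples for v in sample]
    n_total = len(pooled)
    acc = Fraction(0)
    for sample in samples:
        doubled_sum = sum(doubled_adjusted_ranks(sample, pooled))
        rank_sum = Fraction(doubled_sum, 2)   # the deferred division by 2
        acc += rank_sum ** 2 / len(sample)
    return Fraction(12, n_total * (n_total + 1)) * acc - 3 * (n_total + 1)


s1, s2 = [1, 2, 4, 4], [3, 4, 4, 4]
print(doubled_adjusted_ranks(s1, s1 + s2))  # [2, 4, 12, 12]
print(kruskal_wallis_h([s1, s2]))           # 3/4
```

For the running example the adjusted rank sums are 15 and 21, which add up to $N(N+1)/2 = 36$ as expected.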

To extend the adjustment from two parties to multiple parties, we just need to create a table containing the information about equal values for each pair of parties during the computation of the ranks. For each value, the encrypted number of values equal to it is calculated by collecting information from all the tables in which it is involved. Then the encrypted smallest rank in its tie and the averaged rank can be computed, and the following steps are the same as in the multiparty extension of Algorithm 2.

Calculation of C

In most cases, dividing $H$ by $C$ makes little change in the final result. If no more than 1/4 of the values are tied, the division does not change the result by more than 10% for some degrees of freedom and significance levels [3]. To calculate $C$ securely for two parties A and B, we need the tie information computed during the adjustment of the ranks: $E(T(a_i))$ for each value $a_i$ ($i = 1, 2, \ldots, n_1$) in party A and $E(T(b_j))$ for each value $b_j$ ($j = 1, 2, \ldots, n_2$) in party B. From Eq. (2), we have

$$C = 1 - \frac{\sum_i (t_i^3 - t_i)}{N^3 - N},$$

where $t_i$ is the number of values in the $i$th tie. To compute $C$ securely, we treat $T(a_i)$ of each distinct $a_i$ and $T(b_j)$ of each distinct $b_j$ as $t_i$. For the values that are not tied with others, their $T$ values are equal to 1 and $1^3 - 1 = 0$, so adding them does not affect the value of $C$. For the tied values, their $T$ values should be considered just once in the calculation of $C$, so we consider the $T$ values of the distinct values in each party. In the example used before, where party A has the values {1, 2, 4, 4} and party B has the values {3, 4, 4, 4}, for party A we only consider $T(1) = 1$, $T(2) = 1$ and $T(4) = 5$. For party B, we consider $T(3) = 1$ and $T(4) = 5$. Here all the $T$ values are encrypted and no party knows the exact numbers. $C$ can be securely computed from the encryptions of the $t_i$.
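As a cleartext check of the formula for $C$ on the running example, the sketch below evaluates it three ways: with the true set of ties, with the $T$ values of party A's distinct values only, and with the $T$ values of both parties' distinct values. It is a plaintext illustration with hypothetical names, not the secure computation.

```python
from collections import Counter
from fractions import Fraction


def correction(tie_sizes, n_total):
    """C = 1 - sum(t^3 - t) / (N^3 - N) for the given tie sizes."""
    numerator = sum(t ** 3 - t for t in tie_sizes)
    return 1 - Fraction(numerator, n_total ** 3 - n_total)


a, b = [1, 2, 4, 4], [3, 4, 4, 4]
pooled = a + b
n_total = len(pooled)
global_tie = Counter(pooled)  # T(v) over both parties

exact = correction(global_tie.values(), n_total)
# T of each distinct value, listed per party as described in the text.
per_party_a = [global_tie[v] for v in set(a)]  # T(1), T(2), T(4)
per_party_b = [global_tie[v] for v in set(b)]  # T(3), T(4)
a_only = correction(per_party_a, n_total)
a_and_b = correction(per_party_a + per_party_b, n_total)
print(exact, a_only, a_and_b)  # 16/21 16/21 11/21
```

Listing $T(4)$ for both parties enlarges the sum and therefore gives a smaller value of $C$. Using party A alone happens to reproduce the exact value here only because this example has no tie that lies entirely within party B.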
$E(t_i^3)$ is calculated from $E(t_i)$ with Algorithm 1, and then $E(\sum_i (t_i^3 - t_i))$ can be computed. The problem is that, although only the $T$ values of the distinct values in each party are included in the calculation of $C$, there are still duplicates. Considering only the distinct values in each party ensures that the ties within a party are counted only once, but it cannot eliminate the duplicated ties between parties. In the above example, $T(4)$ is counted twice because the tie of value 4 exists in both parties. We call the set of ties that exist only in party A $T_A$, the set of ties that exist only in party B $T_B$, and the set of ties that exist in both parties $T_{AB}$. We want the information about $T_A$, $T_B$ and $T_{AB}$ to be included in $C$ just once. With the above solution, $T_{AB}$ is counted twice. If we consider only $T(a_i)$ for each value $a_i$ ($i = 1, 2, \ldots, n_1$) in party A and do not add $T(b_j)$ for each value $b_j$ ($j = 1, 2, \ldots, n_2$) in party B, then $T_A$ and $T_{AB}$ are considered once but the information of $T_B$ is lost. We cannot add the information of $T_B$ alone without adding $T_{AB}$, because no party knows whether a tie in it is local or global. We have not worked out a solution that calculates $C$ exactly. The two solutions mentioned above either add extra tie information or lose some tie information when calculating $C$. But together they give a range for $C$ by providing an upper bound and a lower bound, which limits the loss of accuracy.

Table 2. The BMI dataset.

Asians       Indians      Malays
32 (15)      26.4 (11)    24.9 (8)
30.1 (14)    23.1 (2)     25.3 (9)
27.6 (12)    23.5 (4)     23.8 (5)
26.2 (10)    24.6 (7)     22.1 (1)
28.2 (13)    24.3 (6)     23.4 (3)

We use an example to show the extension of the calculation of $C$ from two parties to the multiparty case. Suppose there are three parties, A, B and C. Similar to the two-party case, we denote by $T_A$, $T_B$ and $T_C$ the sets of ties that exist only in party A, B and C, respectively. $T_{AB}$ is the set of ties that exist in parties A and B. $T_{AC}$, $T_{BC}$ and $T_{ABC}$ are defined in the same way.
For each pair of parties, we have a table storing the information about tied values between the two parties. The three tables are named Table(AB), Table(AC) and Table(BC), respectively. We collect the tie information of each distinct value in party A from all the tables that involve A, which are Table(AB) and Table(AC). This gives us the information about all the ties that appear in party A, which are $T_A$, $T_{AB}$, $T_{AC}$ and $T_{ABC}$. Then we disregard party A and the tables involving A, and collect the tie information of each distinct value in party B from all the remaining tables that involve B, which is only Table(BC). With this step, we add the information about all the ties that appear in party B but not in A, which are $T_B$ and $T_{BC}$. Then we encounter the same problem as in the two-party case: if we stop here, the tie information of $T_C$ is lost; if we add the tie information of each distinct value in party C from a table involving C, such as Table(AC), both $T_C$ and $T_{AC}$ are added, and thus $T_{AC}$ is counted twice.

When there are $k$ parties, we follow the same procedure: we get the information about all the ties that appear in the first party, then add the information about the ties in the second party, and so on. When it comes to the last party, we either lose the information about ties appearing only in the last party, or add duplicate information about ties appearing in both the last party and some other party. This gives us an upper bound and a lower bound of $C$.

6. Experiments

The experimental results are presented in this section. All the algorithms are implemented in C++ with the Crypto++ library, and the communication between parties is implemented with the socket API. The experiments are conducted on a Red Hat server with GHz CPUs and 24 GB of memory. We use the two datasets from [34] to test the accuracy of our algorithms. The first dataset, shown in Table 2, contains 3 samples of equal size.
Note that the term "sample" in this paper is used differently from many other papers. Each sample here is the set of data held by one party, and the number of samples is the number of parties. In this dataset, the data are simulated Body Mass Index (BMI) values for subjects of 3 different races from a suburb of San Francisco. Here the BMI values for the subjects of each race form a sample. There is no tie in this dataset, and the rank of every value is given in parentheses.
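The ranks in parentheses can be reproduced with the counting rule used throughout the paper. The sketch below is a cleartext check only. The values are those of Table 2 multiplied by 10 so that they are integers, which is the scaling the experiments apply; their grouping into three samples follows a row-by-row reading of the table and should be treated as an assumption.

```python
# BMI values from Table 2, multiplied by 10 and stored as integers.
samples = {
    "Asians": [320, 301, 276, 262, 282],
    "Indians": [264, 231, 235, 246, 243],
    "Malays": [249, 253, 238, 221, 234],
}
pooled = [v for values in samples.values() for v in values]


def rank(v):
    """Rank of v = number of pooled values smaller than or equal to v."""
    return sum(1 for x in pooled if x <= v)


ranks = {name: [rank(v) for v in values] for name, values in samples.items()}
print(ranks["Asians"])   # [15, 14, 12, 10, 13]
print(ranks["Indians"])  # [11, 2, 4, 7, 6]
print(ranks["Malays"])   # [8, 9, 5, 1, 3]
```

Because the dataset has no ties, each rank from 1 to 15 appears exactly once and no adjustment is needed.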

Table 3. The INR dataset.

Hospital A    Hospital B    Hospital C     Hospital D
1.68 (1)      1.71 (6)      1.74 (13.5)    1.71 (6)
1.69 (2)      1.73 (10)     1.75 (16)      1.71 (6)
1.70 (3.5)    1.74 (13.5)   1.77 (18)      1.74 (13.5)
1.70 (3.5)    1.74 (13.5)   1.78 (20)      1.79 (22)
1.72 (8)      1.78 (20)     1.80 (23.5)    1.81 (26)
1.73 (10)     1.78 (20)     1.81 (26)      1.85 (29)
1.73 (10)     1.80 (23.5)   1.84 (28)      1.87 (30)
1.76 (17)     1.81 (26)     1.91 (31)

The second dataset is presented in Table 3. It contains 4 samples whose sizes are not all equal. Each sample is a set of simulated International Normalized Ratio (INR) values of patients in one hospital. The ranks are given in parentheses. There are ties in the data, and tied values share the same averaged rank (for example, every value of 1.74 has rank 13.5). Since our secure algorithm only deals with non-negative integers, each value in dataset 1 is multiplied by 10 and each value in dataset 2 is multiplied by 100. This step changes all the values to non-negative integers without changing the ranks of the values, so it does not affect the result of the Kruskal-Wallis test, which is calculated from the ranks.

The accuracy of our basic algorithm for data without ties is 100%. This is shown with dataset 1. We provide the $H$ values calculated in both the two-party and multiparty scenarios in Table 4. In the two-party case, we take the first two samples of dataset 1 and calculate the $H$ value on these two samples. In the multiparty case, the $H$ value is calculated on all three samples of dataset 1.

Our algorithms for data with ties cause some accuracy loss. There are two methods to deal with tied values. The first is to modify the data slightly to eliminate the ties and then compute $H$ with the basic algorithm. Accuracy loss occurs because the data is changed. The second method is to keep the data unchanged, but adjust the ranks and divide $H$ by $C$. Here the accuracy loss comes from the calculation of $C$.
Because we can compute an upper bound and a lower bound for $C$, we can also get an upper bound and a lower bound for the final result $H_c$. We test the two methods with dataset 2 and the results are shown in Table 5. Here we again take the first two samples of dataset 2 to test the two-party case and all four samples of dataset 2 to test the multiparty case. As the results show, the second method has better accuracy than the first one. In the two-party case, although the first two samples of dataset 2 contain many ties (9 out of 16 values are in ties), both bounds are very close to the accurate result. In the multiparty case, both the upper and lower bounds are equal to the accurate result. This is because the two bounds are calculated by either disregarding the ties that exist only in the last sample, or counting the ties between the last and the first samples twice. Fortunately, in this dataset, the last sample does not contain any tie that exists only in it, and there is no tie between the last sample and the first sample. So with this dataset, the two bounds are equal to the accurate result.

We now show the computation overheads of the algorithms. In Fig. 1 we compare the running times of the proposed algorithms for different data sizes under the two-party scenario. The running times are in seconds. The execution times of the basic algorithm for data without ties and of the first method for data with ties are very close. This is because the first method of dealing with ties eliminates the ties and then follows the same procedure as the basic algorithm. The second method for data with ties takes more time than the first one, mostly because the adjustment of ranks takes time. We also show the overheads in the multiparty case with datasets 1 and 2.
The execution time of the basic algorithm on dataset 1 is:

Running time for 2 samples: 5 s
Running time for 3 samples: 17 s

The execution time of the first method for data containing ties on dataset 2 is:

Running time for 2 samples: 15 s
Running time for 3 samples: 67 s
Running time for 4 samples: 599 s

The execution time of the second method for data containing ties on dataset 2 is:

Running time for 2 samples: 26 s
Running time for 3 samples: 169 s
Running time for 4 samples: 2159 s

Table 4. Kruskal-Wallis test result on data without ties. Columns: 2 samples, 3 samples. Rows: $H$ calculated by the original Kruskal-Wallis test; $H$ calculated by our basic algorithm.

Table 5. Kruskal-Wallis test result on data with ties. Columns: 2 samples, 4 samples. Rows: $H_c$ calculated by the original Kruskal-Wallis test; $H$ calculated from modified data (the first method); the upper bound of $H_c$ (the second method); the lower bound of $H_c$ (the second method).

A Privacy Preserving Markov Model for Sequence Classification

A Privacy Preserving Markov Model for Sequence Classification A Privacy Preserving Markov Model for Sequence Classification Suxin Guo Department of Computer Science and Engineering SUNY at Buffalo Buffalo 14260 U.S.A. suxinguo@buffalo.edu Sheng Zhong State Key Laboratory

More information

Cryptography and Network Security Prof. D. Mukhopadhyay Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur

Cryptography and Network Security Prof. D. Mukhopadhyay Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Cryptography and Network Security Prof. D. Mukhopadhyay Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Module No. # 01 Lecture No. # 33 The Diffie-Hellman Problem

More information

Privacy Preserving Calculation of Fisher Criterion Score for Informative Gene Selection

Privacy Preserving Calculation of Fisher Criterion Score for Informative Gene Selection Privacy Preserving Calculation of Fisher Criterion Score for Informative Gene Selection Suxin Guo, Sheng Zhong, and Aidong Zhang Department of Computer Science and Engineering, State University of New

More information

4/6/16. Non-parametric Test. Overview. Stephen Opiyo. Distinguish Parametric and Nonparametric Test Procedures

4/6/16. Non-parametric Test. Overview. Stephen Opiyo. Distinguish Parametric and Nonparametric Test Procedures Non-parametric Test Stephen Opiyo Overview Distinguish Parametric and Nonparametric Test Procedures Explain commonly used Nonparametric Test Procedures Perform Hypothesis Tests Using Nonparametric Procedures

More information

Privacy Preserving Calculation of Fisher Criterion Score for Informative Gene Selection

Privacy Preserving Calculation of Fisher Criterion Score for Informative Gene Selection Privacy Preserving Calculation of Fisher Criterion Score for Informative Gene Selection Suxin Guo 1, Sheng Zhong 2, and Aidong Zhang 1 1 Department of Computer Science and Engineering, SUNY at Buffalo,

More information

Agonistic Display in Betta splendens: Data Analysis I. Betta splendens Research: Parametric or Non-parametric Data?

Agonistic Display in Betta splendens: Data Analysis I. Betta splendens Research: Parametric or Non-parametric Data? Agonistic Display in Betta splendens: Data Analysis By Joanna Weremjiwicz, Simeon Yurek, and Dana Krempels Once you have collected data with your ethogram, you are ready to analyze that data to see whether

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research education use, including for instruction at the authors institution

More information

PSY 307 Statistics for the Behavioral Sciences. Chapter 20 Tests for Ranked Data, Choosing Statistical Tests

PSY 307 Statistics for the Behavioral Sciences. Chapter 20 Tests for Ranked Data, Choosing Statistical Tests PSY 307 Statistics for the Behavioral Sciences Chapter 20 Tests for Ranked Data, Choosing Statistical Tests What To Do with Non-normal Distributions Tranformations (pg 382): The shape of the distribution

More information

Lecture Slides. Elementary Statistics. by Mario F. Triola. and the Triola Statistics Series

Lecture Slides. Elementary Statistics. by Mario F. Triola. and the Triola Statistics Series Lecture Slides Elementary Statistics Tenth Edition and the Triola Statistics Series by Mario F. Triola Slide 1 Chapter 13 Nonparametric Statistics 13-1 Overview 13-2 Sign Test 13-3 Wilcoxon Signed-Ranks

More information

Lecture Slides. Section 13-1 Overview. Elementary Statistics Tenth Edition. Chapter 13 Nonparametric Statistics. by Mario F.

Lecture Slides. Section 13-1 Overview. Elementary Statistics Tenth Edition. Chapter 13 Nonparametric Statistics. by Mario F. Lecture Slides Elementary Statistics Tenth Edition and the Triola Statistics Series by Mario F. Triola Slide 1 Chapter 13 Nonparametric Statistics 13-1 Overview 13-2 Sign Test 13-3 Wilcoxon Signed-Ranks

More information

Question: Total Points: Score:

Question: Total Points: Score: University of California, Irvine COMPSCI 134: Elements of Cryptography and Computer and Network Security Midterm Exam (Fall 2016) Duration: 90 minutes November 2, 2016, 7pm-8:30pm Name (First, Last): Please

More information

1 Secure two-party computation

1 Secure two-party computation CSCI 5440: Cryptography Lecture 7 The Chinese University of Hong Kong, Spring 2018 26 and 27 February 2018 In the first half of the course we covered the basic cryptographic primitives that enable secure

More information

Benny Pinkas Bar Ilan University

Benny Pinkas Bar Ilan University Winter School on Bar-Ilan University, Israel 30/1/2011-1/2/2011 Bar-Ilan University Benny Pinkas Bar Ilan University 1 Extending OT [IKNP] Is fully simulatable Depends on a non-standard security assumption

More information

Data Analysis: Agonistic Display in Betta splendens I. Betta splendens Research: Parametric or Non-parametric Data?

Data Analysis: Agonistic Display in Betta splendens I. Betta splendens Research: Parametric or Non-parametric Data? Data Analysis: Agonistic Display in Betta splendens By Joanna Weremjiwicz, Simeon Yurek, and Dana Krempels Once you have collected data with your ethogram, you are ready to analyze that data to see whether

More information

Lecture Notes, Week 6

Lecture Notes, Week 6 YALE UNIVERSITY DEPARTMENT OF COMPUTER SCIENCE CPSC 467b: Cryptography and Computer Security Week 6 (rev. 3) Professor M. J. Fischer February 15 & 17, 2005 1 RSA Security Lecture Notes, Week 6 Several

More information

CHI SQUARE ANALYSIS 8/18/2011 HYPOTHESIS TESTS SO FAR PARAMETRIC VS. NON-PARAMETRIC

CHI SQUARE ANALYSIS 8/18/2011 HYPOTHESIS TESTS SO FAR PARAMETRIC VS. NON-PARAMETRIC CHI SQUARE ANALYSIS I N T R O D U C T I O N T O N O N - P A R A M E T R I C A N A L Y S E S HYPOTHESIS TESTS SO FAR We ve discussed One-sample t-test Dependent Sample t-tests Independent Samples t-tests

More information

Nonparametric Statistics

Nonparametric Statistics Nonparametric Statistics Nonparametric or Distribution-free statistics: used when data are ordinal (i.e., rankings) used when ratio/interval data are not normally distributed (data are converted to ranks)

More information

Privacy-preserving Data Mining

Privacy-preserving Data Mining Privacy-preserving Data Mining What is [data] privacy? Privacy and Data Mining Privacy-preserving Data mining: main approaches Anonymization Obfuscation Cryptographic hiding Challenges Definition of privacy

More information

Privacy-Preserving Data Imputation

Privacy-Preserving Data Imputation Privacy-Preserving Data Imputation Geetha Jagannathan Stevens Institute of Technology Hoboken, NJ, 07030, USA gjaganna@cs.stevens.edu Rebecca N. Wright Stevens Institute of Technology Hoboken, NJ, 07030,

More information

8 Elliptic Curve Cryptography

8 Elliptic Curve Cryptography 8 Elliptic Curve Cryptography 8.1 Elliptic Curves over a Finite Field For the purposes of cryptography, we want to consider an elliptic curve defined over a finite field F p = Z/pZ for p a prime. Given

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

Security Issues in Cloud Computing Modern Cryptography II Asymmetric Cryptography

Security Issues in Cloud Computing Modern Cryptography II Asymmetric Cryptography Security Issues in Cloud Computing Modern Cryptography II Asymmetric Cryptography Peter Schwabe October 21 and 28, 2011 So far we assumed that Alice and Bob both have some key, which nobody else has. How

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

Public-Key Encryption: ElGamal, RSA, Rabin

Public-Key Encryption: ElGamal, RSA, Rabin Public-Key Encryption: ElGamal, RSA, Rabin Introduction to Modern Cryptography Benny Applebaum Tel-Aviv University Fall Semester, 2011 12 Public-Key Encryption Syntax Encryption algorithm: E. Decryption

More information

Leveraging Randomness in Structure to Enable Efficient Distributed Data Analytics

Leveraging Randomness in Structure to Enable Efficient Distributed Data Analytics Leveraging Randomness in Structure to Enable Efficient Distributed Data Analytics Jaideep Vaidya (jsvaidya@rbs.rutgers.edu) Joint work with Basit Shafiq, Wei Fan, Danish Mehmood, and David Lorenzi Distributed

More information

University of Regina Department of Mathematics & Statistics Final Examination (April 21, 2009)

University of Regina Department of Mathematics & Statistics Final Examination (April 21, 2009) Make sure that this examination has 10 numbered pages University of Regina Department of Mathematics & Statistics Final Examination 200910 (April 21, 2009) Mathematics 124 The Art and Science of Secret

More information

Lecture 1: Introduction to Public key cryptography

Lecture 1: Introduction to Public key cryptography Lecture 1: Introduction to Public key cryptography Thomas Johansson T. Johansson (Lund University) 1 / 44 Key distribution Symmetric key cryptography: Alice and Bob share a common secret key. Some means

More information

Cryptanalysis on An ElGamal-Like Cryptosystem for Encrypting Large Messages

Cryptanalysis on An ElGamal-Like Cryptosystem for Encrypting Large Messages Cryptanalysis on An ElGamal-Like Cryptosystem for Encrypting Large Messages MEI-NA WANG Institute for Information Industry Networks and Multimedia Institute TAIWAN, R.O.C. myrawang@iii.org.tw SUNG-MING

More information

Nonparametric statistic methods. Waraphon Phimpraphai DVM, PhD Department of Veterinary Public Health

Nonparametric statistic methods. Waraphon Phimpraphai DVM, PhD Department of Veterinary Public Health Nonparametric statistic methods Waraphon Phimpraphai DVM, PhD Department of Veterinary Public Health Measurement What are the 4 levels of measurement discussed? 1. Nominal or Classificatory Scale Gender,

More information

Public Key Cryptography

Public Key Cryptography Public Key Cryptography Introduction Public Key Cryptography Unlike symmetric key, there is no need for Alice and Bob to share a common secret Alice can convey her public key to Bob in a public communication:

More information

This article was published in an Elsevier journal. The attached copy is furnished to the author for non-commercial research and education use, including for instruction at the author s institution, sharing

More information

Non-parametric tests, part A:

Non-parametric tests, part A: Two types of statistical test: Non-parametric tests, part A: Parametric tests: Based on assumption that the data have certain characteristics or "parameters": Results are only valid if (a) the data are

More information

Non-parametric (Distribution-free) approaches p188 CN

Non-parametric (Distribution-free) approaches p188 CN Week 1: Introduction to some nonparametric and computer intensive (re-sampling) approaches: the sign test, Wilcoxon tests and multi-sample extensions, Spearman s rank correlation; the Bootstrap. (ch14

More information

Cryptography Lecture 4 Block ciphers, DES, breaking DES

Cryptography Lecture 4 Block ciphers, DES, breaking DES Cryptography Lecture 4 Block ciphers, DES, breaking DES Breaking a cipher Eavesdropper recieves n cryptograms created from n plaintexts in sequence, using the same key Redundancy exists in the messages

More information

Chapter 15: Nonparametric Statistics Section 15.1: An Overview of Nonparametric Statistics

Chapter 15: Nonparametric Statistics Section 15.1: An Overview of Nonparametric Statistics Section 15.1: An Overview of Nonparametric Statistics Understand Difference between Parametric and Nonparametric Statistical Procedures Parametric statistical procedures inferential procedures that rely

More information

Introduction to Modern Cryptography. Benny Chor

Introduction to Modern Cryptography. Benny Chor Introduction to Modern Cryptography Benny Chor RSA Public Key Encryption Factoring Algorithms Lecture 7 Tel-Aviv University Revised March 1st, 2008 Reminder: The Prime Number Theorem Let π(x) denote the

More information
