Y-STR: Haplotype Frequency Estimation and Evidence Calculation
1 Y-STR: Haplotype Frequency Estimation and Evidence Calculation. Mikkel, MSc Student. Supervised by associate professor Poul Svante Eriksen, Department of Mathematical Sciences, Aalborg University, Denmark. June 16th 2010, Presentation of Master of Science Thesis.
2 Outline: 1. Short biological recap. 2. Motivation and aims of using Y-STR. 3. Haplotype frequency estimation (3.1 methods and comments, 3.2 model control). 4. Evidence calculation. 5. Further work.
3 Biological framework. Example of (autosomal) STR DNA-type: ({14, 13}, {22}, {15, 16}, ..., {22}). Example of Y-STR DNA-type: (17, 14, 22, ..., 15).
4 Motivation. In some situations Y-STR is more sensible to use than A-STR (autosomal STR), e.g. to avoid noise in the trace from a rape victim. Y-STR and A-STR differ in several areas, e.g. the number of alleles at each locus and the statistical dependence between loci. The statistical methods developed to handle A-STR cannot be applied to Y-STR directly, so reformulation is required (e.g. for calculating evidence) or new methods must be developed (to estimate Y-STR haplotype frequencies).
5 Aims. Be able to calculate statistical evidence in trials. Estimating frequencies for Y-STR haplotypes (also unobserved ones) is required to do this. First, methods for estimating frequencies for Y-STR haplotypes will be discussed, and afterwards calculation of evidence will be introduced.
7 Neither principal component analysis nor factor analysis yields good results, so that path has not been followed any further.
8 Simple count estimates: not precise enough. One published method (used at an online database), introduced in "A new method for the evaluation of matches in non-recombining genomes: application to Y-chromosomal short tandem repeat (STR) haplotypes in European males" from 2000 by L. Roewer et al. Several problems exist; some will be presented in this presentation (some also presented in a talk at the 7th International Y Chromosome User Workshop in Berlin, Germany, April 22-24, 2010).
9 Graphical models would be an obvious choice: structure-based learning (e.g. the PC algorithm) or score-based learning (e.g. AIC and BIC). Standard tests for conditional independence (e.g. G² = 2n·CE(A, B | {S_i}_{i∈I}), where n is the sample size and CE is the cross entropy, which is χ²_ν distributed when A and B are independent given {S_i}_{i∈I}) neither exploit the ordering in the data nor incorporate prior knowledge such as the single-step mutation model. Better independence tests are required (e.g. classification trees, ordered logistic regression, and support vector machines), and model-based clustering.
11 The idea: Bayesian approach. Notation: N: the number of observations; M: the number of haplotypes (i.e. unique observations); N_i: the number of times the i-th haplotype has been observed; f_i = N_i/N (i = 1, 2, ..., M): the frequency of the i-th haplotype. Model using Bayesian inference: 1. Prior: assume f_i is Beta distributed with non-stochastic parameters u_i and v_i. 2. Likelihood: given f_i, N_i is Binomial distributed. 3. Posterior: given N_i, f_i is (still) Beta distributed (the Beta distribution is a conjugate prior for the Binomial distribution). Model expressed in densities using generic notation: p(f_i | N_i) ∝ p(N_i | f_i) p(f_i).
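The conjugate update on this slide can be sketched in a few lines of Python (the function name and the toy numbers below are illustrative; estimating u_i and v_i themselves is the subject of the next slide):

```python
def posterior_beta(u_i, v_i, N_i, N):
    """Conjugate Beta-Binomial update for one haplotype.

    Prior:      f_i ~ Beta(u_i, v_i)
    Likelihood: N_i | f_i ~ Binomial(N, f_i)
    Posterior:  f_i | N_i ~ Beta(u_i + N_i, v_i + N - N_i)
    Returns the posterior parameters and the posterior mean,
    which is used as the frequency estimate.
    """
    a = u_i + N_i
    b = v_i + (N - N_i)
    return a, b, a / (a + b)
```

For example, a flat Beta(1, 1) prior with 3 observations of the haplotype out of N = 10 gives the posterior Beta(4, 8) with mean 1/3.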
12 Estimating u_i and v_i: 1. Calculate W_i = (1/(N − N_i)) Σ_{j≠i} N_j d_ij for i = 1, 2, ..., M, where d_ij denotes the Manhattan distance/L1 norm between haplotypes i and j. 2. Order the W_i's by size, divide them into 15 (?) groups, and calculate the mean and variance of the f_i's in each group. 3. Fit the regressions μ(W) = β₁ + exp(β₂W + β₃) and σ²(W) = β₄ + exp(β₅W + β₆) based on the 15 group estimates. 4. Calculate μ_i = μ(W_i) and σ²_i = σ²(W_i) and use these to calculate the prior parameters u_i and v_i. 5. Apply the Bayesian approach to obtain the posterior distribution, e.g. estimate f_i by the posterior mean. Only dane could fit the full regression, and the fit is not too comforting; the others resulted in μ_i < 0 or σ²_i < 0 for some i's.
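Step 1 above, the count-weighted mean distance W_i from haplotype i to the rest of the database, can be sketched as follows (illustrative function name; the Manhattan distance is averaged over the r loci, matching the restatement of W_i on slide 26):

```python
def mean_distance_W(haplotypes, counts):
    """W_i = (1 / (N - N_i)) * sum_{j != i} N_j * d_ij, where d_ij is
    the Manhattan (L1) distance between haplotypes i and j, averaged
    over the r loci.  `haplotypes` lists the M unique haplotypes as
    tuples of repeat numbers, `counts` their observation counts N_i."""
    N = sum(counts)
    r = len(haplotypes[0])
    W = []
    for i, hi in enumerate(haplotypes):
        s = 0.0
        for j, hj in enumerate(haplotypes):
            if j == i:
                continue
            d_ij = sum(abs(a - b) for a, b in zip(hi, hj)) / r
            s += counts[j] * d_ij
        W.append(s / (N - counts[i]))
    return W
```

The remaining steps (grouping by W, fitting the two exponential regressions, and converting μ_i, σ²_i to u_i, v_i) would be layered on top of this.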
13 Plot of the fitted regression μ(W) = β₁ + exp(β₂W + β₃) for dane, divided into 15 groups.
14 Plot of the fitted regression σ²(W) = 0 + exp(β₅W + β₆) for dane, divided into 15 groups.
15 Modified regression. Set β₁ = β₄ = 0 in the regression, yielding μ(W) = exp(β₂W + β₃) and σ²(W) = exp(β₅W + β₆). Now berlin makes the best fits, which seems quite reasonable for μ(W) but more doubtful for σ²(W); dane fits almost as before.
16 Plot of the fitted regression μ(W) = 0 + exp(β₂W + β₃) for berlin, divided into 16 groups.
17 Plot of the fitted regression σ²(W) = 0 + exp(β₅W + β₆) for berlin, divided into 16 groups.
18 Plot of the fitted regression μ(W) = 0 + exp(β₂W − 9.286) for somali, divided into 17 groups.
19 Plot of the fitted regression σ²(W) = 0 + exp(10.29W − 9.328) for somali, divided into 17 groups.
20 Changes made to the published method. At the 7th International Y Chromosome User Workshop in Berlin, 2010, Sascha Willuweit (one of the persons behind the online database) mentioned a couple of differences between their implementation and the original article: using the reduced regression, i.e. without the intercepts β₁ and β₄; the number of groups is determined by fitting several regressions and choosing the best one (details for selecting the minimum number of groups were not mentioned).
21 Comments and proposals for changes. Not a statistical model, more an ad-hoc method. μ_i = μ(W_i) = exp(aW_i + b) is not bounded above, so μ(W) ≥ 1 for W ≥ −b/a: for berlin the fitted a and b give μ_i ≥ 1 for some W_i, although 0 ≤ W_i ≤ 1 and 0 < μ_i < 1 by definition. Fitting u_i and v_i: only W_i is available to fit the two exponential regressions. The model is not consistent: generalisation to a Dirichlet prior and a multinomial likelihood might solve this.
22 Fitting two parameters (u_i and v_i) using only one (W_i): plot of fitted u vs. fitted v for berlin based on 16 groups, using the regression without β₁ and β₄.
23 Fitting two parameters (u_i and v_i) using only one (W_i): plot of fitted u vs. fitted v for dane based on 15 groups, using the regression with β₁ and β₄.
24 Fitting two parameters (u_i and v_i) using only one (W_i): plot of fitted u vs. fitted v for dane based on 15 groups, using the regression without β₁ and β₄.
25 Fitting two parameters (u_i and v_i) using only one (W_i): plot of fitted u vs. fitted v for somali based on 17 groups, using the regression without β₁ and β₄.
26 Idea: use the second moment. W_i = (1/(N − N_i)) Σ_{j≠i} N_j d_ij = (1/(N − N_i)) Σ_{j≠i} N_j (1/r) Σ_{k=1}^r d_ijk uses the first moment of the allele differences on each of the r loci. Maybe also use the second moment by introducing Z_i = (1/(N − N_i)) Σ_{j≠i} N_j (1/r) Σ_{k=1}^r (d_ijk − d̄_ij)². Fit the μ_i's and σ²_i's by multiple regression using a grid of W_i and Z_i values. 15 groups only correspond to a 4×4 grid, which is way too coarse; requiring 15 groups of both W_i's and Z_i's, the grid would have size 15·15 = 225, which requires a lot of observations. Too few observations in berlin, dane, and somali, but it would be interesting to see how it would perform compared to just using the W_i's.
27 Generalised frequency model. Assume a priori that f ~ Dirichlet(α), where f = (f₁, f₂, ..., f_K) is the vector of frequencies for all possible haplotypes. Use the likelihood N | f ~ Multinomial(N₊, f). The posterior becomes f | N ~ Dirichlet(α₁ + N₁, ..., α_K + N_K) = Dirichlet(α₁ + N₁, ..., α_n + N_n, α_{n+1}, ..., α_K), where f₁, f₂, ..., f_n are the frequencies for the observed haplotypes and f_{n+1}, f_{n+2}, ..., f_K are for the unobserved haplotypes.
28 Marginal distribution. Let α₊ = Σ_{i=1}^K α_i. The marginal posterior distribution for the i-th haplotype is f_i | N_i ~ Beta(α_i + N_i, Σ_{j=1}^K (α_j + N_j) − (α_i + N_i)) = Beta(α_i + N_i, α₊ − α_i + N₊ − N_i), so E[f_i | N_i] = (α_i + N_i) / [(Σ_{j=1}^K (α_j + N_j) − (α_i + N_i)) + (α_i + N_i)] = (α_i + N_i)/(α₊ + N₊), and Σ_{i=1}^K E[f_i | N_i] = (α₊ + N₊)⁻¹ Σ_{i=1}^K (α_i + N_i) = 1. Prior knowledge can be incorporated by specifying the prior parameter α_i for all possible haplotypes, but with this approach α₊ might be problematic to calculate for large K.
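The posterior mean from this slide is a one-liner; the sketch below (illustrative function name) shows that the resulting frequency estimates automatically sum to one, which is the consistency the ad-hoc method lacked:

```python
def dirichlet_posterior_mean(alpha, counts):
    """Posterior mean frequencies under a Dirichlet(alpha) prior and a
    multinomial likelihood: E[f_i | N] = (alpha_i + N_i) / (alpha_+ + N_+).
    Unobserved haplotypes simply have N_i = 0."""
    a_plus = sum(alpha)
    n_plus = sum(counts)
    return [(a + n) / (a_plus + n_plus) for a, n in zip(alpha, counts)]
```

With a uniform prior α = (1, 1, 1) and counts (2, 3, 0), the third (unobserved) haplotype still receives mass 1/8, and the three estimates sum to 1.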
29 Approximating α₊: histogram of the W_i's for berlin, with fitted Gamma and Beta densities and mean(W) marked.
30 Approximating α₊: histogram of the W_i's for dane, with fitted Gamma and Beta densities and mean(W) marked.
31 Approximating α₊: histogram of the W_i's for somali, with fitted Gamma (shape s = 3.112) and Beta (s₁ = 2.44) densities and mean(W) marked.
32 Approximating α₊. As earlier stated, 0 ≤ W_i ≤ 1, so the Beta distribution is the right choice theoretically. Assume that α_i = h(W_i) for some function h, such that α₊ = Σ_{i=1}^K h(W_i). Denote by f_β the density of a fitted Beta distribution; then α₊ ≈ K ∫₀¹ f_β(W) h(W) dW.
33 Approximating α₊. To get equality between the first prior parameter in the published method and the generalised model, let α_i = u_i = μ²_i(1 − μ_i)/σ²_i. Because μ_i = μ(W_i) = exp(aW_i + b) can result in μ_i ≥ 1, then 1 − μ_i ≤ 0, such that α_i ≤ 0, which is not allowed. For berlin, μ_i ≥ 1 for W_i ≥ −b/a, so that all contributions to the integral in the α₊ approximation are negative for such W_i; the resulting α₊ estimates (and estimated uncovered probability mass) for berlin, dane, and somali show that the choice α_i = u_i with the exponential regression is unusable, but the distribution of the W_i's might be helpful for other choices of h.
34 Problem. The published method is maybe getting too much attention because it is the only published method for estimating haplotype frequencies. At the 7th International Y Chromosome User Workshop in Berlin, 2010, Michael Krawczak (one of the authors of the original articles) gave a talk where the associated slides included the statement that the frequency estimation has "never [been] thoroughly studied and validated".
36 The idea: approximate once a common (partial) ancestor is identified. The basic idea is to find I = {i₁, i₂, ..., i_q} ⊆ {1, 2, ..., r} such that P(⋂_{j∉I} L_j = a_j | ⋂_{i∈I} L_i = a_i) ≈ Π_{j∉I} P(L_j = a_j | ⋂_{i∈I} L_i = a_i) is a good approximation.
37 Example. Assume that we have loci L₁, L₂, L₃, L₄ and I = {1, 2}. Then P(L₁, L₂, L₃, L₄) = P(L₁) P(L₂ | L₁) P(L₃, L₄ | L₁, L₂) (1) ≈ P(L₁) P(L₂ | L₁) Π_{j=3}^4 P(L_j | L₁, L₂) (2).
38 How to choose I. I is called an ancestral set, because it can be interpreted as a set of alleles that is common with one's ancestors. I can be found using a greedy approach, adding to I the j that maximises P(L_j = a_j | ⋂_{i∈I} L_i = a_i). Stop adding elements to I, e.g. when only a small percentage of the observations is left to use for calculating the marginal probabilities conditional on I.
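The greedy construction above can be sketched against an empirical database (function name, tuple encoding, and the 10% default stopping fraction are illustrative assumptions; the slide leaves the exact stopping rule open):

```python
def greedy_ancestral_set(db, target, min_frac=0.10):
    """Greedily build the ancestral set I for `target`: repeatedly add
    the locus j that maximises the empirical conditional probability
    P(L_j = target_j | L_i = target_i for i in I), and stop when fewer
    than min_frac of the database would still match on I."""
    r = len(target)
    I = []
    matching = list(db)  # haplotypes agreeing with target on all loci in I
    while len(I) < r:
        best_j, best_p = None, -1.0
        for j in range(r):
            if j in I:
                continue
            p = sum(1 for h in matching if h[j] == target[j]) / len(matching)
            if p > best_p:
                best_j, best_p = j, p
        kept = [h for h in matching if h[best_j] == target[best_j]]
        if len(kept) < min_frac * len(db):
            break  # too few observations left to estimate conditionals
        I.append(best_j)
        matching = kept
    return I, matching
```

The haplotypes remaining in `matching` are then the basis for the conditional marginals P(L_j = a_j | ⋂_{i∈I} L_i = a_i).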
39 Drawbacks. The approach is simple, but like the published method it is not a statistical model.
41 The idea: alternately perceive different loci as a response variable. Let L₁, L₂, ..., L_r be the r different loci available in the haplotype. Then fit L_{i₁} ~ {L_k : k ∉ {i₁}} (3), L_{i₂} ~ {L_k : k ∉ {i₁, i₂}} (4), ... (5), L_{i_{r−2}} ~ {L_k : k ∉ {i₁, i₂, ..., i_{r−2}}} (6), L_{i_{r−1}} ~ {L_k : k ∉ {i₁, i₂, ..., i_{r−1}}} = L_{i_r} (7), and use the empirical distribution for L_{i_r}.
42 Properties. A class of statistical models. The classifications can be done with some of the several available classification methods, such as classification trees, ordered logistic regression, or support vector machines. Selection of the i_j's should be done using standard model selection criteria depending on the classification model used. Does not incorporate prior knowledge.
43 Kernel methods and model-based clustering.
44 The idea: create a density around each haplotype. Put a scaled density/mass (called a kernel) around each haplotype, with mass equal to its relative frequency N_i/N₊. In this way unobserved haplotypes get probability mass from their (near) neighbours.
45 Choice of kernel. A straightforward approach is the Gaussian kernel K(z | x_i, λ) = (2πλ²)^{−r/2} det(Σ)^{−1/2} exp(−(1/(2λ²)) (x_i − z)ᵀ Σ⁻¹ (x_i − z)), where λ is a smoothing parameter that has to be chosen. A frequency estimate for any given haplotype z is g(z) = (1/N₊) Σ_{i=1}^n N_i K(z | x_i, λ). To incorporate prior knowledge, K(z | x_i, N_i, λ) = (2πλ²/N_i)^{−r/2} det(Σ)^{−1/2} exp(−(N_i/(2λ²)) (x_i − z)ᵀ Σ⁻¹ (x_i − z)) could be used.
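The estimate g(z) can be sketched as below, under the simplifying assumption Σ = I (identity covariance), so det(Σ) = 1 and the quadratic form reduces to a squared Euclidean distance; the function name is illustrative:

```python
import math

def gaussian_kernel_estimate(z, haplotypes, counts, lam):
    """g(z) = (1/N_+) * sum_i N_i * K(z | x_i, lambda) with a Gaussian
    kernel and identity covariance Sigma = I.  `z` and the entries of
    `haplotypes` are tuples of repeat numbers over the r loci."""
    r = len(z)
    n_plus = sum(counts)
    norm = (2 * math.pi * lam ** 2) ** (-r / 2)  # normalising constant
    g = 0.0
    for x, n in zip(haplotypes, counts):
        sq = sum((a - b) ** 2 for a, b in zip(x, z))
        g += n * norm * math.exp(-sq / (2 * lam ** 2))
    return g / n_plus
```

Evaluated on a haplotype in the database, g is dominated by that haplotype's own kernel; haplotypes one repeat step away still receive (smaller) mass, which is exactly the smoothing toward unobserved neighbours described on the previous slide.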
46 Problem. The model can be inaccurate if the kernel has small variance, because then the actual mass, when evaluated over the discrete grid, can differ greatly from the relative frequency. Discrete kernels could be tried instead, e.g. the multinomial.
47 Model-based clustering. Estimating a frequency for a haplotype using kernels requires evaluating as many densities as there are haplotypes in the database. Model-based clustering can be used to perform clustering first, to minimise the required number of density evaluations. Same problem as with kernels if the variances are too small.
48 Model control: unobserved probability mass; marginal deviances.
49 Model verification is, as always, crucial. One important feature of a model is being able to efficiently obtain further samples of haplotypes according to their probability under the model.
50 Different ways of comparing models: estimated unobserved probability mass; marginal deviances (a model's single and pairwise marginals compared to the observed ones). Several more should definitely be considered.
51 Unobserved probability mass: point estimate. K₀: the number of singletons; N: the number of observations. In 1968, Robbins showed that V = K₀/(N + 1) is an unbiased estimate of the unobserved probability mass.
52 Unobserved probability mass: limiting consistent variance estimate. K₁: the number of doubletons. In 1986, Bickel and Yahav showed that, under some regularity conditions, σ̂² = K₀/N² − (K₀ − 2K₁)²/N³ is a limiting consistent estimate of the variance of the unobserved probability mass. Both V and the variance estimate can be verified by simulation.
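Both quantities from these two slides are simple functions of the count vector; a sketch (illustrative function name, and assuming the variance expression reads σ̂² = K₀/N² − (K₀ − 2K₁)²/N³ as reconstructed above):

```python
def unobserved_mass(counts):
    """Robbins (1968) point estimate V = K0 / (N + 1) of the unobserved
    probability mass, and the Bickel-Yahav (1986) limiting consistent
    variance estimate sigma^2 = K0/N^2 - (K0 - 2*K1)^2 / N^3, where K0
    and K1 are the numbers of singletons and doubletons."""
    N = sum(counts)
    K0 = sum(1 for c in counts if c == 1)
    K1 = sum(1 for c in counts if c == 2)
    V = K0 / (N + 1)
    var = K0 / N ** 2 - (K0 - 2 * K1) ** 2 / N ** 3
    return V, var
```

For counts (1, 1, 2, 3) this gives N = 7, K₀ = 2, K₁ = 1, hence V = 2/8 = 0.25 and σ̂² = 2/49.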
53 Unobserved probability mass: simulation study. Plot of confidence interval coverage for the true unobserved mass against N, comparing the symmetric standard confidence interval with the logit-transformed confidence interval, based on simulations; reference line at the nominal level.
54 The estimate seems like a good and simple way of performing model verification, but it cannot stand alone, as we shall soon see. It can also be used to fit model parameters, e.g. the λ parameter in the kernel model.
55 Unobserved probability mass: approximate confidence intervals for berlin, dane, and somali. 95% confidence intervals for the unobserved mass: berlin [0.321; 0.408], dane [0.510; 0.694], somali [0.212; 0.342]. The table also compares the estimated unobserved mass under each model: S with β₁/β₄ ≠ 0, S with β₁/β₄ = 0, rpart, svm, polr, and the ancestral-set approach with 10%, 15%, and 20% stopping rules.
56 Marginal deviances. Depending on the model, exact marginals can be difficult to obtain. If haplotypes can be sampled according to their probability under a model, then the marginals can be approximated by simulating a huge number of haplotypes under that model. Using only the observed marginals should correspond to this, but at least for small databases this is not the case, according to simulation studies performed with the classification models.
57 Deviance for pairwise marginals. Let {u}_ij be the two-way table of observation counts, {p̃}_ij the table of predicted probabilities under a model M₀, and {p̂}_ij the relative frequencies, such that p̂_ij = u_ij/u₊₊. For the pairwise marginal tables, the deviance is d = −2 log(L({p̃}_ij)/L({p̂}_ij)), where L({p}_ij) = Π_{i,j} p_ij^{u_ij} is proportional to the likelihood (the multinomial constant u₊₊!/Π_{i,j} u_ij! cancels in the fraction). Then d = 2 Σ_{i,j} u_ij log(p̂_ij/p̃_ij) is χ²_ν distributed.
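The deviance for one pair of loci is a direct translation of the formula above (illustrative function name; cells with u_ij = 0 contribute nothing, using the convention 0·log 0 = 0):

```python
import math

def pairwise_deviance(u, p_model):
    """d = 2 * sum_ij u_ij * log(phat_ij / ptilde_ij), where
    phat_ij = u_ij / u_++ are the observed relative frequencies and
    ptilde_ij are the model's predicted probabilities for the
    two-way marginal table of a pair of loci."""
    u_pp = sum(sum(row) for row in u)
    d = 0.0
    for row_u, row_p in zip(u, p_model):
        for u_ij, p_ij in zip(row_u, row_p):
            if u_ij > 0:
                d += u_ij * math.log((u_ij / u_pp) / p_ij)
    return 2 * d
```

If the model exactly reproduces the observed table the deviance is 0, and it grows as the predicted and observed marginals diverge; summing it over all locus pairs gives the comparison figures on the following slides.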
58 A deviance is calculated for each pair of loci. To compare models, these deviances can be summed and used for relative comparisons (the sum is not χ² distributed). The deviance is calculated similarly for single marginals.
59 Deviance sums for single marginals: sums of deviances for observed vs. simulated single marginals on berlin, dane, and somali, for each classification method (rpart, svm, polr).
60 Deviance sums for pairwise marginals: sums of deviances for observed vs. simulated pairwise marginals on berlin, dane, and somali, for each classification method (rpart, svm, polr); several of the sums are infinite.
61 Calculating evidence.
62 Evidence: motivation of usage. The purpose is to get an unbiased opinion in a trial, often formulated as the hypotheses H_p: the suspect left the crime stain together with n additional contributors; H_d: some other person left the crime stain together with n additional contributors. Then the likelihood ratio LR = P(E | H_p)/P(E | H_d) is calculated. In a courtroom it can then be stated that the evidence E is LR times more likely to have arisen under H_p than under H_d (formulation from "Interpreting DNA Mixtures" by Weir et al., 1997).
63 Computational challenge. The actual evaluation of the LR can be a computationally heavy task: the number of combinations giving rise to the same trace grows with the number of contributors.
64 Two contributors. Assume H_p: the suspect left the crime stain together with one additional contributor; H_d: some other person left the crime stain together with one additional contributor.
65 Two contributors: notation. Let a = (a₁, a₂, ..., a_n) and b = (b₁, b₂, ..., b_n) and define T = a ∪ b = ({a₁, b₁}, {a₂, b₂}, ..., {a_n, b_n}) (8) and T ∖ a = ({b₁}, {b₂}, ..., {b_n}) (9), where the sets are multisets.
66 Two contributors. Let T be the trace, h_s the suspect's haplotype, and h₁ the additional contributor's haplotype. Then T = h_s ∪ h₁ and h₁ = T ∖ h_s. Say that (h₁, h₂) is consistent with the trace T if h₁ ∪ h₂ = T, which is denoted (h₁, h₂) ~ T. This makes LR = P(E | H_p)/P(E | H_d) (10) = P(h_s, T ∖ h_s) / Σ_{(h₁,h₂)~T} P(h_s, h₁, h₂) (11) = P(h_s) P(T ∖ h_s) / [P(h_s) Σ_{(h₁,h₂)~T} P(h₁) P(h₂)] (12) = P(T ∖ h_s) / Σ_{(h₁,h₂)~T} P(h₁) P(h₂) (13), by assuming that haplotypes are independent.
67 Two contributors: number of terms in the denominator. Let T = (T₁, T₂, ..., T_r), where each T_i is a set of alleles with |T_i| ∈ {1, 2}. Let H_T = T₁ × T₂ × ... × T_r. In the non-trivial case there exists a j ∈ {1, 2, ..., r} such that T_j = {a₁, a₂} with a₁ ≠ a₂. Let T'_j = {a₁} (such that one of the alleles is removed) and H'_T = T₁ × ... × T_{j−1} × T'_j × T_{j+1} × ... × T_r.
68 Two contributors: number of terms in the denominator. Now the denominator of the LR can be written as 2 Σ_{h₁ ∈ H'_T} P(h₁) P(T ∖ h₁). If k denotes the number of loci in the trace with only one allele, and we assume the non-trivial case with 0 ≤ k < r, we have |H_T| = Π_{i=1}^r |T_i| = 2^{r−k}, such that |H'_T| = |H_T|/2 = 2^{r−k−1} ≤ 2^{r−1}. This means that for r loci, a maximum of 2·2^{r−1} = 2^r haplotype probabilities have to be calculated, e.g. 2¹⁰ = 1024 for r = 10. If a trace has two contributors and no known suspects, the two most likely contributors can be chosen as argmax_{h₁ ∈ H'_T} P(h₁) P(T ∖ h₁).
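The two-contributor LR of equation (13) can be sketched by brute-force enumeration over H_T, which enumerates exactly the ordered pairs (h₁, h₂) ~ T (function names and the toy frequencies below are illustrative; `prob` stands in for whichever haplotype frequency model is used, and the suspect is assumed consistent with the trace):

```python
from itertools import product

def mate_allele(t, a):
    # At a two-allele locus the second contributor carries the other
    # allele; at a single-allele locus both carry the same allele.
    return (t - {a}).pop() if len(t) == 2 else a

def two_contributor_lr(trace, h_s, prob):
    """LR = P(T \\ h_s) / sum_{(h1, h2) ~ T} P(h1) * P(h2), where the
    sum runs over ordered pairs with h1 union h2 = T.  `trace` is a
    tuple of per-locus allele sets, `prob` maps a haplotype tuple to
    its estimated frequency."""
    h1 = tuple(mate_allele(t, a) for t, a in zip(trace, h_s))
    numerator = prob(h1)  # P(T \ h_s)
    denom = 0.0
    for cand in product(*[sorted(t) for t in trace]):   # cand runs over H_T
        mate = tuple(mate_allele(t, a) for t, a in zip(trace, cand))
        denom += prob(cand) * prob(mate)
    return numerator / denom
```

Enumerating all of H_T and pairing each candidate with its forced mate visits every ordered consistent pair once, which matches writing the denominator as 2 Σ over H'_T.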
69 n contributors: formulation of the LR for one locus. The LR is defined generally in "Forensic interpretation of Y-chromosomal DNA mixtures" by Wolf et al., 2005. At a given locus, let E_t, E_s, and E_k be the sets of alleles from the trace, the suspect, and the known contributors, respectively. Assume n unknown contributors and let A_n denote the set of alleles carried by these n unknown contributors. Let P_n(V; W) = P(W ⊆ A_n ⊆ V). Then LR = P_n(E_t; E_t \ (E_s ∪ E_k)) / P_{n+1}(E_t; E_t \ E_k) (14) = P(E_t \ (E_s ∪ E_k) ⊆ A_n ⊆ E_t) / P(E_t \ E_k ⊆ A_{n+1} ⊆ E_t) (15).
70 n contributors: formulation of the LR for m loci. For m loci we have LR = P_n(⋂_{i=1}^m {E_{t,i}; E_{t,i} \ (E_{s,i} ∪ E_{k,i})}) / P_{n+1}(⋂_{i=1}^m {E_{t,i}; E_{t,i} \ E_{k,i}}).
71 n contributors: example from thesis, page 18. Let E_{t,1} = {1, 2}, E_{t,2} = {2}, E_{s,1} = {1}, E_{s,2} = {2}, E_{k,1} = E_{k,2} = ∅. Then LR = P₁(⋂_{i=1}^2 {E_{t,i}; E_{t,i} \ (E_{s,i} ∪ E_{k,i})}) / P₂(⋂_{i=1}^2 {E_{t,i}; E_{t,i} \ E_{k,i}}) = P(⋂_{i=1}^2 {E_{t,i} \ (E_{s,i} ∪ E_{k,i}) ⊆ A₁^i ⊆ E_{t,i}}) / P(⋂_{i=1}^2 {E_{t,i} \ E_{k,i} ⊆ A₂^i ⊆ E_{t,i}}) = P({E_{t,1} \ E_{s,1} ⊆ A₁¹ ⊆ E_{t,1}} ∩ {E_{t,2} \ E_{s,2} ⊆ A₁² ⊆ E_{t,2}}) / P({E_{t,1} \ E_{k,1} ⊆ A₂¹ ⊆ E_{t,1}} ∩ {E_{t,2} \ E_{k,2} ⊆ A₂² ⊆ E_{t,2}}) = P({{2}¹ ⊆ A₁¹ ⊆ {1, 2}¹} ∩ {A₁² ⊆ {2}²}) / P({{1, 2}¹ ⊆ A₂¹ ⊆ {1, 2}¹} ∩ {{2}² ⊆ A₂² ⊆ {2}²}) = P({2}¹ ∩ {2}²) / P({1, 2}¹ ∩ {2, 2}²) = P(h₁ = (2, 2)) / [P(h₁ = (1, 2), h₂ = (2, 2)) + P(h₁ = (2, 2), h₂ = (1, 2))].
72 n contributors: challenges. It is complicated. One key ingredient in calculating the LR is being able to estimate frequencies for unobserved haplotypes. The next step is being able to calculate the LR efficiently, even for a large number of contributors.
73 Further work.
74 Further work. Y-STR haplotype frequencies: better incorporation of prior knowledge in a statistical model, e.g. graphical models with other test statistics (ordinal data, and prior knowledge such as the single-step mutation model); more and better ways to verify models; larger datasets (the online database has gathered a lot of data, both publicly available in journals and directly from laboratories, but none of it is available to others yet). Y-STR mixtures: efficient calculation of the LR; use the quantitative information (the amount of DNA material, which can be seen in the EPG) instead of only the qualitative; model the signal in the EPG (electropherogram).
75 Questions?
More informationCheng Soon Ong & Christian Walder. Canberra February June 2018
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 (Many figures from C. M. Bishop, "Pattern Recognition and ") 1of 143 Part IV
More informationClick Prediction and Preference Ranking of RSS Feeds
Click Prediction and Preference Ranking of RSS Feeds 1 Introduction December 11, 2009 Steven Wu RSS (Really Simple Syndication) is a family of data formats used to publish frequently updated works. RSS
More informationBayesian Learning (II)
Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen Bayesian Learning (II) Niels Landwehr Overview Probabilities, expected values, variance Basic concepts of Bayesian learning MAP
More informationMachine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function.
Bayesian learning: Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function. Let y be the true label and y be the predicted
More informationCheng Soon Ong & Christian Walder. Canberra February June 2018
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 Outlines Overview Introduction Linear Algebra Probability Linear Regression
More informationContents. Part I: Fundamentals of Bayesian Inference 1
Contents Preface xiii Part I: Fundamentals of Bayesian Inference 1 1 Probability and inference 3 1.1 The three steps of Bayesian data analysis 3 1.2 General notation for statistical inference 4 1.3 Bayesian
More informationLecture : Probabilistic Machine Learning
Lecture : Probabilistic Machine Learning Riashat Islam Reasoning and Learning Lab McGill University September 11, 2018 ML : Many Methods with Many Links Modelling Views of Machine Learning Machine Learning
More informationBayesian Approach 2. CSC412 Probabilistic Learning & Reasoning
CSC412 Probabilistic Learning & Reasoning Lecture 12: Bayesian Parameter Estimation February 27, 2006 Sam Roweis Bayesian Approach 2 The Bayesian programme (after Rev. Thomas Bayes) treats all unnown quantities
More informationLearning from Data: Regression
November 3, 2005 http://www.anc.ed.ac.uk/ amos/lfd/ Classification or Regression? Classification: want to learn a discrete target variable. Regression: want to learn a continuous target variable. Linear
More informationProbabilistic Machine Learning. Industrial AI Lab.
Probabilistic Machine Learning Industrial AI Lab. Probabilistic Linear Regression Outline Probabilistic Classification Probabilistic Clustering Probabilistic Dimension Reduction 2 Probabilistic Linear
More informationClass Notes: Week 8. Probit versus Logit Link Functions and Count Data
Ronald Heck Class Notes: Week 8 1 Class Notes: Week 8 Probit versus Logit Link Functions and Count Data This week we ll take up a couple of issues. The first is working with a probit link function. While
More informationBayes methods for categorical data. April 25, 2017
Bayes methods for categorical data April 25, 2017 Motivation for joint probability models Increasing interest in high-dimensional data in broad applications Focus may be on prediction, variable selection,
More informationMachine Learning: Chenhao Tan University of Colorado Boulder LECTURE 9
Machine Learning: Chenhao Tan University of Colorado Boulder LECTURE 9 Slides adapted from Jordan Boyd-Graber Machine Learning: Chenhao Tan Boulder 1 of 39 Recap Supervised learning Previously: KNN, naïve
More informationClustering bi-partite networks using collapsed latent block models
Clustering bi-partite networks using collapsed latent block models Jason Wyse, Nial Friel & Pierre Latouche Insight at UCD Laboratoire SAMM, Université Paris 1 Mail: jason.wyse@ucd.ie Insight Latent Space
More informationIntroduction to Machine Learning. Maximum Likelihood and Bayesian Inference. Lecturers: Eran Halperin, Lior Wolf
1 Introduction to Machine Learning Maximum Likelihood and Bayesian Inference Lecturers: Eran Halperin, Lior Wolf 2014-15 We know that X ~ B(n,p), but we do not know p. We get a random sample from X, a
More informationMachine Learning. Probability Basics. Marc Toussaint University of Stuttgart Summer 2014
Machine Learning Probability Basics Basic definitions: Random variables, joint, conditional, marginal distribution, Bayes theorem & examples; Probability distributions: Binomial, Beta, Multinomial, Dirichlet,
More informationLearning ancestral genetic processes using nonparametric Bayesian models
Learning ancestral genetic processes using nonparametric Bayesian models Kyung-Ah Sohn October 31, 2011 Committee Members: Eric P. Xing, Chair Zoubin Ghahramani Russell Schwartz Kathryn Roeder Matthew
More informationPMR Learning as Inference
Outline PMR Learning as Inference Probabilistic Modelling and Reasoning Amos Storkey Modelling 2 The Exponential Family 3 Bayesian Sets School of Informatics, University of Edinburgh Amos Storkey PMR Learning
More informationCOS513 LECTURE 8 STATISTICAL CONCEPTS
COS513 LECTURE 8 STATISTICAL CONCEPTS NIKOLAI SLAVOV AND ANKUR PARIKH 1. MAKING MEANINGFUL STATEMENTS FROM JOINT PROBABILITY DISTRIBUTIONS. A graphical model (GM) represents a family of probability distributions
More informationECE521 Tutorial 11. Topic Review. ECE521 Winter Credits to Alireza Makhzani, Alex Schwing, Rich Zemel and TAs for slides. ECE521 Tutorial 11 / 4
ECE52 Tutorial Topic Review ECE52 Winter 206 Credits to Alireza Makhzani, Alex Schwing, Rich Zemel and TAs for slides ECE52 Tutorial ECE52 Winter 206 Credits to Alireza / 4 Outline K-means, PCA 2 Bayesian
More informationCS839: Probabilistic Graphical Models. Lecture 7: Learning Fully Observed BNs. Theo Rekatsinas
CS839: Probabilistic Graphical Models Lecture 7: Learning Fully Observed BNs Theo Rekatsinas 1 Exponential family: a basic building block For a numeric random variable X p(x ) =h(x)exp T T (x) A( ) = 1
More informationProbabilistic Time Series Classification
Probabilistic Time Series Classification Y. Cem Sübakan Boğaziçi University 25.06.2013 Y. Cem Sübakan (Boğaziçi University) M.Sc. Thesis Defense 25.06.2013 1 / 54 Problem Statement The goal is to assign
More informationCS 361: Probability & Statistics
October 17, 2017 CS 361: Probability & Statistics Inference Maximum likelihood: drawbacks A couple of things might trip up max likelihood estimation: 1) Finding the maximum of some functions can be quite
More informationSpatial Normalized Gamma Process
Spatial Normalized Gamma Process Vinayak Rao Yee Whye Teh Presented at NIPS 2009 Discussion and Slides by Eric Wang June 23, 2010 Outline Introduction Motivation The Gamma Process Spatial Normalized Gamma
More informationFREQUENTIST BEHAVIOR OF FORMAL BAYESIAN INFERENCE
FREQUENTIST BEHAVIOR OF FORMAL BAYESIAN INFERENCE Donald A. Pierce Oregon State Univ (Emeritus), RERF Hiroshima (Retired), Oregon Health Sciences Univ (Adjunct) Ruggero Bellio Univ of Udine For Perugia
More informationBayesian inference. Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark. April 10, 2017
Bayesian inference Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark April 10, 2017 1 / 22 Outline for today A genetic example Bayes theorem Examples Priors Posterior summaries
More informationGenerative Models for Discrete Data
Generative Models for Discrete Data ddebarr@uw.edu 2016-04-21 Agenda Bayesian Concept Learning Beta-Binomial Model Dirichlet-Multinomial Model Naïve Bayes Classifiers Bayesian Concept Learning Numbers
More informationMultiple Sequence Alignment using Profile HMM
Multiple Sequence Alignment using Profile HMM. based on Chapter 5 and Section 6.5 from Biological Sequence Analysis by R. Durbin et al., 1998 Acknowledgements: M.Sc. students Beatrice Miron, Oana Răţoi,
More informationMotivation Scale Mixutres of Normals Finite Gaussian Mixtures Skew-Normal Models. Mixture Models. Econ 690. Purdue University
Econ 690 Purdue University In virtually all of the previous lectures, our models have made use of normality assumptions. From a computational point of view, the reason for this assumption is clear: combined
More informationA Bayesian model for event-based trust
A Bayesian model for event-based trust Elements of a foundation for computational trust Vladimiro Sassone ECS, University of Southampton joint work K. Krukow and M. Nielsen Oxford, 9 March 2007 V. Sassone
More informationPart III: Unstructured Data. Lecture timetable. Analysis of data. Data Retrieval: III.1 Unstructured data and data retrieval
Inf1-DA 2010 20 III: 28 / 89 Part III Unstructured Data Data Retrieval: III.1 Unstructured data and data retrieval Statistical Analysis of Data: III.2 Data scales and summary statistics III.3 Hypothesis
More informationIntroduction to Machine Learning
1, DATA11002 Introduction to Machine Learning Lecturer: Antti Ukkonen TAs: Saska Dönges and Janne Leppä-aho Department of Computer Science University of Helsinki (based in part on material by Patrik Hoyer,
More informationBayesian Methods: Naïve Bayes
Bayesian Methods: aïve Bayes icholas Ruozzi University of Texas at Dallas based on the slides of Vibhav Gogate Last Time Parameter learning Learning the parameter of a simple coin flipping model Prior
More informationGaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012
Gaussian Processes Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 01 Pictorial view of embedding distribution Transform the entire distribution to expected features Feature space Feature
More informationModel Estimation Example
Ronald H. Heck 1 EDEP 606: Multivariate Methods (S2013) April 7, 2013 Model Estimation Example As we have moved through the course this semester, we have encountered the concept of model estimation. Discussions
More informationStatistical Methods in HYDROLOGY CHARLES T. HAAN. The Iowa State University Press / Ames
Statistical Methods in HYDROLOGY CHARLES T. HAAN The Iowa State University Press / Ames Univariate BASIC Table of Contents PREFACE xiii ACKNOWLEDGEMENTS xv 1 INTRODUCTION 1 2 PROBABILITY AND PROBABILITY
More informationNaïve Bayes classification
Naïve Bayes classification 1 Probability theory Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. Examples: A person s height, the outcome of a coin toss
More informationSTAT Advanced Bayesian Inference
1 / 32 STAT 625 - Advanced Bayesian Inference Meng Li Department of Statistics Jan 23, 218 The Dirichlet distribution 2 / 32 θ Dirichlet(a 1,...,a k ) with density p(θ 1,θ 2,...,θ k ) = k j=1 Γ(a j) Γ(
More informationIntroduction: MLE, MAP, Bayesian reasoning (28/8/13)
STA561: Probabilistic machine learning Introduction: MLE, MAP, Bayesian reasoning (28/8/13) Lecturer: Barbara Engelhardt Scribes: K. Ulrich, J. Subramanian, N. Raval, J. O Hollaren 1 Classifiers In this
More informationBayesian Models in Machine Learning
Bayesian Models in Machine Learning Lukáš Burget Escuela de Ciencias Informáticas 2017 Buenos Aires, July 24-29 2017 Frequentist vs. Bayesian Frequentist point of view: Probability is the frequency of
More informationCPSC 540: Machine Learning
CPSC 540: Machine Learning Empirical Bayes, Hierarchical Bayes Mark Schmidt University of British Columbia Winter 2017 Admin Assignment 5: Due April 10. Project description on Piazza. Final details coming
More informationMore Spectral Clustering and an Introduction to Conjugacy
CS8B/Stat4B: Advanced Topics in Learning & Decision Making More Spectral Clustering and an Introduction to Conjugacy Lecturer: Michael I. Jordan Scribe: Marco Barreno Monday, April 5, 004. Back to spectral
More informationIntroduction to General and Generalized Linear Models
Introduction to General and Generalized Linear Models Generalized Linear Models - part II Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs.
More informationLab 12: Structured Prediction
December 4, 2014 Lecture plan structured perceptron application: confused messages application: dependency parsing structured SVM Class review: from modelization to classification What does learning mean?
More informationAdvanced topics from statistics
Advanced topics from statistics Anders Ringgaard Kristensen Advanced Herd Management Slide 1 Outline Covariance and correlation Random vectors and multivariate distributions The multinomial distribution
More informationBayesian Methods for Machine Learning
Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),
More informationBayesian Inference: Principles and Practice 3. Sparse Bayesian Models and the Relevance Vector Machine
Bayesian Inference: Principles and Practice 3. Sparse Bayesian Models and the Relevance Vector Machine Mike Tipping Gaussian prior Marginal prior: single α Independent α Cambridge, UK Lecture 3: Overview
More informationDreem Challenge report (team Bussanati)
Wavelet course, MVA 04-05 Simon Bussy, simon.bussy@gmail.com Antoine Recanati, arecanat@ens-cachan.fr Dreem Challenge report (team Bussanati) Description and specifics of the challenge We worked on the
More informationLinear Regression (1/1/17)
STA613/CBB540: Statistical methods in computational biology Linear Regression (1/1/17) Lecturer: Barbara Engelhardt Scribe: Ethan Hada 1. Linear regression 1.1. Linear regression basics. Linear regression
More informationStatistics 135 Fall 2007 Midterm Exam
Name: Student ID Number: Statistics 135 Fall 007 Midterm Exam Ignore the finite population correction in all relevant problems. The exam is closed book, but some possibly useful facts about probability
More informationProteomics and Variable Selection
Proteomics and Variable Selection p. 1/55 Proteomics and Variable Selection Alex Lewin With thanks to Paul Kirk for some graphs Department of Epidemiology and Biostatistics, School of Public Health, Imperial
More informationInterpreting DNA Mixtures in Structured Populations
James M. Curran, 1 Ph.D.; Christopher M. Triggs, 2 Ph.D.; John Buckleton, 3 Ph.D.; and B. S. Weir, 1 Ph.D. Interpreting DNA Mixtures in Structured Populations REFERENCE: Curran JM, Triggs CM, Buckleton
More informationLecture 1 Bayesian inference
Lecture 1 Bayesian inference olivier.francois@imag.fr April 2011 Outline of Lecture 1 Principles of Bayesian inference Classical inference problems (frequency, mean, variance) Basic simulation algorithms
More informationCurve Fitting Re-visited, Bishop1.2.5
Curve Fitting Re-visited, Bishop1.2.5 Maximum Likelihood Bishop 1.2.5 Model Likelihood differentiation p(t x, w, β) = Maximum Likelihood N N ( t n y(x n, w), β 1). (1.61) n=1 As we did in the case of the
More informationBayesian Methods with Monte Carlo Markov Chains II
Bayesian Methods with Monte Carlo Markov Chains II Henry Horng-Shing Lu Institute of Statistics National Chiao Tung University hslu@stat.nctu.edu.tw http://tigpbp.iis.sinica.edu.tw/courses.htm 1 Part 3
More informationMachine Learning, Fall 2009: Midterm
10-601 Machine Learning, Fall 009: Midterm Monday, November nd hours 1. Personal info: Name: Andrew account: E-mail address:. You are permitted two pages of notes and a calculator. Please turn off all
More informationMachine Learning, Fall 2012 Homework 2
0-60 Machine Learning, Fall 202 Homework 2 Instructors: Tom Mitchell, Ziv Bar-Joseph TA in charge: Selen Uguroglu email: sugurogl@cs.cmu.edu SOLUTIONS Naive Bayes, 20 points Problem. Basic concepts, 0
More informationLogistic Regression Logistic
Case Study 1: Estimating Click Probabilities L2 Regularization for Logistic Regression Machine Learning/Statistics for Big Data CSE599C1/STAT592, University of Washington Carlos Guestrin January 10 th,
More informationProbabilistic Graphical Models
Probabilistic Graphical Models Lecture 11 CRFs, Exponential Family CS/CNS/EE 155 Andreas Krause Announcements Homework 2 due today Project milestones due next Monday (Nov 9) About half the work should
More informationExperimental Design and Data Analysis for Biologists
Experimental Design and Data Analysis for Biologists Gerry P. Quinn Monash University Michael J. Keough University of Melbourne CAMBRIDGE UNIVERSITY PRESS Contents Preface page xv I I Introduction 1 1.1
More informationThe statistical evaluation of DNA crime stains in R
The statistical evaluation of DNA crime stains in R Miriam Marušiaková Department of Statistics, Charles University, Prague Institute of Computer Science, CBI, Academy of Sciences of CR UseR! 2008, Dortmund
More informationCheng Soon Ong & Christian Walder. Canberra February June 2018
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 218 Outlines Overview Introduction Linear Algebra Probability Linear Regression 1
More informationNon-Parametric Bayes
Non-Parametric Bayes Mark Schmidt UBC Machine Learning Reading Group January 2016 Current Hot Topics in Machine Learning Bayesian learning includes: Gaussian processes. Approximate inference. Bayesian
More informationFall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.
1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n
More informationRelated Concepts: Lecture 9 SEM, Statistical Modeling, AI, and Data Mining. I. Terminology of SEM
Lecture 9 SEM, Statistical Modeling, AI, and Data Mining I. Terminology of SEM Related Concepts: Causal Modeling Path Analysis Structural Equation Modeling Latent variables (Factors measurable, but thru
More information