
CS 189, Fall 2016
Introduction to Machine Learning: Final

Do not open the exam before you are instructed to do so.
The exam is closed book, closed notes except your one-page cheat sheet.
Usage of electronic devices is forbidden. If we see you using an electronic device (phone, laptop, etc.), you will get a zero.
You have 3 hours.
Write your initials at the top right of each page (e.g., write BR if you are Ben Recht).
Mark your answers on the exam itself in the space provided. Do not attach any extra sheets.

First name: ________  Last name: ________  SID: ________
First and last name of student to your left: ________
First and last name of student to your right: ________

Q1 [10 pts] Inverse Gaussian Distribution

The inverse Gaussian distribution, denoted $\mathrm{IG}(\mu, \lambda)$, is a continuous distribution supported on $(0, \infty)$ and parameterized by two scalars $\mu > 0$ and $\lambda > 0$. For any $x \in (0, \infty)$, the probability density function $f(x; \mu, \lambda)$ is given by

$$f(x; \mu, \lambda) = \sqrt{\frac{\lambda}{2\pi x^3}} \exp\left\{ -\frac{\lambda (x - \mu)^2}{2\mu^2 x} \right\}.$$

(a) [3 pts] Suppose we draw $n$ independent samples $x_1, \ldots, x_n$ from $\mathrm{IG}(\mu, \lambda)$. Write down the log-likelihood $L(x_1, \ldots, x_n; \mu, \lambda)$.

Solution: For $x \in (0, \infty)$,

$$\log f(x; \mu, \lambda) = \frac{1}{2}\log\lambda - \frac{1}{2}\log 2\pi x^3 - \frac{\lambda (x - \mu)^2}{2\mu^2 x}.$$

Hence,

$$L(x_1, \ldots, x_n; \mu, \lambda) = \sum_{i=1}^n \log f(x_i; \mu, \lambda) = \sum_{i=1}^n \left[ \frac{1}{2}\log\lambda - \frac{1}{2}\log 2\pi x_i^3 - \frac{\lambda (x_i - \mu)^2}{2\mu^2 x_i} \right].$$

(b) [2 pts] Is the function $\varphi : (0, \infty) \to \mathbb{R}$ defined as $\varphi(\lambda) = L(x_1, \ldots, x_n; \mu, \lambda)$ convex, concave, or both? No need to justify your answer.

Solution: Concave. $\varphi(\lambda) = a \log\lambda + b\lambda + c$ for scalars $a, b, c$ with $a > 0$; $\log\lambda$ is concave on $(0, \infty)$, and the linear term $b\lambda$ does not affect concavity.

(c) [5 pts] Now assume that $\mu$ is known. Derive the maximum likelihood estimator $\hat{\lambda}$ given $n$ independent samples $x_1, \ldots, x_n$ from $\mathrm{IG}(\mu, \lambda)$.

Solution: Solving for the stationary point of $\frac{d}{d\lambda} L(x_1, \ldots, x_n; \mu, \lambda) = 0$,

$$0 = \frac{d}{d\lambda} L(x_1, \ldots, x_n; \mu, \lambda) = \frac{n}{2\lambda} - \sum_{i=1}^n \frac{(x_i - \mu)^2}{2\mu^2 x_i} \implies \hat{\lambda} = \frac{n}{\sum_{i=1}^n \frac{(x_i - \mu)^2}{\mu^2 x_i}}.$$

Since the function $\varphi(\lambda)$ is concave on $(0, \infty)$ and the proposed maximizer is positive with probability one, the solution to $\frac{d}{d\lambda} L = 0$ is both a necessary and sufficient condition for a global maximum.
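A quick numerical sanity check of this estimator (a sketch with arbitrary values, added to this transcription; NumPy's Generator.wald samples the inverse Gaussian, also called the Wald distribution, with exactly this mean/scale parameterization):

    import numpy as np

    rng = np.random.default_rng(0)
    mu, lam, n = 2.0, 5.0, 100_000

    # Generator.wald(mean, scale) draws from IG(mu, lambda)
    x = rng.wald(mu, lam, size=n)

    # MLE from part (c): lambda_hat = n / sum_i (x_i - mu)^2 / (mu^2 x_i)
    lam_hat = n / np.sum((x - mu) ** 2 / (mu ** 2 * x))
    print(lam_hat)  # close to the true lambda = 5.0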

Q2 [25 pts] Multivariate Gaussians

Let $x$ be a $d$-dimensional random vector distributed as a Gaussian with mean $\mu \in \mathbb{R}^d$ and positive definite covariance matrix $\Sigma \in \mathbb{R}^{d \times d}$; that is, $x \sim N(\mu, \Sigma)$.

(a) [5 pts] Let $v \in \mathbb{R}^d$. What are the mean and variance of $v^T x$?

Solution: $v^T \mu$ and $v^T \Sigma v$.

(b) [5 pts] What is the distribution of $v^T x$?

Solution: $N(v^T \mu, v^T \Sigma v)$.

(c) [5 pts] Suppose we get to choose $v$ from the unit sphere (i.e., $\|v\|_2 = 1$). Write down a (unit-norm) choice of $v$ that maximizes the variance of $v^T x$.

Solution: The leading eigenvector of $\Sigma$.

(d) [5 pts] Let $z \sim N(0, I_d)$. Find a matrix $A$ and a vector $b$ such that $Az + b$ has the same distribution as $x$.

Solution: $A = \Sigma^{1/2}$, $b = \mu$.

(e) [5 pts] Now find a matrix $A$ and a vector $b$ such that $Ax + b$ has the same distribution as $z$.

Solution: $A = \Sigma^{-1/2}$, $b = -\Sigma^{-1/2}\mu$.
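Parts (d) and (e) can be checked empirically (an added sketch: the symmetric square root is built from an eigendecomposition, and the covariance matrix below is an arbitrary positive definite example):

    import numpy as np

    rng = np.random.default_rng(1)
    d, n = 3, 200_000
    mu = np.array([1.0, -2.0, 0.5])
    B = rng.normal(size=(d, d))
    Sigma = B @ B.T + np.eye(d)            # an arbitrary positive definite covariance

    # Symmetric square root via the eigendecomposition Sigma = Q diag(s) Q^T
    s, Q = np.linalg.eigh(Sigma)
    Sigma_half = Q @ np.diag(np.sqrt(s)) @ Q.T
    Sigma_neg_half = Q @ np.diag(1.0 / np.sqrt(s)) @ Q.T

    # (d): A = Sigma^{1/2}, b = mu maps z ~ N(0, I) to x ~ N(mu, Sigma)
    z = rng.normal(size=(n, d))
    x = z @ Sigma_half + mu                # Sigma_half is symmetric
    print(np.round(np.cov(x.T), 1))        # approximately Sigma

    # (e): A = Sigma^{-1/2}, b = -Sigma^{-1/2} mu whitens x back to N(0, I)
    z_back = (x - mu) @ Sigma_neg_half
    print(np.round(np.cov(z_back.T), 1))   # approximately the identity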

Q3 [20 pts] Data Geometry

Consider the following two-dimensional data set:

[Figure not reproduced in this transcription: a scatter of points lying along the line $x_1 = x_2$, centered at the origin.]

Note that the mean of this data set is 0.

(a) [5 pts] Suppose someone assigns a label $y_i = +1$ or $y_i = -1$ to every data point and we solve a Support Vector Machine (SVM) classification problem:

$$\min_{w \in \mathbb{R}^2,\; b \in \mathbb{R}} \;\; \frac{1}{n} \sum_{i=1}^n \max\!\left(1 - y_i(w^T x_i + b),\; 0\right) + \lambda \|w\|_2^2$$

Here, $b$ is an unregularized bias term, and $\lambda > 0$. Show that both components of $w$ must be equal to each other.

Solution: The optimal $w$ must lie in the span of the data: any component of $w$ orthogonal to the data leaves every inner product $w^T x_i$ (and hence every hinge term) unchanged, but strictly increases the penalty $\lambda \|w\|_2^2$, so it must be zero. Therefore, since all of the data points have both components equal to each other, $w$ will as well.

(b) [5 pts] Suppose $\lambda = 0$. Given the information provided, what can be said about the possible values for $w$?

Solution: The optimal $w$ could be any direction in $\mathbb{R}^2$ based on the information provided. Without regularization, the component of $w$ orthogonal to the data is no longer penalized, so $w$ can roam free.

(c) [5 pts] Draw the principal components of this data set on the plot.

Solution: [Figure not reproduced: the first principal component points along the line $x_1 = x_2$; the second is orthogonal to it, along $x_1 = -x_2$.]

(d) [5 pts] What is the ratio of the smallest eigenvalue to the largest eigenvalue of the covariance matrix of this data?

Solution: 0.
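Parts (c) and (d) can be made concrete numerically (an added sketch with hypothetical points on the line $x_1 = x_2$, since the original plot is not reproduced here):

    import numpy as np

    # Hypothetical zero-mean points with equal coordinates, as in the plot
    t = np.array([-2.0, -1.0, 1.0, 2.0])
    X = np.stack([t, t], axis=1)           # each row is a point on x1 = x2

    C = X.T @ X / len(X)                   # covariance (the mean is zero)
    print(np.linalg.eigvalsh(C))           # [0. 5.]: ratio smallest/largest = 0
    # Eigenvectors: (1, 1)/sqrt(2) (first PC, along the data) and
    # (1, -1)/sqrt(2) (second PC, orthogonal, with zero variance)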

Q4 [20 pts] Residual Neural Network

Consider the following neural network, which operates on scalars:

x --w1--> z1 --ReLU--> a1 --w2--> z2 --ReLU--> a2, with a skip connection carrying a1 forward: z3 = a1 + a2, and output ŷ = w3 z3.

In this network, $w_1$, $w_2$, and $w_3$ are scalars. The network takes the scalar $x$ as input and computes $z_1 = w_1 x$. The ReLU nonlinearity is then applied: $a_1 = \mathrm{ReLU}(z_1)$. Next, the network computes $z_2 = w_2 a_1$ and applies the ReLU nonlinearity: $a_2 = \mathrm{ReLU}(z_2)$. Next, $z_3 = a_1 + a_2$. Finally, $\hat{y} = w_3 z_3$. We will use the mean squared error loss function $L = \frac{1}{2}(y - \hat{y})^2$ to train this network. You may use $R(x)$ and $R'(x)$ to denote ReLU and the derivative of ReLU, respectively.

(a) [5 pts] Find $\frac{dL}{dw_3}$.

Solution: $\frac{dL}{dw_3} = -(y - \hat{y})\, z_3$.

(b) [5 pts] Find $\frac{dL}{dw_2}$.

Solution: $\frac{dL}{dw_2} = -(y - \hat{y})\, w_3 \frac{dR(w_2 a_1)}{dw_2} = -(y - \hat{y})\, w_3 R'(z_2)\, a_1$.

(c) [5 pts] Find $\frac{dL}{dw_1}$.

Solution:

$$\frac{dL}{dw_1} = -(y - \hat{y})\, w_3 \frac{d\left( R(w_1 x) + R(w_2 R(w_1 x)) \right)}{dw_1} = -(y - \hat{y})\, w_3 \left( R'(w_1 x)\, x + R'(w_2 R(w_1 x))\, w_2 R'(w_1 x)\, x \right) = -(y - \hat{y})\, w_3 R'(z_1)\, x \left( 1 + R'(z_2)\, w_2 \right).$$

(d) [5 pts] How would the network change if we added a ReLU nonlinearity to unit $z_3$, such that $a_3 = \mathrm{ReLU}(z_3)$ and $\hat{y} = w_3 a_3$? Briefly explain your reasoning.

Solution: We already have $a_1 \geq 0$ and $a_2 \geq 0$, so $z_3 \geq 0$ and it would be redundant to apply ReLU to $z_3$: the network would compute exactly the same function.
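A finite-difference gradient check of these three expressions (an added sketch; the weights and inputs are arbitrary values chosen away from the ReLU kinks):

    import numpy as np

    def relu(z):
        return max(z, 0.0)

    def drelu(z):
        return 1.0 if z > 0 else 0.0

    def loss(w1, w2, w3, x, y):
        a1 = relu(w1 * x)
        a2 = relu(w2 * a1)
        return 0.5 * (y - w3 * (a1 + a2)) ** 2

    def grads(w1, w2, w3, x, y):
        z1 = w1 * x; a1 = relu(z1)
        z2 = w2 * a1; a2 = relu(z2)
        z3 = a1 + a2
        g = -(y - w3 * z3)                        # dL/dyhat
        dw3 = g * z3
        dw2 = g * w3 * drelu(z2) * a1
        dw1 = g * w3 * drelu(z1) * x * (1 + drelu(z2) * w2)
        return np.array([dw1, dw2, dw3])

    p = np.array([0.7, -0.3, 1.2])
    x, y = 2.0, 1.0
    for i in range(3):                            # central finite differences
        e = np.zeros(3); e[i] = 1e-6
        num = (loss(*(p + e), x, y) - loss(*(p - e), x, y)) / 2e-6
        print(np.isclose(grads(*p, x, y)[i], num))   # True three times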

Q5 [35 pts] Feature Engineering

In this class, we have tried to emphasize the importance of good features. In this question, we'll take this a step further and see that good features not only need to be predictive on the training set, but must also generalize well to unseen data.

Let $n$ denote the number of UC Berkeley students, and let us assign an arbitrary ordering to all students. Let $y_1, y_2, \ldots, y_n \in \mathbb{R}$ denote the ages of all the students. Our goal is to predict the age $y$ of a new transfer student on campus based on some list of features. A natural feature choice could be, for example, year of study. But suppose we only have access to student ID (SID) numbers. You are a bit unsure how to use this information, so you ask some of your friends.

Note: For this question, assume that $n < 10^9$.

One-hot encoding. You first ask your friend from Stanford how to encode a student ID into a feature vector. "Use one-hot encoding," your friend responds. You are a bit skeptical, but you proceed onwards. Since there are 9 digits in an SID, you one-hot encode each SID into an element of $\{0, 1\}^d$ where $d = 10^9$. Let $x_1, \ldots, x_n \in \mathbb{R}^d$ denote the encoded features of the training set, stacked row-wise into a matrix $X \in \mathbb{R}^{n \times d}$, and let $Y = (y_1, \ldots, y_n) \in \mathbb{R}^n$ denote the age vector of students in the training set (note that $n < d$). You proceed to use regularized least squares and solve

$$\min_{w \in \mathbb{R}^d} \;\; \frac{1}{2}\|Xw - Y\|_2^2 + \frac{\lambda}{2}\|w\|_2^2. \qquad (1)$$

(a) [3 pts] How does the regularization term $\frac{\lambda}{2}\|w\|_2^2$ ensure a unique solution to (1)?

Solution: The matrix $X$ is of dimension $n \times 10^9$ with $n < 10^9$, so $X$ has a non-trivial null space and the unregularized least-squares problem has infinitely many minimizers. Adding $\frac{\lambda}{2}\|w\|_2^2$ with $\lambda > 0$ makes the objective strictly convex, so the minimizer is unique.

(b) [5 pts] Compute the solution $w \in \mathbb{R}^{10^9}$ to (1), assuming that $x_i = e_i \in \mathbb{R}^d$, where $e_i$ denotes the $i$-th standard basis vector in $\mathbb{R}^d$. That is, someone sorted your data such that the $i$-th student has SID $i$. Your answer should be expressed only in terms of $Y$ and $\lambda$. Hint: You may perform this calculation either directly, or using the kernel trick.

Solution (direct): We know that $w = (X^T X + \lambda I_d)^{-1} X^T Y$. Here

$$X = \begin{bmatrix} I_n & 0_{n, d-n} \end{bmatrix}, \qquad X^T X = \sum_{i=1}^n e_i e_i^T = \begin{bmatrix} I_n & 0_{n, d-n} \\ 0_{d-n, n} & 0_{d-n, d-n} \end{bmatrix}.$$

Therefore,

$$w = (X^T X + \lambda I_d)^{-1} X^T Y = \begin{bmatrix} \frac{1}{1+\lambda} I_n & 0_{n, d-n} \\ 0_{d-n, n} & \frac{1}{\lambda} I_{d-n} \end{bmatrix} \begin{bmatrix} I_n \\ 0_{d-n, n} \end{bmatrix} Y = \begin{bmatrix} \frac{1}{1+\lambda} Y \\ 0_{d-n} \end{bmatrix}.$$

Solution (kernel trick): It is not hard to see that the Gram matrix is $XX^T = I_n$. Recall that $w = X^T \alpha$, where $\alpha = (XX^T + \lambda I_n)^{-1} Y$. Clearly then $\alpha = \frac{1}{1+\lambda} Y$ and $w = X^T \alpha = \begin{bmatrix} \frac{1}{1+\lambda} Y \\ 0_{d-n} \end{bmatrix}$.

(c) [3 pts] Compute $\mathrm{predict}(x_i) = \langle w, x_i \rangle$ for $x_i$ in the training set.

Solution: $\mathrm{predict}(x_i) = \frac{1}{1+\lambda}\, y_i$.

(d) [3 pts] Suppose $x$ is the feature vector for the transfer student, who is not in the training set. Compute $\mathrm{predict}(x) = \langle w, x \rangle$ for $x$ not in the training set.

Solution: Since $x$ is not in the training set, its one-hot encoding is orthogonal to every feature vector in the training set. Thus $\mathrm{predict}(x) = 0$.
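The shrunken-then-zero behavior of parts (b) through (d) is easy to verify at small scale (an added sketch with $d = 10$ and made-up ages standing in for $d = 10^9$):

    import numpy as np

    n, d, lam = 4, 10, 0.5                  # small stand-ins for n students, d = 10^9
    Y = np.array([19.0, 22.0, 20.0, 25.0])  # made-up ages
    X = np.eye(d)[:n]                       # x_i = e_i, stacked row-wise

    w = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)
    print(w[:n] * (1 + lam))                # recovers Y: w_i = y_i / (1 + lambda)
    print(w[n:])                            # all zeros: unseen SIDs get predict(x) = 0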

Logical features. Unsatisfied with the performance of one-hot encoding, you ask another friend from MIT what features to use. Your friend tells you to use the feature that is equal to 1 if the student goes to Berkeley, and 0 otherwise. You are even more skeptical, but you proceed onwards.

(e) [5 pts] Using the features $x_i = 1$ for all $i \in \{1, \ldots, n\}$, write the solution $w$ to the following unregularized problem:

$$\min_{w \in \mathbb{R}} \;\; \frac{1}{2}\|Xw - Y\|_2^2.$$

Solution: Our features in this case are scalars. We know that $X = \mathbf{1}_n$ and hence $X^T X = \mathbf{1}_n^T \mathbf{1}_n = n$. The solution is simply

$$w = \frac{1}{n} \sum_{i=1}^n y_i.$$

(f) [3 pts] Now compute $\mathrm{predict}(x_i) = \langle w, x_i \rangle$, using the $w$ from part (e), for $x_i$ in the training set.

Solution: $\mathrm{predict}(x_i) = \frac{1}{n} \sum_{i=1}^n y_i$.

(g) [3 pts] Since the transfer student is a Berkeley student, the value of their feature will be $x = 1$. Compute $\mathrm{predict}(x) = \langle w, x \rangle$, using the same $w$, when $x$ is the transfer student who is not in the training set.

Solution: Again, $\mathrm{predict}(x) = \frac{1}{n} \sum_{i=1}^n y_i$: the constant feature predicts the same average age for everyone.
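On the same toy data, the constant-feature fit is just the sample mean (an added one-line check; np.linalg.lstsq solves the unregularized problem):

    import numpy as np

    Y = np.array([19.0, 22.0, 20.0, 25.0])
    X = np.ones((4, 1))                     # every student gets the feature x_i = 1
    w, *_ = np.linalg.lstsq(X, Y, rcond=None)
    print(w[0], Y.mean())                   # both 21.5: the fit is the mean age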

Combining features. Still unsatisfied with the previous suggestions, we consider combining both the one-hot encoded features and the constant feature.

(h) [10 pts] We now consider $x_i = (e_i, 1) \in \mathbb{R}^{d+1}$, where $e_i \in \mathbb{R}^d$ is as before. For simplicity, we set $\lambda = 0$ and consider the following problem:

$$\min_{w \in \mathbb{R}^{d+1}} \;\; \frac{1}{2}\|Xw - Y\|_2^2.$$

Compute a minimizer $w$ of this problem, and $\mathrm{predict}(x)$ for $x$ in and not in the training set. Hint: The Sherman-Morrison formula states that for invertible $A$ and column vectors $u, v$ such that $1 + v^T A^{-1} u \neq 0$,

$$(A + uv^T)^{-1} = A^{-1} - \frac{A^{-1} u v^T A^{-1}}{1 + v^T A^{-1} u}.$$

Solution: Since we are allowed to compute any minimizer, it is easiest to compute the minimum-norm one by employing the kernel trick. Observe that

$$X = \begin{bmatrix} I_n & 0_{n, d-n} & \mathbf{1}_n \end{bmatrix}, \qquad XX^T = I_n + \mathbf{1}_n \mathbf{1}_n^T.$$

By the Sherman-Morrison formula, we have that

$$(XX^T)^{-1} = (I_n + \mathbf{1}_n \mathbf{1}_n^T)^{-1} = I_n - \frac{\mathbf{1}_n \mathbf{1}_n^T}{1 + n}.$$

Hence,

$$\alpha = (XX^T)^{-1} Y = \left( I_n - \frac{\mathbf{1}_n \mathbf{1}_n^T}{1 + n} \right) Y.$$

Furthermore, the minimum-norm minimizer $w = X^T \alpha$ is given by

$$w = X^T \alpha = \begin{bmatrix} I_n \\ 0_{d-n, n} \\ \mathbf{1}_n^T \end{bmatrix} \left( I_n - \frac{\mathbf{1}_n \mathbf{1}_n^T}{1 + n} \right) Y = \begin{bmatrix} Y - \frac{\mathbf{1}_n^T Y}{n+1} \mathbf{1}_n \\ 0_{d-n} \\ \frac{\mathbf{1}_n^T Y}{n+1} \end{bmatrix}.$$

We are now in a position to compute the $\mathrm{predict}(x)$ function. For $x_i$ in the training set, the one-hot component and the bias component add up exactly:

$$\mathrm{predict}(x_i) = \left( y_i - \frac{\mathbf{1}_n^T Y}{n+1} \right) + \frac{\mathbf{1}_n^T Y}{n+1} = y_i.$$

For $x$ not in the training set, only the bias component contributes:

$$\mathrm{predict}(x) = \frac{1}{n+1} \sum_{i=1}^n y_i.$$
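This, too, can be checked at small scale (an added sketch; np.linalg.pinv yields exactly the minimum-norm least-squares solution that the kernel-trick derivation computes):

    import numpy as np

    n, d = 4, 10
    Y = np.array([19.0, 22.0, 20.0, 25.0])
    X = np.hstack([np.eye(d)[:n], np.ones((n, 1))])   # rows x_i = (e_i, 1)

    w = np.linalg.pinv(X) @ Y               # minimum-norm least-squares solution
    print(X @ w)                            # equals Y: training predictions are exact
    x_new = np.zeros(d + 1); x_new[-1] = 1  # unseen SID: only the constant feature fires
    print(x_new @ w, Y.sum() / (n + 1))     # both equal sum(y_i) / (n + 1)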

Q6 [20 pts] Calibration

Suppose we attempt to solve a binary classification task with binary feature vectors. Specifically, we are given $n$ data points $\{(x_i, y_i)\}_{i=1}^n$ with $x_i \in \{0, 1\}^d$ (i.e., each of the $d$ feature values is either 0 or 1) and $y_i \in \{0, 1\}$. Suppose we try to fit the standard linear logistic regression model:

$$P(y = 1 \mid x, w, b) = g(w^T x + b) = \frac{1}{1 + \exp(-w^T x - b)}.$$

We say that a model is calibrated if

$$\frac{1}{n} \sum_{i=1}^n P(y = 1 \mid x_i, w, b) = \frac{1}{n} \sum_{i=1}^n y_i.$$

That is, the average probability of labeling the data in class 1 is equal to the average number of occurrences of class 1 in the training set. Calibrated models are interesting because their predicted probabilities are more useful as an indicator of confidence.

Consider training the logistic regression model where we solve the optimization problem

$$\max_{w \in \mathbb{R}^d,\; b \in \mathbb{R}} \;\; \sum_{i=1}^n \log P(y_i \mid x_i, w, b).$$

Let $w^*, b^*$ denote the maximizing parameters.

(a) [5 pts] Show that the logistic model corresponding to the optimal $w^*, b^*$ is calibrated.

Solution: Note that the log-likelihood can be written as

$$\sum_{i=1}^n \left[ y_i (w^T x_i + b) - \log\left(1 + \exp(w^T x_i + b)\right) \right].$$

Taking a derivative with respect to $b$ shows that $b^*$ must satisfy

$$0 = \sum_{i=1}^n \left\{ y_i - \frac{\exp(w^{*T} x_i + b^*)}{1 + \exp(w^{*T} x_i + b^*)} \right\} = \sum_{i=1}^n \left\{ y_i - P(y = 1 \mid x_i, w^*, b^*) \right\}.$$

Another (simpler) method: Denote $L$ as the log-likelihood and $\mu_i = P(y = 1 \mid x_i, w, b)$. Reformulate $w^T x_i + b$ as $\beta^T \tilde{x}_i$, with $\beta = [w, b]^T$ and $\tilde{x}_i := [x_i, 1]^T \in \mathbb{R}^{d+1}$. From lecture/homework we know:

$$\nabla_\beta L = \sum_{i=1}^n (y_i - \mu_i)\, \tilde{x}_i = 0.$$

Noting that $\tilde{x}_{i, d+1} = 1$ (the bias component added to the end of $x_i$), we have $\sum_i (y_i - \mu_i) = 0$, i.e., $\sum_i y_i = \sum_i \mu_i$.

Many students made the mistake of claiming that under the optimal $w^*, b^*$, $P(y_i \mid x_i, w^*, b^*) = y_i$. This is only true if the data is linearly separable, and even then it is only true as $\|w\| \to \infty$.
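An empirical check of this optimality condition (an added sketch: synthetic binary features, plain gradient ascent on the unregularized log-likelihood; it also previews the conditional property proved in part (b) below):

    import numpy as np

    rng = np.random.default_rng(3)
    n, d = 500, 5
    X = rng.integers(0, 2, size=(n, d)).astype(float)   # binary features
    logits = X @ rng.normal(size=d) - 0.5
    y = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(float)

    Xb = np.hstack([X, np.ones((n, 1))])    # append the bias feature
    beta = np.zeros(d + 1)
    for _ in range(20_000):                 # gradient ascent on the log-likelihood
        mu = 1 / (1 + np.exp(-(Xb @ beta)))
        beta += 0.1 / n * Xb.T @ (y - mu)

    mu = 1 / (1 + np.exp(-(Xb @ beta)))
    print(mu.mean(), y.mean())              # part (a): the averages match
    S = X[:, 0] == 1                        # part (b) below: condition on feature j = 0
    print(mu[S].mean(), y[S].mean())        # they match on S_j as well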

(b) [5 pts] Let $S_j$ denote the set of indices $i$ where $X_{ij} = 1$. We say a model is conditionally calibrated with respect to feature $j$ if

$$\frac{1}{|S_j|} \sum_{i \in S_j} P(y = 1 \mid x_i, w, b) = \frac{1}{|S_j|} \sum_{i \in S_j} y_i.$$

Show that the logistic model corresponding to the optimal $w^*, b^*$ is calibrated with respect to every feature.

Solution: Taking a derivative with respect to $w_j$ shows that $w^*_j$ must satisfy

$$0 = \sum_{i=1}^n \left\{ y_i x_{ij} - x_{ij} \frac{\exp(w^{*T} x_i + b^*)}{1 + \exp(w^{*T} x_i + b^*)} \right\} = \sum_{i=1}^n \left\{ y_i x_{ij} - x_{ij} P(y = 1 \mid x_i, w^*, b^*) \right\} = \sum_{i \in S_j} y_i - \sum_{i \in S_j} P(y = 1 \mid x_i, w^*, b^*).$$

Here, the last equality follows because $x_{ij} = 1$ only if $i \in S_j$, and otherwise $x_{ij} = 0$.

Another (simpler) method: Using the same notation as the simpler method in the previous part, we see that

$$\sum_i (y_i - \mu_i)\, \tilde{x}_i = 0 \implies \sum_i (y_i - \mu_i)\, x_{ij} = 0 \;\; \forall j \implies \sum_{i \in S_j} (y_i - \mu_i) = 0.$$

(c) [5 pts] Suppose we add a regularizer for the $w$ parameters to the optimization problem:

$$\max_{w \in \mathbb{R}^d,\; b \in \mathbb{R}} \;\; \sum_{i=1}^n \log P(y_i \mid x_i, w, b) - \lambda \|w\|_2^2.$$

Is the logistic model corresponding to the optimal $w^*, b^*$ calibrated? Is it calibrated with respect to every feature?

Solution: The model remains calibrated, as the gradient for $b$ does not change, and hence the same calculation as in part (a) applies. However, for $w$ we now have

$$\frac{\partial\, \text{cost}}{\partial w_j} = \sum_{i \in S_j} y_i - \sum_{i \in S_j} P(y = 1 \mid x_i, w^*, b^*) - 2\lambda w^*_j,$$

which will not be calibrated with respect to feature $j$ when $w^*_j \neq 0$.

(d) [5 pts] Suppose someone gives you a classifying function $f : \mathbb{R}^d \to \mathbb{R}$ that inputs $x$ and outputs a real number. We can produce a probability from this classifier by setting

$$P(y = 1 \mid x, f, \alpha, \beta) = \frac{1}{1 + \exp(\alpha f(x) + \beta)}.$$

Show that one can always find values $\alpha$ and $\beta$ such that the resulting probability estimates are calibrated.

Solution: If one fits $\alpha, \beta$ by maximizing the log-likelihood over the data, the optimality condition with respect to $\beta$ guarantees calibration, by the same argument as in part (a). Another way to think about this: you can think of $f$ as essentially a feature transformation. Replace your entire training set with the scalar features $f(x_i)$, then train logistic regression on this training set. The optimality conditions we already proved guarantee calibration.
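Part (d) is the idea behind Platt scaling. A minimal sketch (an addition: the scores stand in for an arbitrary fixed classifier $f$, and we fit only the two scalars; note the convention in (d) has no minus sign in the exponent, so the per-example gradient of the log-likelihood with respect to $\alpha f(x) + \beta$ works out to $p - y$):

    import numpy as np

    rng = np.random.default_rng(4)
    n = 1_000
    score = rng.normal(size=n)              # f(x): fixed, arbitrary real-valued scores
    y = (rng.random(n) < 1 / (1 + np.exp(1.5 * score + 0.3))).astype(float)

    F = np.stack([score, np.ones(n)], axis=1)    # columns multiply (alpha, beta)
    theta = np.zeros(2)
    for _ in range(50_000):                 # gradient ascent on the log-likelihood
        p = 1 / (1 + np.exp(F @ theta))     # P(y = 1 | x) under the convention in (d)
        theta += 0.5 / n * F.T @ (p - y)    # dL/d(alpha f(x) + beta) = p - y here

    p = 1 / (1 + np.exp(F @ theta))
    print(p.mean(), y.mean())               # calibrated, as the beta condition requires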
