Latent Variable Models with Gaussian Processes


1 Latent Variable Models with Gaussian Processes Neil D. Lawrence GP Master Class 6th February 2017

2 Outline Motivating Example Linear Dimensionality Reduction Non-linear Dimensionality Reduction

3 Outline Motivating Example Linear Dimensionality Reduction Non-linear Dimensionality Reduction

4 Motivation for Non-Linear Dimensionality Reduction USPS Data Set Handwritten Digit 3648 Dimensions 64 rows by 57 columns

5 Motivation for Non-Linear Dimensionality Reduction USPS Data Set Handwritten Digit 3648 Dimensions 64 rows by 57 columns Space contains more than just this digit.

6 Motivation for Non-Linear Dimensionality Reduction USPS Data Set Handwritten Digit 3648 Dimensions 64 rows by 57 columns Space contains more than just this digit. Even if we sample every nanosecond from now until the end of the universe, we won't see the original six!

7 Motivation for Non-Linear Dimensionality Reduction USPS Data Set Handwritten Digit 3648 Dimensions 64 rows by 57 columns Space contains more than just this digit. Even if we sample every nanosecond from now until the end of the universe, we won't see the original six!

8 Simple Model of Digit Rotate a Prototype

9 Simple Model of Digit Rotate a Prototype

10 Simple Model of Digit Rotate a Prototype

11 Simple Model of Digit Rotate a Prototype

12 Simple Model of Digit Rotate a Prototype

13 Simple Model of Digit Rotate a Prototype

14 Simple Model of Digit Rotate a Prototype

15 Simple Model of Digit Rotate a Prototype

16 Simple Model of Digit Rotate a Prototype

17 MATLAB Demo demdigitsmanifold([1 2], 'all')

18 MATLAB Demo demdigitsmanifold([1 2], 'all') Figure: projection of the digits data onto the first two principal components (axes: PC no 1, PC no 2).

19 MATLAB Demo demdigitsmanifold([1 2], 'sixnine') Figure: projection onto the first two principal components (axes: PC no 1, PC no 2).

20 Low Dimensional Manifolds Pure Rotation is too Simple In practice the data may undergo several distortions, e.g. digits undergo thinning, translation and rotation. For data with structure: we expect fewer distortions than dimensions; we therefore expect the data to live on a lower dimensional manifold. Conclusion: deal with high dimensional data by looking for a lower dimensional non-linear embedding.

21 Outline Motivating Example Linear Dimensionality Reduction Non-linear Dimensionality Reduction

22 Notation q: dimension of latent/embedded space; p: dimension of data space; n: number of data points; data, Y = [y_{1,:}, ..., y_{n,:}]^⊤ = [y_{:,1}, ..., y_{:,p}] ∈ ℝ^{n×p}; centred data, Ŷ = [ŷ_{1,:}, ..., ŷ_{n,:}]^⊤ = [ŷ_{:,1}, ..., ŷ_{:,p}] ∈ ℝ^{n×p}, ŷ_{i,:} = y_{i,:} - µ; latent variables, X = [x_{1,:}, ..., x_{n,:}]^⊤ = [x_{:,1}, ..., x_{:,q}] ∈ ℝ^{n×q}; mapping matrix, W ∈ ℝ^{p×q}. a_{i,:} is a vector from the ith row of a given matrix A; a_{:,j} is a vector from the jth column of a given matrix A.

23 Reading Notation X and Y are design matrices. Data covariance given by cov(Y) = (1/n) Σ_{i=1}^{n} ŷ_{i,:} ŷ_{i,:}^⊤ = (1/n) Ŷ^⊤ Ŷ = S. Inner product matrix given by K = Y Y^⊤ = (k_{i,j})_{i,j}, with k_{i,j} = y_{i,:}^⊤ y_{j,:}.
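
The following numpy sketch (illustrative only; the data is random and all variable names are my own) checks the two readings above on a small design matrix: the p × p sample covariance S agrees with the average of outer products of the centred rows, and the n × n inner product matrix K holds the pairwise inner products y_{i,:}^⊤ y_{j,:}.

```python
import numpy as np

# Hypothetical design matrix: n = 5 points in p = 3 dimensions.
rng = np.random.default_rng(0)
Y = rng.normal(size=(5, 3))
n, p = Y.shape

# Centred data: subtract the mean vector mu from every row.
mu = Y.mean(axis=0)
Yhat = Y - mu

# Sample covariance S = (1/n) Yhat^T Yhat  (p x p) ...
S = Yhat.T @ Yhat / n
# ... equals the average of outer products of the centred rows.
S_sum = sum(np.outer(Yhat[i], Yhat[i]) for i in range(n)) / n
assert np.allclose(S, S_sum)

# Inner product matrix K = Y Y^T  (n x n), with k_ij = y_i . y_j.
K = Y @ Y.T
assert np.isclose(K[1, 2], Y[1] @ Y[2])
print(S.shape, K.shape)  # (3, 3) (5, 5)
```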

24 Linear Dimensionality Reduction Find a lower dimensional plane embedded in a higher dimensional space. The plane is described by the matrix W ∈ ℝ^{p×q}, via y = W x + µ. Figure: Mapping a two dimensional plane to a higher dimensional space in a linear way. Data are generated by corrupting points on the plane with noise.

25 Linear Dimensionality Reduction Linear Latent Variable Model Represent data, Y, with a lower dimensional set of latent variables X. Assume a linear relationship of the form y_{i,:} = W x_{i,:} + ε_{i,:}, where ε_{i,:} ∼ N(0, σ² I).

26 Linear Latent Variable Model Probabilistic PCA Define linear-Gaussian relationship between latent variables and data. (Graphical model relating X, W, σ² and Y.) p(Y | X, W) = ∏_{i=1}^{n} N(y_{i,:} | W x_{i,:}, σ² I)

27 Linear Latent Variable Model Probabilistic PCA Define linear-Gaussian relationship between latent variables and data. Standard latent variable approach: p(Y | X, W) = ∏_{i=1}^{n} N(y_{i,:} | W x_{i,:}, σ² I)

28 Linear Latent Variable Model Probabilistic PCA Define linear-Gaussian relationship between latent variables and data. Standard latent variable approach: Define Gaussian prior over latent space, X. p(Y | X, W) = ∏_{i=1}^{n} N(y_{i,:} | W x_{i,:}, σ² I), p(X) = ∏_{i=1}^{n} N(x_{i,:} | 0, I)

29 Linear Latent Variable Model Probabilistic PCA Define linear-Gaussian relationship between latent variables and data. Standard latent variable approach: Define Gaussian prior over latent space, X. Integrate out latent variables. p(Y | X, W) = ∏_{i=1}^{n} N(y_{i,:} | W x_{i,:}, σ² I), p(X) = ∏_{i=1}^{n} N(x_{i,:} | 0, I), p(Y | W) = ∏_{i=1}^{n} N(y_{i,:} | 0, W W^⊤ + σ² I)

30 Computation of the Marginal Likelihood y_{i,:} = W x_{i,:} + ε_{i,:}, x_{i,:} ∼ N(0, I), ε_{i,:} ∼ N(0, σ² I)

31 Computation of the Marginal Likelihood y_{i,:} = W x_{i,:} + ε_{i,:}, x_{i,:} ∼ N(0, I), ε_{i,:} ∼ N(0, σ² I), so W x_{i,:} ∼ N(0, W W^⊤),

32 Computation of the Marginal Likelihood y_{i,:} = W x_{i,:} + ε_{i,:}, x_{i,:} ∼ N(0, I), ε_{i,:} ∼ N(0, σ² I), so W x_{i,:} ∼ N(0, W W^⊤), and W x_{i,:} + ε_{i,:} ∼ N(0, W W^⊤ + σ² I)
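
A quick Monte Carlo check of this marginalisation, as a hedged illustration (numpy; the particular W and σ² are my own choice): the empirical covariance of y = W x + ε approaches W W^⊤ + σ² I as the number of samples grows.

```python
import numpy as np

rng = np.random.default_rng(1)
p, q, sigma2 = 4, 2, 0.1
W = rng.normal(size=(p, q))                  # arbitrary mapping matrix

n_samples = 200_000
X = rng.normal(size=(n_samples, q))          # x_i ~ N(0, I)
E = rng.normal(scale=np.sqrt(sigma2), size=(n_samples, p))  # eps_i ~ N(0, sigma^2 I)
Y = X @ W.T + E                              # y_i = W x_i + eps_i

C_theory = W @ W.T + sigma2 * np.eye(p)      # marginal covariance W W^T + sigma^2 I
C_empirical = Y.T @ Y / n_samples            # Monte Carlo estimate (zero mean)
print(np.max(np.abs(C_theory - C_empirical)))  # small, and shrinks as n_samples grows
```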

33 Linear Latent Variable Model II Probabilistic PCA Max. Likelihood Soln (Tipping and Bishop, 1999) p(Y | W) = ∏_{i=1}^{n} N(y_{i,:} | 0, W W^⊤ + σ² I)

34 Linear Latent Variable Model II Probabilistic PCA Max. Likelihood Soln (Tipping and Bishop, 1999) p(Y | W) = ∏_{i=1}^{n} N(y_{i,:} | 0, C), C = W W^⊤ + σ² I

35 Linear Latent Variable Model II Probabilistic PCA Max. Likelihood Soln (Tipping and Bishop, 1999) p(Y | W) = ∏_{i=1}^{n} N(y_{i,:} | 0, C), C = W W^⊤ + σ² I, log p(Y | W) = -(n/2) log|C| - (1/2) tr(C^{-1} Y^⊤ Y) + const.

36 Linear Latent Variable Model II Probabilistic PCA Max. Likelihood Soln (Tipping and Bishop, 1999) p(Y | W) = ∏_{i=1}^{n} N(y_{i,:} | 0, C), C = W W^⊤ + σ² I, log p(Y | W) = -(n/2) log|C| - (1/2) tr(C^{-1} Y^⊤ Y) + const. If U_q are the first q principal eigenvectors of n^{-1} Y^⊤ Y and the corresponding eigenvalues are Λ_q,

37 Linear Latent Variable Model II Probabilistic PCA Max. Likelihood Soln (Tipping and Bishop, 1999) p(Y | W) = ∏_{i=1}^{n} N(y_{i,:} | 0, C), C = W W^⊤ + σ² I, log p(Y | W) = -(n/2) log|C| - (1/2) tr(C^{-1} Y^⊤ Y) + const. If U_q are the first q principal eigenvectors of n^{-1} Y^⊤ Y and the corresponding eigenvalues are Λ_q, then W = U_q L R, L = (Λ_q - σ² I)^{1/2}, where R is an arbitrary rotation matrix.
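
A minimal numpy sketch of this closed-form solution, assuming a centred data matrix Y and a chosen q; here σ² is taken as the average of the discarded eigenvalues, which is its maximum likelihood value in Tipping and Bishop (1999), and the arbitrary rotation R is set to the identity.

```python
import numpy as np

def ppca_ml(Y, q):
    """Maximum likelihood PPCA: W = U_q (Lambda_q - sigma^2 I)^{1/2} R, with R = I here.

    Y is assumed centred, shape (n, p). Returns W (p x q) and sigma^2.
    """
    n, p = Y.shape
    S = Y.T @ Y / n                               # sample covariance, n^{-1} Y^T Y
    eigval, eigvec = np.linalg.eigh(S)            # ascending order
    eigval, eigvec = eigval[::-1], eigvec[:, ::-1]  # descending
    # ML noise variance: average of the discarded eigenvalues.
    sigma2 = eigval[q:].mean() if q < p else 0.0
    L = np.sqrt(np.maximum(eigval[:q] - sigma2, 0.0))
    W = eigvec[:, :q] * L                         # U_q diag(L); arbitrary rotation R = I
    return W, sigma2

# Toy usage with hypothetical data.
rng = np.random.default_rng(2)
Y = rng.normal(size=(100, 6))
Y -= Y.mean(axis=0)
W, sigma2 = ppca_ml(Y, q=2)
print(W.shape, sigma2)
```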

38 Outline Motivating Example Linear Dimensionality Reduction Non-linear Dimensionality Reduction

39 Difficulty for Probabilistic Approaches Propagate a probability distribution through a non-linear mapping. Normalisation of distribution becomes intractable. y_j = f_j(x). Figure: A three dimensional manifold formed by mapping from a two dimensional space to a three dimensional space.

40 Difficulty for Probabilistic Approaches y_1 = f_1(x), y_2 = f_2(x). Figure: A string in two dimensions, formed by mapping from a one dimensional space, x, to a two dimensional space, [y_1, y_2], using non-linear functions f_1(·) and f_2(·).

41 Difficulty for Probabilistic Approaches y = f(x) + ε. Figure: A Gaussian distribution p(x) propagated through a non-linear mapping to give p(y). Here y_i = f(x_i) + ε_i, with ε ∼ N(0, 0.2²), and f(·) uses an RBF basis with 100 centres between -4 and 4 and ℓ = 0.1. The new distribution over y (right) is multimodal and difficult to normalise.
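
This effect is easy to reproduce by simulation. The sketch below (numpy only; the random function is drawn from an RBF basis with the parameters given in the caption, everything else is my own choice) pushes samples of a Gaussian x through the mapping and histograms y, which comes out clearly multimodal.

```python
import numpy as np

rng = np.random.default_rng(3)

# Random function from an RBF basis: 100 centres between -4 and 4, lengthscale 0.1.
centres = np.linspace(-4, 4, 100)
ell = 0.1
weights = rng.normal(size=100)

def f(x):
    # Basis-function expansion evaluated at each input in x.
    phi = np.exp(-0.5 * (x[:, None] - centres[None, :]) ** 2 / ell ** 2)
    return phi @ weights

x = rng.normal(size=10_000)                      # p(x) is Gaussian
y = f(x) + rng.normal(scale=0.2, size=x.size)    # y = f(x) + eps, eps ~ N(0, 0.2^2)

# A crude histogram of p(y) shows several separated modes.
counts, edges = np.histogram(y, bins=40)
print(counts)
```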

42 Linear Latent Variable Model III Dual Probabilistic PCA Define linear-Gaussian relationship between latent variables and data. p(Y | X, W) = ∏_{i=1}^{n} N(y_{i,:} | W x_{i,:}, σ² I)

43 Linear Latent Variable Model III Dual Probabilistic PCA Define linear-Gaussian relationship between latent variables and data. Novel latent variable approach: p(Y | X, W) = ∏_{i=1}^{n} N(y_{i,:} | W x_{i,:}, σ² I)

44 Linear Latent Variable Model III Dual Probabilistic PCA Define linear-Gaussian relationship between latent variables and data. Novel latent variable approach: Define Gaussian prior over parameters, W. p(Y | X, W) = ∏_{i=1}^{n} N(y_{i,:} | W x_{i,:}, σ² I), p(W) = ∏_{i=1}^{p} N(w_{i,:} | 0, I)

45 Linear Latent Variable Model III Dual Probabilistic PCA Define linear-Gaussian relationship between latent variables and data. Novel latent variable approach: Define Gaussian prior over parameters, W. Integrate out parameters. p(Y | X, W) = ∏_{i=1}^{n} N(y_{i,:} | W x_{i,:}, σ² I), p(W) = ∏_{i=1}^{p} N(w_{i,:} | 0, I), p(Y | X) = ∏_{j=1}^{p} N(y_{:,j} | 0, X X^⊤ + σ² I)

46 Computation of the Marginal Likelihood y_{:,j} = X w_{:,j} + ε_{:,j}, w_{:,j} ∼ N(0, I), ε_{:,j} ∼ N(0, σ² I)

47 Computation of the Marginal Likelihood y_{:,j} = X w_{:,j} + ε_{:,j}, w_{:,j} ∼ N(0, I), ε_{:,j} ∼ N(0, σ² I), so X w_{:,j} ∼ N(0, X X^⊤),

48 Computation of the Marginal Likelihood y_{:,j} = X w_{:,j} + ε_{:,j}, w_{:,j} ∼ N(0, I), ε_{:,j} ∼ N(0, σ² I), so X w_{:,j} ∼ N(0, X X^⊤), and X w_{:,j} + ε_{:,j} ∼ N(0, X X^⊤ + σ² I)

49 Linear Latent Variable Model IV Dual Probabilistic PCA Max. Likelihood Soln (Lawrence, 2004, 2005) p(Y | X) = ∏_{j=1}^{p} N(y_{:,j} | 0, X X^⊤ + σ² I)

50 Linear Latent Variable Model IV Dual PPCA Max. Likelihood Soln (Lawrence, 2004, 2005) p(Y | X) = ∏_{j=1}^{p} N(y_{:,j} | 0, K), K = X X^⊤ + σ² I

51 Linear Latent Variable Model IV Dual PPCA Max. Likelihood Soln (Lawrence, 2004, 2005) p(Y | X) = ∏_{j=1}^{p} N(y_{:,j} | 0, K), K = X X^⊤ + σ² I, log p(Y | X) = -(p/2) log|K| - (1/2) tr(K^{-1} Y Y^⊤) + const.

52 Linear Latent Variable Model IV Dual PPCA Max. Likelihood Soln p(Y | X) = ∏_{j=1}^{p} N(y_{:,j} | 0, K), K = X X^⊤ + σ² I, log p(Y | X) = -(p/2) log|K| - (1/2) tr(K^{-1} Y Y^⊤) + const. If U′_q are the first q principal eigenvectors of p^{-1} Y Y^⊤ and the corresponding eigenvalues are Λ_q,

53 Linear Latent Variable Model IV Dual PPCA Max. Likelihood Soln p(Y | X) = ∏_{j=1}^{p} N(y_{:,j} | 0, K), K = X X^⊤ + σ² I, log p(Y | X) = -(p/2) log|K| - (1/2) tr(K^{-1} Y Y^⊤) + const. If U′_q are the first q principal eigenvectors of p^{-1} Y Y^⊤ and the corresponding eigenvalues are Λ_q, then X = U′_q L R, L = (Λ_q - σ² I)^{1/2}, where R is an arbitrary rotation matrix.

54 Linear Latent Variable Model IV Dual PPCA Max. Likelihood Soln (Lawrence, 2004, 2005) p(Y | X) = ∏_{j=1}^{p} N(y_{:,j} | 0, K), K = X X^⊤ + σ² I, log p(Y | X) = -(p/2) log|K| - (1/2) tr(K^{-1} Y Y^⊤) + const. If U′_q are the first q principal eigenvectors of p^{-1} Y Y^⊤ and the corresponding eigenvalues are Λ_q, then X = U′_q L R, L = (Λ_q - σ² I)^{1/2}, where R is an arbitrary rotation matrix.

55 Linear Latent Variable Model IV PPCA Max. Likelihood Soln (Tipping and Bishop, 1999) p(Y | W) = ∏_{i=1}^{n} N(y_{i,:} | 0, C), C = W W^⊤ + σ² I, log p(Y | W) = -(n/2) log|C| - (1/2) tr(C^{-1} Y^⊤ Y) + const. If U_q are the first q principal eigenvectors of n^{-1} Y^⊤ Y and the corresponding eigenvalues are Λ_q, then W = U_q L R, L = (Λ_q - σ² I)^{1/2}, where R is an arbitrary rotation matrix.

56 Equivalence of Formulations The Eigenvalue Problems are equivalent. Solution for Probabilistic PCA (solves for the mapping): Y^⊤ Y U_q = U_q Λ_q, W = U_q L R. Solution for Dual Probabilistic PCA (solves for the latent positions): Y Y^⊤ U′_q = U′_q Λ_q, X = U′_q L R. Equivalence is from U_q = Y^⊤ U′_q Λ_q^{-1/2}.
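
The sketch below verifies this equivalence numerically on a random matrix (numpy; purely illustrative): the leading eigenvalues of Y^⊤Y and YY^⊤ agree, and U_q = Y^⊤ U′_q Λ_q^{-1/2} recovers the primal eigenvectors up to sign.

```python
import numpy as np

rng = np.random.default_rng(4)
n, p, q = 20, 8, 3
Y = rng.normal(size=(n, p))
Y -= Y.mean(axis=0)

# Primal problem: eigenvectors of Y^T Y (p x p).
lam_p, U = np.linalg.eigh(Y.T @ Y)
lam_p, U = lam_p[::-1][:q], U[:, ::-1][:, :q]

# Dual problem: eigenvectors of Y Y^T (n x n).
lam_d, Uprime = np.linalg.eigh(Y @ Y.T)
lam_d, Uprime = lam_d[::-1][:q], Uprime[:, ::-1][:, :q]

# The leading eigenvalues agree...
assert np.allclose(lam_p, lam_d)
# ...and U_q = Y^T U'_q Lambda_q^{-1/2} gives the primal eigenvectors (up to sign).
U_from_dual = Y.T @ Uprime / np.sqrt(lam_d)
assert np.allclose(np.abs(U.T @ U_from_dual), np.eye(q), atol=1e-8)
print("equivalence verified")
```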

57 Non-Linear Latent Variable Model Dual Probabilistic PCA Define linear-Gaussian relationship between latent variables and data. Novel latent variable approach: Define Gaussian prior over parameters, W. Integrate out parameters. p(Y | X, W) = ∏_{i=1}^{n} N(y_{i,:} | W x_{i,:}, σ² I), p(W) = ∏_{i=1}^{p} N(w_{i,:} | 0, I), p(Y | X) = ∏_{j=1}^{p} N(y_{:,j} | 0, X X^⊤ + σ² I)

58 Non-Linear Latent Variable Model Dual Probabilistic PCA Inspection of the marginal likelihood shows... p(Y | X) = ∏_{j=1}^{p} N(y_{:,j} | 0, X X^⊤ + σ² I)

59 Non-Linear Latent Variable Model Dual Probabilistic PCA Inspection of the marginal likelihood shows... The covariance matrix is a covariance function. p(Y | X) = ∏_{j=1}^{p} N(y_{:,j} | 0, K), K = X X^⊤ + σ² I

60 Non-Linear Latent Variable Model Dual Probabilistic PCA Inspection of the marginal likelihood shows... The covariance matrix is a covariance function. We recognise it as the linear kernel. p(Y | X) = ∏_{j=1}^{p} N(y_{:,j} | 0, K), K = X X^⊤ + σ² I. This is a product of Gaussian processes with linear kernels.

61 Non-Linear Latent Variable Model Dual Probabilistic PCA Inspection of the marginal likelihood shows... The covariance matrix is a covariance function. We recognise it as the linear kernel. We call this the Gaussian Process Latent Variable model (GP-LVM). p(Y | X) = ∏_{j=1}^{p} N(y_{:,j} | 0, K), K = ? Replace the linear kernel with a non-linear kernel for a non-linear model.

62 Non-linear Latent Variable Models Exponentiated Quadratic (EQ) Covariance The EQ covariance has the form k_{i,j} = k(x_{i,:}, x_{j,:}), where k(x_{i,:}, x_{j,:}) = α exp(-‖x_{i,:} - x_{j,:}‖₂² / (2ℓ²)). No longer possible to optimise wrt X via an eigenvalue problem. Instead find gradients with respect to X, α, ℓ and σ² and optimise using conjugate gradients.
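
As a hedged illustration of what that optimisation works with (numpy; the function names are my own and this is not the toolbox implementation behind the demos), the sketch below codes the EQ covariance and the GP-LVM negative log marginal likelihood, -log p(Y|X), which a gradient-based optimiser would then minimise over X, α, ℓ and σ².

```python
import numpy as np

def eq_kernel(X, alpha, lengthscale):
    """Exponentiated quadratic: k_ij = alpha * exp(-||x_i - x_j||^2 / (2 l^2))."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return alpha * np.exp(-0.5 * sq_dist / lengthscale ** 2)

def gplvm_neg_log_lik(X, Y, alpha, lengthscale, sigma2):
    """-log p(Y|X) = (p/2) log|K| + (1/2) tr(K^{-1} Y Y^T) + const, K = K_eq + sigma^2 I."""
    n, p = Y.shape
    K = eq_kernel(X, alpha, lengthscale) + sigma2 * np.eye(n)
    L = np.linalg.cholesky(K)                    # Cholesky for a stable log-determinant
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    Kinv_Y = np.linalg.solve(K, Y)
    return 0.5 * p * log_det + 0.5 * np.sum(Y * Kinv_Y) + 0.5 * n * p * np.log(2 * np.pi)

# Toy usage: random latent initialisation for hypothetical data Y.
rng = np.random.default_rng(5)
Y = rng.normal(size=(30, 5))
X = rng.normal(size=(30, 2))
print(gplvm_neg_log_lik(X, Y, alpha=1.0, lengthscale=1.0, sigma2=0.1))
```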

63 Applications Style Based Inverse Kinematics Facilitating animation through modeling human motion (Grochow et al., 2004) Tracking Tracking using human motion models (Urtasun et al., 2005, 2006) Assisted Animation Generalizing drawings for animation (Baxter and Anjyo, 2006) Shape Models Inferring shape (e.g. pose from silhouette) (Ek et al., 2008b,a; Prisacariu and Reid, 2011a,b)

64 Stick Man Generalization with Less Data than Dimensions Powerful uncertainty handling of GPs leads to surprising properties. Non-linear models can be used where there are fewer data points than dimensions without overfitting. Example: Modelling a stick man in 102 dimensions with 55 data points!

65 Stick Man II demstick Figure: The latent space for the stick man motion capture data.

66 Selecting Data Dimensionality GP-LVM Provides probabilistic non-linear dimensionality reduction. How to select the dimensionality? Need to estimate marginal likelihood. In standard GP-LVM it increases with increasing q.

67 Integrate Mapping Function and Latent Variables Bayesian GP-LVM Start with a standard GP-LVM. p(Y | X) = ∏_{j=1}^{p} N(y_{:,j} | 0, K)

68 Integrate Mapping Function and Latent Variables Bayesian GP-LVM Start with a standard GP-LVM. Apply standard latent variable approach: Define Gaussian prior over latent space, X. p(Y | X) = ∏_{j=1}^{p} N(y_{:,j} | 0, K)

69 Integrate Mapping Function and Latent Variables Bayesian GP-LVM Start with a standard GP-LVM. Apply standard latent variable approach: Define Gaussian prior over latent space, X. Integrate out latent variables. p(Y | X) = ∏_{j=1}^{p} N(y_{:,j} | 0, K), p(X) = ∏_{j=1}^{q} N(x_{:,j} | 0, α_j² I)

70 Integrate Mapping Function and Latent Variables Bayesian GP-LVM Start with a standard GP-LVM. Apply standard latent variable approach: Define Gaussian prior over latent space, X. Integrate out latent variables. Unfortunately the integration is intractable. p(Y | X) = ∏_{j=1}^{p} N(y_{:,j} | 0, K), p(X) = ∏_{j=1}^{q} N(x_{:,j} | 0, α_j² I), p(Y | α) = ??

71 Standard Variational Approach Fails Standard variational bound has the form: L = ⟨log p(y | X)⟩_{q(X)} - KL(q(X) ‖ p(X))

72 Standard Variational Approach Fails Standard variational bound has the form: L = ⟨log p(y | X)⟩_{q(X)} - KL(q(X) ‖ p(X)) Requires expectation of log p(y | X) under q(X). log p(y | X) = -(1/2) y^⊤ (K_{f,f} + σ² I)^{-1} y - (1/2) log|K_{f,f} + σ² I| - (n/2) log 2π

73 Standard Variational Approach Fails Standard variational bound has the form: L = ⟨log p(y | X)⟩_{q(X)} - KL(q(X) ‖ p(X)) Requires expectation of log p(y | X) under q(X). log p(y | X) = -(1/2) y^⊤ (K_{f,f} + σ² I)^{-1} y - (1/2) log|K_{f,f} + σ² I| - (n/2) log 2π Extremely difficult to compute because K_{f,f} is dependent on X and appears in the inverse.

74 Variational Bayesian GP-LVM Consider collapsed variational bound, p(y) ≥ ∏_{i=1}^{n} c_i ∫ N(y | ⟨f⟩, σ² I) p(u) du

75 Variational Bayesian GP-LVM Consider collapsed variational bound, p(y | X) ≥ ∏_{i=1}^{n} c_i ∫ N(y | ⟨f⟩_{p(f|u,X)}, σ² I) p(u) du

76 Variational Bayesian GP-LVM Consider collapsed variational bound, ∫ p(y | X) p(X) dX ≥ ∫ [ ∫ ∏_{i=1}^{n} c_i N(y | ⟨f⟩_{p(f|u,X)}, σ² I) p(X) dX ] p(u) du

77 Variational Bayesian GP-LVM Consider collapsed variational bound, ∫ p(y | X) p(X) dX ≥ ∫ [ ∫ ∏_{i=1}^{n} c_i N(y | ⟨f⟩_{p(f|u,X)}, σ² I) p(X) dX ] p(u) du Apply variational lower bound to the inner integral.

78 Variational Bayesian GP-LVM Consider collapsed variational bound, ∫ p(y | X) p(X) dX ≥ ∫ [ ∫ ∏_{i=1}^{n} c_i N(y | ⟨f⟩_{p(f|u,X)}, σ² I) p(X) dX ] p(u) du Apply variational lower bound to the inner integral: log ∫ ∏_{i=1}^{n} c_i N(y | ⟨f⟩_{p(f|u,X)}, σ² I) p(X) dX ≥ Σ_{i=1}^{n} ⟨log c_i⟩_{q(X)} + ⟨log N(y | ⟨f⟩_{p(f|u,X)}, σ² I)⟩_{q(X)} - KL(q(X) ‖ p(X))

79 Variational Bayesian GP-LVM Consider collapsed variational bound, ∫ p(y | X) p(X) dX ≥ ∫ [ ∫ ∏_{i=1}^{n} c_i N(y | ⟨f⟩_{p(f|u,X)}, σ² I) p(X) dX ] p(u) du Apply variational lower bound to the inner integral: log ∫ ∏_{i=1}^{n} c_i N(y | ⟨f⟩_{p(f|u,X)}, σ² I) p(X) dX ≥ Σ_{i=1}^{n} ⟨log c_i⟩_{q(X)} + ⟨log N(y | ⟨f⟩_{p(f|u,X)}, σ² I)⟩_{q(X)} - KL(q(X) ‖ p(X)) Which is analytically tractable for Gaussian q(X) and some covariance functions.

80 Required Expectations Need expectations under q(X) of log c_i = -(1/(2σ²)) [ k_{i,i} - k_{i,u} K_{u,u}^{-1} k_{i,u}^⊤ ] and log N(y | ⟨f⟩_{p(f|u,X)}, σ² I) = -(1/2) log 2πσ² - (1/(2σ²)) (y_i - k_{i,u} K_{u,u}^{-1} u)². This requires the expectations ⟨K_{f,u}⟩_{q(X)} and ⟨K_{f,u} K_{u,u}^{-1} K_{u,f}⟩_{q(X)}, which can be computed analytically for some covariance functions.
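
For the EQ covariance these expectations have closed forms; the sketch below is only meant to show what is being averaged, estimating Ψ₁ = ⟨K_{f,u}⟩_{q(X)} and Ψ₂ = ⟨K_{u,f} K_{f,u}⟩_{q(X)} by Monte Carlo under a factorised Gaussian q(X) (numpy; all names are my own, and this is not the analytic computation used in practice). Since K_{u,u} does not depend on X, K_{u,u}^{-1} can be applied to Ψ₂ afterwards.

```python
import numpy as np

def eq_cross_cov(X, Z, alpha=1.0, lengthscale=1.0):
    """EQ cross-covariance K_{f,u}: k(x_i, z_m) = alpha exp(-||x_i - z_m||^2 / (2 l^2))."""
    sq = np.sum((X[:, None, :] - Z[None, :, :]) ** 2, axis=-1)
    return alpha * np.exp(-0.5 * sq / lengthscale ** 2)

rng = np.random.default_rng(6)
n, q, m = 4, 2, 3
Z = rng.normal(size=(m, q))            # inducing inputs for u
mu = rng.normal(size=(n, q))           # q(X) = prod_i N(x_i | mu_i, diag(s2_i))
s2 = 0.1 * np.ones((n, q))

# Monte Carlo estimates of Psi1 = <K_{f,u}>_{q(X)} and Psi2 = <K_{u,f} K_{f,u}>_{q(X)}.
S = 20_000
Psi1 = np.zeros((n, m))
Psi2 = np.zeros((m, m))
for _ in range(S):
    Xs = mu + np.sqrt(s2) * rng.normal(size=(n, q))   # a sample from q(X)
    Kfu = eq_cross_cov(Xs, Z)
    Psi1 += Kfu / S
    Psi2 += Kfu.T @ Kfu / S
print(Psi1)
print(Psi2)
```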

81 Priors for Latent Space Titsias and Lawrence (2010) Variational marginalization of X allows us to learn parameters of p(X). In the standard GP-LVM, where X is learnt by MAP, this is not possible (see e.g. Wang et al., 2008). First example: learn the dimensionality of the latent space.

82 Graphical Representations of GP-LVM X latent space Y data space

83 Graphical Representations of GP-LVM x 1 x 2 x 3 x 4 x 5 x 6 latent space y 1 y 2 y 3 y 4 y 5 y 6 y 7 y 8 data space

84 Graphical Representations of GP-LVM x 1 x 2 x 3 x 4 x 5 x 6 latent space y 1 data space

85 Graphical Representations of GP-LVM x 1 x 2 x 3 x 4 x 5 x 6 latent space σ 2 y w data space

86 Graphical Representations of GP-LVM x_1 x_2 x_3 x_4 x_5 x_6 latent space σ² y w α data space w ∼ N(0, αI), x ∼ N(0, I), y ∼ N(x^⊤ w, σ²)

87 Graphical Representations of GP-LVM α x_1 x_2 x_3 x_4 x_5 x_6 latent space σ² y w data space w ∼ N(0, I), x ∼ N(0, αI), y ∼ N(x^⊤ w, σ²)

88 Graphical Representations of GP-LVM α_1 α_2 α_3 α_4 α_5 α_6 x_1 x_2 x_3 x_4 x_5 x_6 latent space σ² y w data space w ∼ N(0, I), x_i ∼ N(0, α_i), y ∼ N(x^⊤ w, σ²)

89 Graphical Representations of GP-LVM α_1 α_2 α_3 α_4 α_5 α_6 w_1 w_2 w_3 w_4 w_5 w_6 latent space σ² y x data space w_i ∼ N(0, α_i), x ∼ N(0, I), y ∼ N(x^⊤ w, σ²)

90 Non-linear f(x) In the linear case the equivalence holds because f(x) = w^⊤ x with p(w_i) = N(w_i | 0, α_i). In the non-linear case, we need to scale the columns of X in the prior for f(x). This implies scaling the columns of X in the covariance function, k(x_{i,:}, x_{j,:}) = exp(-(1/2) (x_{i,:} - x_{j,:})^⊤ A (x_{i,:} - x_{j,:})), where A is diagonal with elements α_i². Now keep the prior spherical, p(X) = ∏_{j=1}^{q} N(x_{:,j} | 0, I). Covariance functions of this type are known as ARD (see e.g. Neal, 1996; MacKay, 2003; Rasmussen and Williams, 2006).
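
A direct transcription of this ARD covariance into code (numpy; an illustrative sketch rather than any particular toolbox's implementation). With one of the weights near zero, the corresponding latent dimension has almost no effect on the covariance, which is what allows a dimension to be switched off.

```python
import numpy as np

def ard_eq_kernel(X, alphas):
    """ARD exponentiated quadratic:
    k(x_i, x_j) = exp(-0.5 * (x_i - x_j)^T A (x_i - x_j)), with A = diag(alphas^2)."""
    diff = X[:, None, :] - X[None, :, :]            # (n, n, q) pairwise differences
    weighted = np.sum(diff ** 2 * alphas ** 2, axis=-1)
    return np.exp(-0.5 * weighted)

# Toy usage: with alphas[1] ~ 0 the second latent dimension barely affects K.
rng = np.random.default_rng(7)
X = rng.normal(size=(5, 2))
print(ard_eq_kernel(X, alphas=np.array([1.0, 1e-3])))
```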

91 Other Priors on X Dynamical prior gives us Gaussian process dynamical system (Wang et al., 2006; Damianou et al., 2011) Structured learning prior gives us (soft) manifold sharing (Shon et al., 2006; Navaratnam et al., 2007; Ek et al., 2008b,a; Damianou et al., 2012) Gaussian process prior gives us Deep Gaussian Processes (Lawrence and Moore, 2007; Damianou and Lawrence, 2013)

92 References I
W. V. Baxter and K.-I. Anjyo. Latent doodle space. In EUROGRAPHICS, volume 25, Vienna, Austria, September 2006.
A. Damianou, C. H. Ek, M. K. Titsias, and N. D. Lawrence. Manifold relevance determination. In J. Langford and J. Pineau, editors, Proceedings of the International Conference in Machine Learning, volume 29, San Francisco, CA, 2012. Morgan Kauffman.
A. Damianou and N. D. Lawrence. Deep Gaussian processes. In C. Carvalho and P. Ravikumar, editors, Proceedings of the Sixteenth International Workshop on Artificial Intelligence and Statistics, volume 31, AZ, USA, 2013. JMLR W&CP 31.
A. Damianou, M. K. Titsias, and N. D. Lawrence. Variational Gaussian process dynamical systems. In P. Bartlett, F. Peirrera, C. K. I. Williams, and J. Lafferty, editors, Advances in Neural Information Processing Systems, volume 24, Cambridge, MA, 2011. MIT Press.
C. H. Ek, J. Rihan, P. H. S. Torr, G. Rogez, and N. D. Lawrence. Ambiguity modeling in latent spaces. In A. Popescu-Belis and R. Stiefelhagen, editors, Machine Learning for Multimodal Interaction (MLMI 2008), LNCS. Springer-Verlag, June 2008a.
C. H. Ek, P. H. S. Torr, and N. D. Lawrence. Gaussian process latent variable models for human pose estimation. In A. Popescu-Belis, S. Renals, and H. Bourlard, editors, Machine Learning for Multimodal Interaction (MLMI 2007), volume 4892 of LNCS, Brno, Czech Republic, 2008b. Springer-Verlag.
K. Grochow, S. L. Martin, A. Hertzmann, and Z. Popovic. Style-based inverse kinematics. In ACM Transactions on Graphics (SIGGRAPH 2004), 2004.
N. D. Lawrence. Gaussian process models for visualisation of high dimensional data. In S. Thrun, L. Saul, and B. Schölkopf, editors, Advances in Neural Information Processing Systems, volume 16, Cambridge, MA, 2004. MIT Press.
N. D. Lawrence. Probabilistic non-linear principal component analysis with Gaussian process latent variable models. Journal of Machine Learning Research, 6:1783-1816, 2005.
N. D. Lawrence and A. J. Moore. Hierarchical Gaussian process latent variable models. In Z. Ghahramani, editor, Proceedings of the International Conference in Machine Learning, volume 24. Omnipress, 2007.

93 References II
D. J. C. MacKay. Information Theory, Inference and Learning Algorithms. Cambridge University Press, Cambridge, U.K., 2003.
R. Navaratnam, A. Fitzgibbon, and R. Cipolla. The joint manifold model for semi-supervised multi-valued regression. In IEEE International Conference on Computer Vision (ICCV). IEEE Computer Society Press, 2007.
R. M. Neal. Bayesian Learning for Neural Networks. Springer, 1996. Lecture Notes in Statistics 118.
V. Prisacariu and I. D. Reid. Nonlinear shape manifolds as shape priors in level set segmentation and tracking. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2011a.
V. Prisacariu and I. D. Reid. Shared shape spaces. In IEEE International Conference on Computer Vision (ICCV), 2011b.
C. E. Rasmussen and C. K. I. Williams. Gaussian Processes for Machine Learning. MIT Press, Cambridge, MA, 2006.
A. P. Shon, K. Grochow, A. Hertzmann, and R. P. N. Rao. Learning shared latent structure for image synthesis and robotic imitation. In Weiss et al. (2006).
M. E. Tipping and C. M. Bishop. Probabilistic principal component analysis. Journal of the Royal Statistical Society, B, 61(3), 1999.
M. K. Titsias and N. D. Lawrence. Bayesian Gaussian process latent variable model. In Y. W. Teh and D. M. Titterington, editors, Proceedings of the Thirteenth International Workshop on Artificial Intelligence and Statistics, volume 9, Chia Laguna Resort, Sardinia, Italy, May 2010. JMLR W&CP 9.
R. Urtasun, D. J. Fleet, and P. Fua. 3D people tracking with Gaussian process dynamical models. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, New York, U.S.A., June 2006. IEEE Computer Society Press.
R. Urtasun, D. J. Fleet, A. Hertzmann, and P. Fua. Priors for people tracking from small training sets. In IEEE International Conference on Computer Vision (ICCV), Beijing, China, October 2005. IEEE Computer Society Press.
J. M. Wang, D. J. Fleet, and A. Hertzmann. Gaussian process dynamical models. In Weiss et al. (2006).
J. M. Wang, D. J. Fleet, and A. Hertzmann. Gaussian process dynamical models for human motion. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(2), 2008.
Y. Weiss, B. Schölkopf, and J. C. Platt, editors. Advances in Neural Information Processing Systems, volume 18, Cambridge, MA, 2006. MIT Press.
