Monitoring Wafer Geometric Quality using Additive Gaussian Process


1 Monitoring Wafer Geometric Quality using Additive Gaussian Process
Linmiao Zhang (1), Kaibo Wang (2), Nan Chen (1)
(1) Department of Industrial and Systems Engineering, National University of Singapore
(2) Department of Industrial Engineering, Tsinghua University
May 23, 2013

2 Outline
1. Introduction
2. Statistical Quantification using AGP Model
3. Statistical Monitoring of Geometric Quality
4. Case Studies
5. Conclusion and Future Directions

3 Motivation: Integrated Circuits

4 Motivation: Semiconductor Fabrication Process
Ingot -> Slicing -> Lapping -> Polishing -> Cleaning -> Wafer Inspection
- Reject: Disposal
- Accept: Front End -> Back End -> Chips

5 Motivation: Challenges
- Transistor size: 32nm -> 28nm -> 22nm -> 16nm -> 14nm
- Wafer size: 130mm -> 150mm -> 200mm -> 300mm -> 450mm

6 Motivation: Wafer Preparation Process
- IC companies: higher integration requires good wafer quality
- Wafer fabs: larger diameter causes bad wafer quality

7 Motivation: Wafer's Geometric Quality
- Contact method: touching probes
- Non-contact method: wavelength scanning interferometer
[Figure: measurements over the wafer surface, with a scale from Thinner to Thicker]
Engineers' problem: how to check whether the surface is desirable?

8 Motivation: Testing Problem

9 Problem formulation: Framework
Surface as the response variable, across three stages: Modeling, Monitoring, Process Control.
Topics shown in the framework diagram:
- Without covariate
- Regression with covariates
- Design optimization
- Change detection
- Design optimization
- Run-to-Run control
- Fault diagnostics

10 Problem formulation: Difficulties
- Complete measurement of the wafer is slow
- The geometric profile is too complex to be modeled by parametric functions
- Measurements on different surfaces might not be aligned well
- Deviations (errors) are spatially correlated

11 Problem formulation: State of the art
- One-sample models: Gaussian process (Jin, Chang, and Shi 2012); PDE-constrained Gaussian process (Zhao, Jin, Wu, and Shi 2011). These are only applicable to a single surface.
- Primitive testing: summary indicators of the whole profile, such as Total Thickness Variation (TTV), Bow, Warp, and Site TIR (Doering and Nishi 2007).
- Need to fill in the gap.

12 Review of GP: Gaussian Process
$Y(x) = \mu + Z(x)$, with PD covariance function $k(x_i, x_j)$
- Suitable for spatially correlated data (Cressie 1993)
- Able to approximate complex functions (Sacks et al. 1989)
- Able to evaluate prediction error (Santner et al. 2003)
[Figure: prediction, samples, and true function, with the MSE of prediction]

13 Review of GP: Gaussian Process with Errors
Errors are present in physical processes or stochastic simulations:
$Y(x) = \mu + Z(x) + \epsilon(x)$
- $\epsilon(x_i)$ are i.i.d. normally distributed: covariance $\Sigma + \sigma^2 I$
- $\epsilon(x_i)$ are independently and normally distributed, but $\mathrm{var}(\epsilon(x_i)) = \sigma^2(x_i)$: covariance $\Sigma + \Lambda$ (Ankenman et al. 2010)
- $\epsilon(x_i)$ are correlated: then what?
[Figure: cycle time estimation example. G/G/1 quantile regression curves (50th and 85th cycle time quantiles versus throughput) with empirical quantile estimates]

14 AGP Model: Data Characteristics
[Figure: profile value versus location, showing the standard profile $f(x)$, two observed profiles $f(x) + \epsilon_1(x)$ and $f(x) + \epsilon_2(x)$, and sampled points $(x_{ij}, y_{ij})$ on each profile]

15 AGP Model
$Y_i(x) = f(x) + \epsilon_i(x)$, where $f(x)$ is the standard surface and $\epsilon_i(x)$ is the deviation surface.
Assumptions:
- $f(x)$ is a realization of $GP(\mu, s(\cdot))$
- $\epsilon_i(x)$ is a realization of $GP(0, v(\cdot))$
- $f(x)$ and $\epsilon_i(x)$ are independent
- $\epsilon_i(x)$ and $\epsilon_j(x)$ are independent for $i \neq j$
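The AGP assumptions can be illustrated by simulating a few profiles. This is a minimal sketch, not the authors' code: the slides do not state the form of $s(\cdot)$ and $v(\cdot)$, so a Gaussian (squared-exponential) correlation is assumed here, and the names `sq_exp_corr` and `simulate_agp` are hypothetical.

```python
import numpy as np

def sq_exp_corr(x1, x2, theta):
    """Assumed correlation exp(-theta * (x - x')^2); the slides do not fix this form."""
    d = np.subtract.outer(np.asarray(x1, float), np.asarray(x2, float))
    return np.exp(-theta * d ** 2)

def simulate_agp(x, n_profiles, mu, sigma1_sq, theta1, sigma2_sq, theta2, seed=0):
    """Draw one standard surface f and n_profiles observed surfaces Y_i = f + eps_i."""
    rng = np.random.default_rng(seed)
    n = len(x)
    S = sigma1_sq * sq_exp_corr(x, x, theta1)   # covariance of f
    V = sigma2_sq * sq_exp_corr(x, x, theta2)   # covariance of each eps_i
    # f ~ GP(mu, s); drawn once and shared by every profile.
    f = rng.multivariate_normal(np.full(n, mu), S, check_valid="ignore")
    # eps_i ~ GP(0, v); drawn independently for each profile.
    eps = rng.multivariate_normal(np.zeros(n), V, size=n_profiles, check_valid="ignore")
    return f, f + eps

x = np.linspace(0.0, 1.0, 20)
f, Y = simulate_agp(x, n_profiles=5, mu=1.0, sigma1_sq=0.2, theta1=3.0,
                    sigma2_sq=0.05, theta2=10.0)
```

Each row of `Y` shares the same `f` but carries its own spatially correlated deviation, which is what distinguishes the AGP model from a GP with independent noise.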

16 AGP Model: Distributional view
A Gaussian process can be used as a prior probability distribution over functions in Bayesian inference (Rasmussen and Williams 2006).
[Figure: two realizations of a GP, generated value versus $x$]
- Linear model: $Y(x) = f(x) + \epsilon$, with $\epsilon$ i.i.d. from $F$
- AGP model: $Y(x) = f(x) + \epsilon(x)$, with $\epsilon(\cdot)$ i.i.d. from $GP(0, v(\cdot))$

17 AGP Model: Model Estimation
Estimate the model parameters $\beta \equiv [\mu, \sigma_1^2, \theta_1, \sigma_2^2, \theta_2]$ from observations.
[Figure: profile value versus location, showing $f(x)$, $f(x) + \epsilon_1(x)$, $f(x) + \epsilon_2(x)$, and the sampled points on each profile]

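18 AGP Model: Structure of $\Sigma_0$
$$\mathrm{cov}(y_{ij}, y_{i'k}) = \begin{cases} s(x_{ij}, x_{i'k}) + v(x_{ij}, x_{i'k}), & i = i' \\ s(x_{ij}, x_{i'k}), & i \neq i' \end{cases} \qquad i, i' = 1, 2, \ldots, N_0$$
Stacking the in-control sites as $X_{IC} = (X_1, X_2, \ldots, X_{N_0})$, with $n_i$ sites on profile $i$ and $M_0$ sites in total:
$$\Sigma_0 = \underbrace{\big[ s(x_{ij}, x_{i'k} \mid \theta_1) \big]}_{M_0 \times M_0} + \mathrm{diag}\Big( \underbrace{\big[ v(x_{1j}, x_{1k} \mid \theta_2) \big]}_{n_1 \times n_1}, \ldots, \underbrace{\big[ v(x_{N_0 j}, x_{N_0 k} \mid \theta_2) \big]}_{n_{N_0} \times n_{N_0}} \Big)$$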
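The block structure of $\Sigma_0$ can be sketched as follows: a dense matrix from $s(\cdot)$ over all sites, plus a block-diagonal matrix from $v(\cdot)$ that is nonzero only within a profile. The Gaussian correlation and the names `sq_exp_corr` and `build_sigma0` are assumptions made for illustration.

```python
import numpy as np

def sq_exp_corr(x1, x2, theta):
    """Assumed correlation exp(-theta * (x - x')^2)."""
    d = np.subtract.outer(np.asarray(x1, float), np.asarray(x2, float))
    return np.exp(-theta * d ** 2)

def build_sigma0(x_list, sigma1_sq, theta1, sigma2_sq, theta2):
    """Return (Sigma0, S, V) for in-control profiles with sites x_list[i]."""
    x_all = np.concatenate([np.asarray(x, float) for x in x_list])
    profile_id = np.concatenate([np.full(len(x), i) for i, x in enumerate(x_list)])
    # S: correlation of the shared standard surface, dense over all M0 sites.
    S = sq_exp_corr(x_all, x_all, theta1)
    # V: correlation of the deviation surfaces, zeroed across different profiles.
    same_profile = profile_id[:, None] == profile_id[None, :]
    V = sq_exp_corr(x_all, x_all, theta2) * same_profile
    return sigma1_sq * S + sigma2_sq * V, S, V

x_list = [np.array([0.0, 0.5]), np.array([0.2, 0.7, 1.0])]
Sigma0, S, V = build_sigma0(x_list, sigma1_sq=0.2, theta1=3.0,
                            sigma2_sq=0.05, theta2=10.0)
```

Profiles may have different numbers of sites and different locations, so no alignment of measurements across wafers is required.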

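19 AGP Model: MLE
Given the data from all surface profiles $X_{IC}, Y_{IC}$, we can estimate $\beta$ as
$$\hat\beta = \arg\max_\beta \left\{ -\tfrac{1}{2} \log\left[\det(\sigma_1^2 S + \sigma_2^2 V)\right] - \tfrac{1}{2} (Y_{IC} - \mu 1_{M_0})^T (\sigma_1^2 S + \sigma_2^2 V)^{-1} (Y_{IC} - \mu 1_{M_0}) \right\}.$$
Maximizing the profile likelihood: given $\theta_1, \theta_2$, the correlation matrices $S, V$ are fixed. Then $\mu, \sigma_1^2, \sigma_2^2$ can be obtained easily:
$$\hat\mu = \frac{1_{M_0}^T (S + \rho V)^{-1} Y_{IC}}{1_{M_0}^T (S + \rho V)^{-1} 1_{M_0}}, \qquad \rho = \sigma_2^2 / \sigma_1^2$$
$$\hat\sigma_1^2 = \frac{(Y_{IC} - \hat\mu 1_{M_0})^T (S + \rho V)^{-1} (Y_{IC} - \hat\mu 1_{M_0})}{M_0}$$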
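The closed-form step of the profile likelihood can be sketched as below, taking the correlation matrices $S$, $V$ and the ratio $\rho$ as given. The outer search over $\theta_1$, $\theta_2$, and $\rho$ is left to a numerical optimizer of the reader's choice; the function names are hypothetical.

```python
import numpy as np

def profile_estimates(Y, S, V, rho):
    """Closed-form mu and sigma1^2 for fixed S, V and rho = sigma2^2 / sigma1^2."""
    Y = np.asarray(Y, float)
    C = S + rho * V
    ones = np.ones(len(Y))
    mu = (ones @ np.linalg.solve(C, Y)) / (ones @ np.linalg.solve(C, ones))
    r = Y - mu
    sigma1_sq = (r @ np.linalg.solve(C, r)) / len(Y)
    return mu, sigma1_sq

def neg_profile_loglik(Y, S, V, rho):
    """Negative log-likelihood (additive constants dropped) at the profiled mu, sigma1^2."""
    M = len(Y)
    _, sigma1_sq = profile_estimates(Y, S, V, rho)
    _, logdet = np.linalg.slogdet(S + rho * V)
    # 0.5 * log det(sigma1^2 * C) + 0.5 * r^T (sigma1^2 * C)^{-1} r, where the
    # quadratic term equals M once sigma1^2 is replaced by its estimate.
    return 0.5 * (M * np.log(sigma1_sq) + logdet + M)

Y = np.array([1.0, 2.0, 3.0])
mu_hat, s1_hat = profile_estimates(Y, np.eye(3), np.zeros((3, 3)), rho=0.25)
```

With $S = I$ and $V = 0$ the estimates reduce to the sample mean and the (biased) sample variance, which is a convenient sanity check.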

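20 AGP Model: Prediction
For a new unmeasured site $(X_l, Y_l)$:
$$\begin{pmatrix} Y_l \\ Y_{IC} \end{pmatrix} \sim N\left[ \begin{pmatrix} \mu 1_{n_l} \\ \mu 1_{M_0} \end{pmatrix}, \begin{pmatrix} \Sigma_l & \Sigma_{l,0} \\ \Sigma_{l,0}^T & \Sigma_0 \end{pmatrix} \right]$$
$Y_l \mid Y_{IC} \sim N(\tilde\mu_l, \tilde\Sigma_l)$, where
$$\tilde\mu_l = \mu 1_{n_l} + \Sigma_{l,0} \Sigma_0^{-1} (Y_{IC} - \mu 1_{M_0})$$
$$\tilde\Sigma_l = \Sigma_l - \Sigma_{l,0} \Sigma_0^{-1} \Sigma_{l,0}^T$$
$\Sigma_{l,0}$ may have a different form depending on whether $Y_l$ are taken from existing profiles or new ones.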
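The conditional mean and covariance can be computed directly from the blocks of the joint covariance. The sketch below takes $\Sigma_0$, $\Sigma_l$, and $\Sigma_{l,0}$ as inputs (the caller decides how $\Sigma_{l,0}$ is formed); `agp_predict` is a hypothetical name.

```python
import numpy as np

def agp_predict(Y_ic, mu, Sigma0, Sigma_l, Sigma_l0):
    """Conditional mean and covariance of Y_l given the in-control data Y_ic."""
    resid = np.asarray(Y_ic, float) - mu
    # mu_tilde = mu * 1 + Sigma_l0 Sigma0^{-1} (Y_ic - mu * 1)
    mu_tilde = mu + Sigma_l0 @ np.linalg.solve(Sigma0, resid)
    # Sigma_tilde = Sigma_l - Sigma_l0 Sigma0^{-1} Sigma_l0^T
    Sigma_tilde = Sigma_l - Sigma_l0 @ np.linalg.solve(Sigma0, Sigma_l0.T)
    return mu_tilde, Sigma_tilde

# Toy check: predicting at a site that coincides with the first observation.
Sigma0 = np.array([[1.0, 0.5], [0.5, 1.0]])
mu_tilde, Sigma_tilde = agp_predict(
    Y_ic=np.array([1.0, 2.0]), mu=0.0, Sigma0=Sigma0,
    Sigma_l=np.array([[1.0]]), Sigma_l0=np.array([[1.0, 0.5]]),
)
```

Using `np.linalg.solve` rather than forming $\Sigma_0^{-1}$ explicitly is a numerical choice, not something the slides prescribe.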

21 AGP Model: Prediction Demonstration
[Figure: predicted mean with the samples and the standard profile; predicted variance]

22 $T^2$ Test: Statistical Testing
[Figure: profile value versus location]
- Does the new profile deviate from $f(x)$ within an acceptable region?
- Statistical testing is based on the samples (where to sample?)

23 $T^2$ Test
If the new surface conforms with the model, $Y_l \sim N(\tilde\mu_l, \tilde\Sigma_l)$. This reduces surface comparison to multivariate normal data comparison:
$$H_0: Y_l \sim N(\tilde\mu_l, \tilde\Sigma_l) \qquad H_1: Y_l \nsim N(\tilde\mu_l, \tilde\Sigma_l)$$
Test statistic:
$$T_l^2 = (Y_l - \tilde\mu_l)^T \tilde\Sigma_l^{-1} (Y_l - \tilde\mu_l)$$
Under $H_0$, $T_l^2 \sim \chi^2_{n_l}$. Reject $H_0$ when $T_l^2 > H_T$.
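A minimal sketch of the $T^2$ decision rule, assuming the predictive mean and covariance are already available. The threshold $H_T$ is passed in by the caller; `t2_test` is a hypothetical name.

```python
import numpy as np

def t2_statistic(Y_l, mu_tilde, Sigma_tilde):
    """T^2 = (Y_l - mu_tilde)^T Sigma_tilde^{-1} (Y_l - mu_tilde)."""
    r = np.asarray(Y_l, float) - np.asarray(mu_tilde, float)
    return float(r @ np.linalg.solve(Sigma_tilde, r))

def t2_test(Y_l, mu_tilde, Sigma_tilde, threshold):
    """Return (T^2, reject), rejecting H0 when T^2 exceeds the threshold H_T."""
    t2 = t2_statistic(Y_l, mu_tilde, Sigma_tilde)
    return t2, t2 > threshold

# Two sites, identity covariance; 5.991 is the 0.95 quantile of chi-square with 2 d.o.f.
t2, reject = t2_test([1.0, 2.0], [0.0, 0.0], np.eye(2), threshold=5.991)
```

One way to obtain $H_T$ for a chosen significance level is the upper quantile of $\chi^2_{n_l}$, for example `scipy.stats.chi2.ppf(1 - alpha, n_l)` if SciPy is available.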

24 Generalized likelihood ratio test: GLR Test
- Focus only on a certain class of alternative models
- Another deviation source is considered as the alternative model: $Y_l(x) = f(x) + \epsilon_l(x) + \xi(x)$, where $\xi(x)$ is a realization of another $GP(\delta, w(\cdot))$
- Suitable for modeling global change effects
Testing hypotheses:
$$H_0: Y_l(x) = f(x) + \epsilon_l(x) \qquad H_1: Y_l(x) = f(x) + \epsilon_l(x) + \xi(x)$$

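25 Generalized likelihood ratio test: GLR Test
With a finite number of observations, the testing hypotheses are
$$H_0: Y_l \sim N(\tilde\mu_l, \tilde\Sigma_l) \qquad H_1: Y_l \sim N(\tilde\mu_l + \delta 1_{n_l}, \tilde\Sigma_l + \Sigma_w) \ \text{for some nonzero } \delta, \gamma^2, \theta_l$$
GLR statistic:
$$R_l = 2 \ln \frac{\sup_{\delta, \gamma^2, \theta_l} \det(\tilde\Sigma_l + \Sigma_w)^{-1/2} \exp\left[ -(Y_l - \tilde\mu_l - \delta 1_{n_l})^T (\tilde\Sigma_l + \Sigma_w)^{-1} (Y_l - \tilde\mu_l - \delta 1_{n_l}) / 2 \right]}{\det(\tilde\Sigma_l)^{-1/2} \exp\left[ -(Y_l - \tilde\mu_l)^T \tilde\Sigma_l^{-1} (Y_l - \tilde\mu_l) / 2 \right]}$$
Under $H_0$, $R_l$ asymptotically follows an equal mixture of two $\chi^2$ distributions. Reject $H_0$ when $R_l > H_R$.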
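The supremum in $R_l$ has no closed form in general. The sketch below approximates it with a coarse grid over $\gamma^2$ and $\theta_l$, using the generalized least squares estimate of $\delta$ at each grid point. It assumes $\Sigma_w = \gamma^2 W(\theta_l)$ with a Gaussian correlation $W$, which the slides do not specify; the grid search stands in for whatever optimizer the authors used, and all function names are hypothetical.

```python
import numpy as np

def sq_exp_corr(x1, x2, theta):
    """Assumed correlation exp(-theta * (x - x')^2) for w(.)."""
    d = np.subtract.outer(np.asarray(x1, float), np.asarray(x2, float))
    return np.exp(-theta * d ** 2)

def gauss_loglik(r, Sigma):
    """Gaussian log-likelihood of residual r, up to an additive constant."""
    _, logdet = np.linalg.slogdet(Sigma)
    return -0.5 * (logdet + r @ np.linalg.solve(Sigma, r))

def glr_statistic(x_l, Y_l, mu_tilde, Sigma_tilde, gamma_sq_grid, theta_grid):
    """Grid approximation of R_l = 2 * (sup log L under H1 - log L under H0)."""
    r0 = np.asarray(Y_l, float) - np.asarray(mu_tilde, float)
    ones = np.ones(len(r0))
    ll0 = gauss_loglik(r0, Sigma_tilde)
    best = ll0  # delta = 0, gamma^2 = 0 lies on the boundary of H1
    for gamma_sq in gamma_sq_grid:
        for theta in theta_grid:
            Sigma1 = Sigma_tilde + gamma_sq * sq_exp_corr(x_l, x_l, theta)
            w = np.linalg.solve(Sigma1, ones)
            delta = (w @ r0) / (w @ ones)  # GLS estimate of the mean shift
            best = max(best, gauss_loglik(r0 - delta, Sigma1))
    return 2.0 * (best - ll0)

x_l = np.linspace(0.0, 1.0, 5)
R_shift = glr_statistic(x_l, 2.0 * np.ones(5), np.zeros(5), 0.1 * np.eye(5),
                        gamma_sq_grid=[0.0, 0.05], theta_grid=[1.0, 10.0])
```

A finer grid, or a continuous optimizer started from the best grid point, would tighten the approximation at additional cost.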

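26 Generalized likelihood ratio test: Summary
1. Collect $N_0$ in-control (IC) units, with $n_i$ measurements on unit $i$, and fit the AGP model.
2. For a new unit measured at $X_l$ with responses $Y_l$, obtain $(\tilde\mu_l, \tilde\Sigma_l)$ from the AGP model.
3. Apply the $T^2$ test and the GLR test.
4. Accept: continue. Reject: disposal.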

27 Approximation and Estimation Performance: Approximation Performance
- Standard profile (Shpak 1995): $f(x) = \sin(x) + \sin(10x/3) + \log(x) - 0.84x + 3$
- Spatially correlated error: $\epsilon(x) \sim GP(0, 0.05\, v(\cdot \mid 5))$
- OGP model: $Y_i(x) = \mu + \epsilon_i(x)$
[Figure: predicted mean versus $x$ ($f(x)$, measurements, AGP, OGP) and predicted variance versus $x$ (AGP, OGP)]

28 Approximation and Estimation Performance: Bias and RMSE of MLE
Accuracy of the MLE with different sample sizes $(N_0, n_0)$.
True parameters: $\mu = 1$, $\sigma^2 = 0.2$, $\theta_1 = 3$, $\tau^2 = 0.05$, $\theta_2 = 10$
[Table: bias and RMSE of each parameter estimate for $(N_0, n_0)$ = (10,10), (10,20), (20,10), (20,20)]

29 Monitoring Performance: Three Change Scenarios
$Y(x) = f(x) + \epsilon(x)$, with a change in the mean ($\mu$), the variance ($\sigma_2^2$), or the correlation ($\theta_2$).
[Figure: three panels (Mean, Variance, Correlation), each comparing a standard and a shifted profile]

30 Monitoring Performance: Performance of Different Tests
Three tests to compare:
- Max-Min Test
- GLR Test
- $T^2$ Test
[Figure: beta error versus shift magnitude for MaxMin, GLR, and T2, in three panels (Mean, Variance, Correlation)]

31 Monitoring Performance: Effect of Testing Sample Size ($n_l$)

32 Monitoring Performance: Effect of In-Control Sample Size $(N_0, n_0)$
[Figure: beta error versus shift magnitude for the $T^2$ and GLR tests under mean, variance, and correlation shifts, with $(N_0, n_0)$ = (10,10), (20,10), (10,20), (20,20)]

33 Real Application: Monitoring Wafer Thickness Profile
- Data are collected from a real production plant
- 8 in-control wafers are used to construct the AGP model; 30 wafers are tested
- 120 measurements from each in-control wafer are used to construct the AGP model
- 480 measurements from each testing wafer are used to conduct the tests

34 Real Application: Demos of Thickness Profile
[Figure: in-control wafer #2, in-control wafer #7, and the approximated standard profile]

35 Real Application: p-values of the Tests
[Figure: p-values of the $T^2$ and GLR tests for each wafer surface, with the significance level marked]

36 Real Application: Rejected Wafers (p-values)
Rejected wafers: #12, #23, #24, #26, #28 (GLR: 0), #30
[Figure: one panel per rejected wafer, labeled with its $T^2$ and GLR p-values]

37 Open issues: Optimal Design for AGP
- Nonparametric model: the Fisher information matrix is not enough
- Ordinary space-filling design for deterministic experiments:
  - does not consider geometric features
  - does not consider the error process

38 Open issues: Optimality Criteria
- Prediction accuracy: minimize (integrated) RMSE
  - Determine $N_0$, $n_0$ and $x_{ij}$
  - Approximation accuracy of $f(x)$ and estimation of the error process ($\sigma_2^2$, $\theta_2$)
  - Sequential allocation strategy (Ankenman et al. 2010)
- Detection power: minimize $\beta$ error
  - $T^2$ test: when only $\mu$ changes, the Mahalanobis distance $\delta^T \tilde\Sigma_l^{-1} \delta$ determines the power, where $\tilde\Sigma_l = \Sigma_l - \Sigma_{l,0} \Sigma_0^{-1} \Sigma_{l,0}^T$
  - Constant mean shift: $\max_{X_l} 1^T \tilde\Sigma_l^{-1} 1$
  - D-optimal: $\max_{X_l} \det \tilde\Sigma_l^{-1} = \min_{X_l} \det \tilde\Sigma_l$

39 Open issues: GP with Covariates
The surface profile depends on other factors: speed, force, materials, etc.
- GP model
- GP with independent errors: Ankenman et al. (2010)
- GP with dependent errors
- Multivariate output/response
- Co-kriging
- Zhou et al. (2011); Qian et al. (2008)
- Different distance metrics
- Surface response

40 Open issues: Conclusion
- The AGP model is suitable for approximating surface profiles and quantifying dependent deviations
- It provides a simple and flexible framework for process monitoring
- Design issues need further consideration, and the model should be extended to the case with covariates

41 Reference I
- Ankenman, B., Nelson, B., and Staum, J. (2010), Stochastic kriging for simulation metamodeling, Operations Research, 58.
- Cressie, N. (1993), Statistics for Spatial Data, revised edition, vol. 928, Wiley, New York.
- Doering, R. and Nishi, Y. (2007), Handbook of Semiconductor Manufacturing Technology, CRC Press, Boca Raton, FL.
- Jin, R., Chang, C., and Shi, J. (2012), Sequential measurement strategy for wafer geometric profile estimation, IIE Transactions, 44.
- Qian, P. Z. G., Wu, H., and Wu, C. F. J. (2008), Gaussian process models for computer experiments with qualitative and quantitative factors, Technometrics, 50.
- Rasmussen, C. E. and Williams, C. K. I. (2006), Gaussian Processes for Machine Learning, MIT Press, Boston.
- Sacks, J., Welch, W., Mitchell, T., and Wynn, H. (1989), Design and analysis of computer experiments, Statistical Science, 4.
- Santner, T., Williams, B., and Notz, W. (2003), The Design and Analysis of Computer Experiments, Springer, New York.
- Shpak, A. (1995), Global optimization in one-dimensional case using analytically defined derivatives of objective function, Computer Science Journal of Moldova, 3.
- Zhao, H., Jin, R., Wu, S., and Shi, J. (2011), PDE-constrained Gaussian process model on material removal rate of wire saw slicing process, Journal of Manufacturing Science and Engineering, 133.
- Zhou, Q., Qian, P. Z. G., and Zhou, S. (2011), A simple approach to emulation for computer models with qualitative and quantitative factors, Technometrics, 53.

42 Thanks and questions

Monitoring Wafer Geometric Quality using Additive Gaussian Process Model

Monitoring Wafer Geometric Quality using Additive Gaussian Process Model Monitoring Wafer Geometric Quality using Additive Gaussian Process Model Linmiao Zhang, Kaibo Wang, and Nan Chen Department of Industrial and Systems Engineering, National University of Singapore Department

More information

arxiv: v1 [stat.me] 24 May 2010

arxiv: v1 [stat.me] 24 May 2010 The role of the nugget term in the Gaussian process method Andrey Pepelyshev arxiv:1005.4385v1 [stat.me] 24 May 2010 Abstract The maximum likelihood estimate of the correlation parameter of a Gaussian

More information

Mustafa H. Tongarlak Bruce E. Ankenman Barry L. Nelson

Mustafa H. Tongarlak Bruce E. Ankenman Barry L. Nelson Proceedings of the 0 Winter Simulation Conference S. Jain, R. R. Creasey, J. Himmelspach, K. P. White, and M. Fu, eds. RELATIVE ERROR STOCHASTIC KRIGING Mustafa H. Tongarlak Bruce E. Ankenman Barry L.

More information

Limit Kriging. Abstract

Limit Kriging. Abstract Limit Kriging V. Roshan Joseph School of Industrial and Systems Engineering Georgia Institute of Technology Atlanta, GA 30332-0205, USA roshan@isye.gatech.edu Abstract A new kriging predictor is proposed.

More information

Modeling Tunnel Profile in Presence of Coordinate Errors: A Gaussian Process Based Approach

Modeling Tunnel Profile in Presence of Coordinate Errors: A Gaussian Process Based Approach Modeling Tunnel Profile in Presence of Coordinate Errors: A Gaussian Process Based Approach Chen Zhang 1, Yong Lei 2, Linmiao Zhang 3, and Nan Chen 1 1 Department of Industrial Systems Engineering and

More information

e-companion ONLY AVAILABLE IN ELECTRONIC FORM Electronic Companion Stochastic Kriging for Simulation Metamodeling

e-companion ONLY AVAILABLE IN ELECTRONIC FORM Electronic Companion Stochastic Kriging for Simulation Metamodeling OPERATIONS RESEARCH doi 10.187/opre.1090.0754ec e-companion ONLY AVAILABLE IN ELECTRONIC FORM informs 009 INFORMS Electronic Companion Stochastic Kriging for Simulation Metamodeling by Bruce Ankenman,

More information

Statistical Surface Monitoring by Spatial-Structure Modeling

Statistical Surface Monitoring by Spatial-Structure Modeling Statistical Surface Monitoring by Spatial-Structure Modeling ANDI WANG Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong KAIBO WANG Tsinghua University, Beijing 100084,

More information

STAT 518 Intro Student Presentation

STAT 518 Intro Student Presentation STAT 518 Intro Student Presentation Wen Wei Loh April 11, 2013 Title of paper Radford M. Neal [1999] Bayesian Statistics, 6: 475-501, 1999 What the paper is about Regression and Classification Flexible

More information

Gaussian Processes for Computer Experiments

Gaussian Processes for Computer Experiments Gaussian Processes for Computer Experiments Jeremy Oakley School of Mathematics and Statistics, University of Sheffield www.jeremy-oakley.staff.shef.ac.uk 1 / 43 Computer models Computer model represented

More information

Introduction to emulators - the what, the when, the why

Introduction to emulators - the what, the when, the why School of Earth and Environment INSTITUTE FOR CLIMATE & ATMOSPHERIC SCIENCE Introduction to emulators - the what, the when, the why Dr Lindsay Lee 1 What is a simulator? A simulator is a computer code

More information

Optimal Designs for Gaussian Process Models via Spectral Decomposition. Ofir Harari

Optimal Designs for Gaussian Process Models via Spectral Decomposition. Ofir Harari Optimal Designs for Gaussian Process Models via Spectral Decomposition Ofir Harari Department of Statistics & Actuarial Sciences, Simon Fraser University September 2014 Dynamic Computer Experiments, 2014

More information

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout

More information

Gaussian Process Regression Model in Spatial Logistic Regression

Gaussian Process Regression Model in Spatial Logistic Regression Journal of Physics: Conference Series PAPER OPEN ACCESS Gaussian Process Regression Model in Spatial Logistic Regression To cite this article: A Sofro and A Oktaviarina 018 J. Phys.: Conf. Ser. 947 01005

More information

Kriging by Example: Regression of oceanographic data. Paris Perdikaris. Brown University, Division of Applied Mathematics

Kriging by Example: Regression of oceanographic data. Paris Perdikaris. Brown University, Division of Applied Mathematics Kriging by Example: Regression of oceanographic data Paris Perdikaris Brown University, Division of Applied Mathematics! January, 0 Sea Grant College Program Massachusetts Institute of Technology Cambridge,

More information

Kriging and Alternatives in Computer Experiments

Kriging and Alternatives in Computer Experiments Kriging and Alternatives in Computer Experiments C. F. Jeff Wu ISyE, Georgia Institute of Technology Use kriging to build meta models in computer experiments, a brief review Numerical problems with kriging

More information

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Gaussian Processes Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 01 Pictorial view of embedding distribution Transform the entire distribution to expected features Feature space Feature

More information

Covariance function estimation in Gaussian process regression

Covariance function estimation in Gaussian process regression Covariance function estimation in Gaussian process regression François Bachoc Department of Statistics and Operations Research, University of Vienna WU Research Seminar - May 2015 François Bachoc Gaussian

More information

Generative Models and Stochastic Algorithms for Population Average Estimation and Image Analysis

Generative Models and Stochastic Algorithms for Population Average Estimation and Image Analysis Generative Models and Stochastic Algorithms for Population Average Estimation and Image Analysis Stéphanie Allassonnière CIS, JHU July, 15th 28 Context : Computational Anatomy Context and motivations :

More information

Association studies and regression

Association studies and regression Association studies and regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Association studies and regression 1 / 104 Administration

More information

OPTIMAL DESIGN INPUTS FOR EXPERIMENTAL CHAPTER 17. Organization of chapter in ISSO. Background. Linear models

OPTIMAL DESIGN INPUTS FOR EXPERIMENTAL CHAPTER 17. Organization of chapter in ISSO. Background. Linear models CHAPTER 17 Slides for Introduction to Stochastic Search and Optimization (ISSO)by J. C. Spall OPTIMAL DESIGN FOR EXPERIMENTAL INPUTS Organization of chapter in ISSO Background Motivation Finite sample

More information

Modification and Improvement of Empirical Likelihood for Missing Response Problem

Modification and Improvement of Empirical Likelihood for Missing Response Problem UW Biostatistics Working Paper Series 12-30-2010 Modification and Improvement of Empirical Likelihood for Missing Response Problem Kwun Chuen Gary Chan University of Washington - Seattle Campus, kcgchan@u.washington.edu

More information

Monte Carlo Studies. The response in a Monte Carlo study is a random variable.

Monte Carlo Studies. The response in a Monte Carlo study is a random variable. Monte Carlo Studies The response in a Monte Carlo study is a random variable. The response in a Monte Carlo study has a variance that comes from the variance of the stochastic elements in the data-generating

More information

Lecture 5: GPs and Streaming regression

Lecture 5: GPs and Streaming regression Lecture 5: GPs and Streaming regression Gaussian Processes Information gain Confidence intervals COMP-652 and ECSE-608, Lecture 5 - September 19, 2017 1 Recall: Non-parametric regression Input space X

More information

Sensitivity analysis in linear and nonlinear models: A review. Introduction

Sensitivity analysis in linear and nonlinear models: A review. Introduction Sensitivity analysis in linear and nonlinear models: A review Caren Marzban Applied Physics Lab. and Department of Statistics Univ. of Washington, Seattle, WA, USA 98195 Consider: Introduction Question:

More information

Density Estimation: ML, MAP, Bayesian estimation

Density Estimation: ML, MAP, Bayesian estimation Density Estimation: ML, MAP, Bayesian estimation CE-725: Statistical Pattern Recognition Sharif University of Technology Spring 2013 Soleymani Outline Introduction Maximum-Likelihood Estimation Maximum

More information

Proceedings of the 2018 Winter Simulation Conference M. Rabe, A. A. Juan, N. Mustafee, A. Skoogh, S. Jain, and B. Johansson, eds.

Proceedings of the 2018 Winter Simulation Conference M. Rabe, A. A. Juan, N. Mustafee, A. Skoogh, S. Jain, and B. Johansson, eds. Proceedings of the 2018 Winter Simulation Conference M. Rabe, A. A. Juan, N. Mustafee, A. Skoogh, S. Jain, and B. Johansson, eds. METAMODEL-ASSISTED RISK ANALYSIS FOR STOCHASTIC SIMULATION WITH INPUT UNCERTAINTY

More information

Machine Learning for Data Science (CS4786) Lecture 12

Machine Learning for Data Science (CS4786) Lecture 12 Machine Learning for Data Science (CS4786) Lecture 12 Gaussian Mixture Models Course Webpage : http://www.cs.cornell.edu/courses/cs4786/2016fa/ Back to K-means Single link is sensitive to outliners We

More information

Density Estimation. Seungjin Choi

Density Estimation. Seungjin Choi Density Estimation Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr http://mlg.postech.ac.kr/

More information

A Process over all Stationary Covariance Kernels

A Process over all Stationary Covariance Kernels A Process over all Stationary Covariance Kernels Andrew Gordon Wilson June 9, 0 Abstract I define a process over all stationary covariance kernels. I show how one might be able to perform inference that

More information

Uncertainty quantification and calibration of computer models. February 5th, 2014

Uncertainty quantification and calibration of computer models. February 5th, 2014 Uncertainty quantification and calibration of computer models February 5th, 2014 Physical model Physical model is defined by a set of differential equations. Now, they are usually described by computer

More information

Reduction of Model Complexity and the Treatment of Discrete Inputs in Computer Model Emulation

Reduction of Model Complexity and the Treatment of Discrete Inputs in Computer Model Emulation Reduction of Model Complexity and the Treatment of Discrete Inputs in Computer Model Emulation Curtis B. Storlie a a Los Alamos National Laboratory E-mail:storlie@lanl.gov Outline Reduction of Emulator

More information

Machine Learning. Lecture 4: Regularization and Bayesian Statistics. Feng Li. https://funglee.github.io

Machine Learning. Lecture 4: Regularization and Bayesian Statistics. Feng Li. https://funglee.github.io Machine Learning Lecture 4: Regularization and Bayesian Statistics Feng Li fli@sdu.edu.cn https://funglee.github.io School of Computer Science and Technology Shandong University Fall 207 Overfitting Problem

More information

Use of Design Sensitivity Information in Response Surface and Kriging Metamodels

Use of Design Sensitivity Information in Response Surface and Kriging Metamodels Optimization and Engineering, 2, 469 484, 2001 c 2002 Kluwer Academic Publishers. Manufactured in The Netherlands. Use of Design Sensitivity Information in Response Surface and Kriging Metamodels J. J.

More information

Why experimenters should not randomize, and what they should do instead

Why experimenters should not randomize, and what they should do instead Why experimenters should not randomize, and what they should do instead Maximilian Kasy Department of Economics, Harvard University Maximilian Kasy (Harvard) Experimental design 1 / 42 project STAR Introduction

More information

Proceedings of the 2016 Winter Simulation Conference T. M. K. Roeder, P. I. Frazier, R. Szechtman, E. Zhou, T. Huschka, and S. E. Chick, eds.

Proceedings of the 2016 Winter Simulation Conference T. M. K. Roeder, P. I. Frazier, R. Szechtman, E. Zhou, T. Huschka, and S. E. Chick, eds. Proceedings of the 2016 Winter Simulation Conference T. M. K. Roeder, P. I. Frazier, R. Szechtman, E. Zhou, T. Huschka, and S. E. Chick, eds. EXTENDED KERNEL REGRESSION: A MULTI-RESOLUTION METHOD TO COMBINE

More information

COMP 551 Applied Machine Learning Lecture 20: Gaussian processes

COMP 551 Applied Machine Learning Lecture 20: Gaussian processes COMP 55 Applied Machine Learning Lecture 2: Gaussian processes Instructor: Ryan Lowe (ryan.lowe@cs.mcgill.ca) Slides mostly by: (herke.vanhoof@mcgill.ca) Class web page: www.cs.mcgill.ca/~hvanho2/comp55

More information

Prediction of double gene knockout measurements

Prediction of double gene knockout measurements Prediction of double gene knockout measurements Sofia Kyriazopoulou-Panagiotopoulou sofiakp@stanford.edu December 12, 2008 Abstract One way to get an insight into the potential interaction between a pair

More information

Tutorial on Gaussian Processes and the Gaussian Process Latent Variable Model

Tutorial on Gaussian Processes and the Gaussian Process Latent Variable Model Tutorial on Gaussian Processes and the Gaussian Process Latent Variable Model (& discussion on the GPLVM tech. report by Prof. N. Lawrence, 06) Andreas Damianou Department of Neuro- and Computer Science,

More information

Kriging models with Gaussian processes - covariance function estimation and impact of spatial sampling

Kriging models with Gaussian processes - covariance function estimation and impact of spatial sampling Kriging models with Gaussian processes - covariance function estimation and impact of spatial sampling François Bachoc former PhD advisor: Josselin Garnier former CEA advisor: Jean-Marc Martinez Department

More information

Bootstrapping high dimensional vector: interplay between dependence and dimensionality

Bootstrapping high dimensional vector: interplay between dependence and dimensionality Bootstrapping high dimensional vector: interplay between dependence and dimensionality Xianyang Zhang Joint work with Guang Cheng University of Missouri-Columbia LDHD: Transition Workshop, 2014 Xianyang

More information

1 Mixed effect models and longitudinal data analysis

1 Mixed effect models and longitudinal data analysis 1 Mixed effect models and longitudinal data analysis Mixed effects models provide a flexible approach to any situation where data have a grouping structure which introduces some kind of correlation between

More information

Hierarchical Modeling for Univariate Spatial Data

Hierarchical Modeling for Univariate Spatial Data Hierarchical Modeling for Univariate Spatial Data Geography 890, Hierarchical Bayesian Models for Environmental Spatial Data Analysis February 15, 2011 1 Spatial Domain 2 Geography 890 Spatial Domain This

More information

Single Index Quantile Regression for Heteroscedastic Data

Single Index Quantile Regression for Heteroscedastic Data Single Index Quantile Regression for Heteroscedastic Data E. Christou M. G. Akritas Department of Statistics The Pennsylvania State University JSM, 2015 E. Christou, M. G. Akritas (PSU) SIQR JSM, 2015

More information

The geometry of Gaussian processes and Bayesian optimization. Contal CMLA, ENS Cachan

The geometry of Gaussian processes and Bayesian optimization. Contal CMLA, ENS Cachan The geometry of Gaussian processes and Bayesian optimization. Contal CMLA, ENS Cachan Background: Global Optimization and Gaussian Processes The Geometry of Gaussian Processes and the Chaining Trick Algorithm

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

Model Selection for Gaussian Processes

Model Selection for Gaussian Processes Institute for Adaptive and Neural Computation School of Informatics,, UK December 26 Outline GP basics Model selection: covariance functions and parameterizations Criteria for model selection Marginal

More information

Machine Learning Basics: Maximum Likelihood Estimation

Machine Learning Basics: Maximum Likelihood Estimation Machine Learning Basics: Maximum Likelihood Estimation Sargur N. srihari@cedar.buffalo.edu This is part of lecture slides on Deep Learning: http://www.cedar.buffalo.edu/~srihari/cse676 1 Topics 1. Learning

More information

Better Simulation Metamodeling: The Why, What and How of Stochastic Kriging

Better Simulation Metamodeling: The Why, What and How of Stochastic Kriging Better Simulation Metamodeling: The Why, What and How of Stochastic Kriging Jeremy Staum Collaborators: Bruce Ankenman, Barry Nelson Evren Baysal, Ming Liu, Wei Xie supported by the NSF under Grant No.

More information

ICML Scalable Bayesian Inference on Point processes. with Gaussian Processes. Yves-Laurent Kom Samo & Stephen Roberts

ICML Scalable Bayesian Inference on Point processes. with Gaussian Processes. Yves-Laurent Kom Samo & Stephen Roberts ICML 2015 Scalable Nonparametric Bayesian Inference on Point Processes with Gaussian Processes Machine Learning Research Group and Oxford-Man Institute University of Oxford July 8, 2015 Point Processes

More information

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Observed likelihood 3 Mean Score

More information

Gaussian Processes and Complex Computer Models

Gaussian Processes and Complex Computer Models Gaussian Processes and Complex Computer Models Astroinformatics Summer School, Penn State University June 2018 Murali Haran Department of Statistics, Penn State University Murali Haran, Penn State 1 Modeling

More information
