A new procedure for sensitivity testing with two stress factors

Similar documents
Three-phase Sequential Design for Sensitivity Experiments

Variations. ECE 6540, Lecture 02 Multivariate Random Variables & Linear Algebra

Support Vector Machines. CSE 4309 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington

Work, Energy, and Power. Chapter 6 of Essential University Physics, Richard Wolfson, 3 rd Edition

SECTION 5: POWER FLOW. ESE 470 Energy Distribution Systems

Lecture 3. STAT161/261 Introduction to Pattern Recognition and Machine Learning Spring 2018 Prof. Allie Fletcher

Lecture 11. Kernel Methods

SECTION 7: FAULT ANALYSIS. ESE 470 Energy Distribution Systems

Math 171 Spring 2017 Final Exam. Problem Worth

General Strong Polarization

CS249: ADVANCED DATA MINING

3.2 A2 - Just Like Derivatives but Backwards

Mathematics Paper 2 Grade 12 Preliminary Examination 2017

ECE 6540, Lecture 06 Sufficient Statistics & Complete Statistics Variations

Lecture 6. Notes on Linear Algebra. Perceptron

Review for Exam Hyunse Yoon, Ph.D. Assistant Research Scientist IIHR-Hydroscience & Engineering University of Iowa

Worksheets for GCSE Mathematics. Algebraic Expressions. Mr Black 's Maths Resources for Teachers GCSE 1-9. Algebra

Lesson 7: Linear Transformations Applied to Cubes

PART I (STATISTICS / MATHEMATICS STREAM) ATTENTION : ANSWER A TOTAL OF SIX QUESTIONS TAKING AT LEAST TWO FROM EACH GROUP - S1 AND S2.

Exam Programme VWO Mathematics A

Module 7 (Lecture 27) RETAINING WALLS

Math, Stats, and Mathstats Review ECONOMETRICS (ECON 360) BEN VAN KAMMEN, PHD

Lecture 2: Plasma particles with E and B fields

Gibbs Sampler Derivation for Latent Dirichlet Allocation (Blei et al., 2003) Lecture Notes Arjun Mukherjee (UH)

Monte Carlo (MC) simulation for system reliability

Radial Basis Function (RBF) Networks

MATH 1080: Calculus of One Variable II Fall 2018 Textbook: Single Variable Calculus: Early Transcendentals, 7e, by James Stewart.

Ballistic Resistance Testing Techniques

Expectation Propagation performs smooth gradient descent GUILLAUME DEHAENE

Estimation of cumulative distribution function with spline functions

CHAPTER 5 Wave Properties of Matter and Quantum Mechanics I

Quantum Mechanics. An essential theory to understand properties of matter and light. Chemical Electronic Magnetic Thermal Optical Etc.

General Strong Polarization

Hopfield Network for Associative Memory

Introduction to Density Estimation and Anomaly Detection. Tom Dietterich

Charged-Particle Interactions in Matter

Review for Exam Hyunse Yoon, Ph.D. Adjunct Assistant Professor Department of Mechanical Engineering, University of Iowa

Test Strategies for Experiments with a Binary Response and Single Stress Factor Best Practice

ME5286 Robotics Spring 2017 Quiz 2

Random Vectors Part A

PHY103A: Lecture # 4

Extreme value statistics: from one dimension to many. Lecture 1: one dimension Lecture 2: many dimensions

Worksheets for GCSE Mathematics. Quadratics. mr-mathematics.com Maths Resources for Teachers. Algebra

Inferring the origin of an epidemic with a dynamic message-passing algorithm

Charge carrier density in metals and semiconductors

Analyzing and improving the reproducibility of shots taken by a soccer robot

Chapter 22 : Electric potential

Approximate Second Order Algorithms. Seo Taek Kong, Nithin Tangellamudi, Zhikai Guo

(1) Correspondence of the density matrix to traditional method

Hybrid Extended Kalman Filtering and Noise Statistics Optimization for Produce Wash State Estimation

Specialist Mathematics 2019 v1.2

SECTION 8: ROOT-LOCUS ANALYSIS. ESE 499 Feedback Control Systems

Photon Interactions in Matter

Quantum Chosen-Ciphertext Attacks against Feistel Ciphers

On The Cauchy Problem For Some Parabolic Fractional Partial Differential Equations With Time Delays

Terms of Use. Copyright Embark on the Journey

P.3 Division of Polynomials

Haar Basis Wavelets and Morlet Wavelets

" = Y(#,$) % R(r) = 1 4& % " = Y(#,$) % R(r) = Recitation Problems: Week 4. a. 5 B, b. 6. , Ne Mg + 15 P 2+ c. 23 V,

(1) Introduction: a new basis set

Yang-Hwan Ahn Based on arxiv:

Lecture 3 Optimization methods for econometrics models

Chemical Engineering 412

Lecture 3. Linear Regression

Lesson 1: Successive Differences in Polynomials

10.4 The Cross Product

Chapter 5: Spectral Domain From: The Handbook of Spatial Statistics. Dr. Montserrat Fuentes and Dr. Brian Reich Prepared by: Amanda Bell

International Journal of Mathematical Archive-5(3), 2014, Available online through ISSN

Lecture No. 1 Introduction to Method of Weighted Residuals. Solve the differential equation L (u) = p(x) in V where L is a differential operator

Comparing Different Estimators of Reliability Function for Proposed Probability Distribution

COMS 4721: Machine Learning for Data Science Lecture 10, 2/21/2017

Unit 6 Note Packet List of topics for this unit/assignment tracker Date Topic Assignment & Due Date Absolute Value Transformations Day 1

7.3 The Jacobi and Gauss-Seidel Iterative Methods

Review of Last Class 1

Rotational Motion. Chapter 10 of Essential University Physics, Richard Wolfson, 3 rd Edition

Grover s algorithm. We want to find aa. Search in an unordered database. QC oracle (as usual) Usual trick

TECHNICAL NOTE AUTOMATIC GENERATION OF POINT SPRING SUPPORTS BASED ON DEFINED SOIL PROFILES AND COLUMN-FOOTING PROPERTIES

2.4 Error Analysis for Iterative Methods

General Strong Polarization

Efficient Robbins-Monro Procedure for Binary Data

Secondary 3H Unit = 1 = 7. Lesson 3.3 Worksheet. Simplify: Lesson 3.6 Worksheet

Generalized Linear Mixed Models for Longitudinal Data with Missing Values: A Monte Carlo EM Approach

F.4 Solving Polynomial Equations and Applications of Factoring

Diffusion modeling for Dip-pen Nanolithography Apoorv Kulkarni Graduate student, Michigan Technological University

Integrating Rational functions by the Method of Partial fraction Decomposition. Antony L. Foster

Prof. Dr.-Ing. Armin Dekorsy Department of Communications Engineering. Stochastic Processes and Linear Algebra Recap Slides

General Linear Models. with General Linear Hypothesis Tests and Likelihood Ratio Tests

Classical RSA algorithm

General Strong Polarization

Wave Motion. Chapter 14 of Essential University Physics, Richard Wolfson, 3 rd Edition

Angular Momentum, Electromagnetic Waves

Unit II. Page 1 of 12

PHL424: Nuclear Shell Model. Indian Institute of Technology Ropar

A Posteriori Error Estimates For Discontinuous Galerkin Methods Using Non-polynomial Basis Functions

A Step Towards the Cognitive Radar: Target Detection under Nonstationary Clutter

CS249: ADVANCED DATA MINING

Uncertain Compression & Graph Coloring. Madhu Sudan Harvard

CDS 101/110: Lecture 6.2 Transfer Functions

SECTION 7: STEADY-STATE ERROR. ESE 499 Feedback Control Systems

Estimate by the L 2 Norm of a Parameter Poisson Intensity Discontinuous

Transcription:

A new procedure for sensitivity testing with two stress factors C.F. Jeff Wu Georgia Institute of Technology Sensitivity testing : problem formulation. Review of the 3pod (3-phase optimal design) procedure with one stress factor. A new procedure for two stress factors, partly inspired by 3pod; not a trivial extension. An illustration. Comments and further work. (joint work with Dianpeng Wang, Beijing Inst. of Technology) 1

Sensitivity testing Stress/stimulus level x: launching velocity, drop height Response/nonresponse y = 1 or 0: penetrate, explode plate initial velocity, x (x X) y=1 bullet x<x y=0 X=unknown critical level (a random quantity) 2

Quantal response curve Quantal response curve F(x) = prob (y = 1 x); interested in estimating the p-th quantile x p with F(x p ) = p, p typically high, e.g., p = 0.9, 0.99, 0.999. Useful for certification or quantification of test items. Common in military and heavy industry applications Choice of F: probit, logit, or skewed version Problem/challenge: find a sequential design procedure to estimate x p efficiently and for small samples 3

Three-phase optimal design A trilogy of search-estimate-approximate: I. (search) to generate y = 1 and y = 0, to close-in on region of interest and to obtain overlapping data pattern II. (estimate) use D-optimality criterion to generate design points; spread out design points III. (approximate) Taking μμ + FF 1 pp σσ, where μμ, σσ are MLE of μ,σ based on data in I-II, as the starting value, use the Robbins-Monro-Joseph (RMJ) procedure to generate design points 3-phase optimal design, dubbed as 3pod (for its steady performance ) Wu-Tian (2014, JSPI) 4

Phase I of 3pod It has three stages I1, I2, I3 I1. (quickly obtain y = 1 and y = 0). Choose (μ min, μ max ) for location parameter μ and σ g as guessed value of scale parameter σ and μ max - μ min 6σ g. Take y 1 and y 2 at xx 1 = 3 μμ 4 mmmmmm + 1 μμ 4 mmaaaa, xx 2 = 1 μμ 4 mmmmmm + 3 μμ 4 mmmmmm. Four cases result: (i) y 1 = y 2 = 0 x 1, x 2 to the left of μ; take x 3 = μ max +1.5σ g. If y 3 = 1, move to I2. If y 3 = 0, take x 4 = μ max +3σ g. If y 4 = 1, move to I2. If y 4 = 0, range not large; increase x by 1.5σ g until y=1. 5

Phase I of 3pod (continued) (ii) y 1 = y 2 = 1, do the mirror image of (i) (iii) y 1 = 0, y 2 = 1: good! Move to I2 (iv) y 1 = 1, y 2 = 0: range too narrow around μ, expand it by taking x 3 = μ min -3σ g, x 4 = μ max +3σ g ; move to I2 Note: I1 is like dose ranging in dose-response studies 6

Trapped in separation? Let M 0 = largest x value with y = 0, m 1 = smallest x value with y = 1. Overlapping iff M 0 > m 1; separation iff M 0 m 1 Running test within the separation interval [M 0, m 1 ] will forever be trapped in separation. When the interval is small, get out to avoid logjam X X X X X X MM 0 mm 1 7

I2: stage 2 of phase I If overlapping in data from I1, move to I3. Otherwise, take next level at (=MLE assuming probit and σ g ); if overlapping, move to I3. If no overlapping, update M 0, m 1,, take next level at until m 1 -M 0 <1.5 σ g. Then choose x levels outside the separation interval [M 0, m 1 ]. See next. Take next run at m 1 +0.3σ g ; if y = 0, overlapping, move to I3. If y = 1, next run at M 0-0.3σ g ; if y = 1, overlapping, move to I3. Otherwise it suggests σ g is too large, reduce it to σ g, repeat I2 until seeing overlapping. 8

Illustrative Example (0,22), probit, μ=10, σ=1, σ g =3, x 0.99 =11.2816 : y=1 : y=0 9

Problem formulation with two stress factors Two stress factors, xx = xx 1, xx 2, which are not independent. Example: temperature and voltage in detonation of ammunition. The outcome is binary data, yy = 1 or yy = 0. PP yy = 1 = GG[ff xx ], where GG is a location-scale distribution function with µ, σσ, and ff(xx) is an unknown latent function of xx. G is a standard choice like logit, probit, or a skewed version. Each quantile is a curve, ζζ pp = {xx = (xx 1, xx 2 ) R 2 ff xx = GG 1 (pp)}. 10

Two-step procedure for approximating the quantile curve The proposed two-step procedure is inspired by some ideas in the 3pod procedure. The clue comes from the ideas in phase I, whose goal is to achieve an overlapping pattern. Note that step I(1) of 3pod is to determine the region of interest by using a modified binary search and step I(2) is to break the separation pattern. 11

Two-step procedure: further details The new procedure has two steps: (I) search for an overlapping pattern, (II) approximate the quantile curves of interest. For two dimensions, an overlapping pattern means that the levels with yy = 1 and the levels with yy = 0 cannot be separated by a straight line. 12

Step I: search for overlapping pattern Assume that the investigators can make a guess of the region, xx 1LL, xx 1UU [xx 2LL, xx 2UU ], in which both outcomes can occur with high probability. Run tests at the four corners of the rectangles, xx 1LL, xx 2LL, xx 1LL, xx 2UU, xx 1UU, xx 2LL, xx 1UU, xx 2UU. There are three situations according to the types of the outcomes. 13

Only one type of outcome (i.e., y = 0 only, or y = 1 only) is observed This implies that the region of rectangle is too small or too large. Run additional tests both inside and outside the rectangle, until both yy = 1 and yy = 0 are observed. To choose additional tests outside the rectangle, we double the sides of the rectangle and keep the same center. To choose the tests inside the rectangle, we halve the sides of the rectangle and keep the same center. 14

Both y = 0 and y = 1 are observed but can be separated by a straight line (a) (b) (c) (d) (e) (f) (g) (h) (i) (j) (k) (l) 15

SVM (Support Vector Machine) is used to exploring the middle region Denote CC 0 as the mean of support vectors with yy = 0, CC 1 the mean of support vectors with yy = 1. Denote kk 0 as the number of tests with yy = 0, kk 1 the number of tests with yy = 1. If DD mmmmmmmmmmmm > DD gg and kk 0 > kk 1, choose two tests on the separator with the projection of CC 1 as its center. If DD mmmmmmmmmmmm > DD gg and kk 0 kk 1, choose two tests on the separator with the projection of CC 0 as its center. The distance between these two tests is DD mmmmmmmmmmmm /2. 16

Exploring the middle region (continued) If DD mmmmmmmmmmmm DD gg, it implies that the margin is too narrow. Choose tests outside the margin to avoid being trapped in a wrong region. Choose one point at each side of the margin. The projection of the test, which is chosen from the side with yy = 1 (and resp. yy = 0), on the separator is CC 1 (and resp. CC 0 ). The distances between the new tests and the separator are both DD mmmmmmmmmmmm. Continue the SVM steps until the overlapping pattern is obtained. 17

Overlapping pattern is achieved This indicates two red dots on the diagonal and two green dots on the off-diagonal (or vice versa), which usually suggests that the initial guess of the region is too narrow. Add four tests outside the rectangle to get more information. Keep the center (i.e., the mean) of the new tests as before, and set the length of the sides to be 1.5 times the length of the initial sides and rotate it 45 degrees. 18

Step II: approximating the curve of interest XX = {xx 1, xx 2,, xx nn }, YY = {yy 1, yy 2,, yy nn } and ff = {ff 1, ff 2,, ff nn }, where ff ii = ff(xx ii ). Recall ff(xx) is a latent function in PP yy = 1 = GG[ff xx ], which connects binary yy with continuous xx. We employ a binary Gaussian process: ff GGGG(0, KK xx, xx ), where covariance function KK xx, xx = σσ 2 exp xx xx 2 /2ll. Let θθ = σσ, ll. The posterior distribution of ff: pp ff XX, YY, θθ = NN(ff;0,KK XX,θθ) nn pp(yy XX,θθ) ii=1 GG(ff ii ). 19

An alternative: curve approximation by GLM An alternative to the GP model is the use of GLM, i.e., logit or probit regression. Let ff ii = xx 1 ii, xx 2 ii, xx 1 ii,2, xx 2 ii,2, xx 1 ii xx 2 ii ββ, where ββ = ββ 1, ββ 2, ββ 3, ββ 4, ββ 5 TT. pp yy ii = 1 xx ii = GG(ff ii ), where GG is a logit or probit. ββ can be obtained based on the observations. Given a new point xx, ff can be predicted by using xx 1, xx 2, xx,2 1, xx,2 2, xx 1 xx 2 ββ. 20

Step II (continued) Given a new point xx, the posterior distribution of ff can be predicted by using the density pp ff xx, XX, YY, θθ = pp ff ff, xx, XX, θθ pp ff XX, YY, θθ dddd. Choose two new tests at xx c 1 = xx cc + aa 1 1, 0, xx cc 2 = xx cc + aa 2 0, 1. aa 1 and aa 2 are chosen such that EE ff xx cc ii = GG 1 pp, ii = 1, 2. Let the new center point be xx c = (xx cc 1 + xx cc 2)/2. In the beginning, choose xx cc = xx ss, whose latent value ff xx ss = GG 1 (0.5). If xx cc 1 and xx cc 2 are very close, choose the new xx ss from R\U, where UU is a given neighborhood of the previous starting point. After NN samples are completed, update the estimation of θθ and using the GP or logit model to approximate the curves. 21

An Illustrative example pppppppp yy = 1 xx = GG ff xx. GG zz = 1 1+exp{ zz}. ff xx = 1 50 xx 1 2 + xx 2 2 4xx 1 xx 2 + 3xx 1. The true contour of pp(yy = 1) is given in the left figure. 22

Initial guess about the experimental region The investigators can make a guess of the region of interest, 8, 14.5 8,14.5. Run tests at the four corners of the rectangles, 8, 8, 8, 14.5, 14.5, 8, 14.5, 14.5. 23

Only non-response results are observed This implies that the region of rectangle is too small or too large. Run additional tests both inside and outside the rectangle, until both yy = 1 and yy = 0 are observed. 24

Only non-response results are observed This implies that the region of rectangle is too small or too large. Run additional tests both inside and outside the rectangle, until both yy = 1 and yy = 0 are observed. 25

Both results are observed Both response (y = 1) and non-response (y = 0) are observed but can be separated by a straight line. 26

Both results are observed Both response (y = 1) and non-response (y = 0) are observed but can be separated by a straight line. SVM (Support Vector Machine) is used to exploring the middle region. Choose additional tests (black points) in the middle region. 27

Overlapping pattern is achieved Runs with (y = 0) and (y = 1) cannot be separated by a straight line. The overlapping pattern is achieved. 28

Fit GLM to approximate curves and select next tests Fit GLM based on observed data in step I.

Fit GLM to approximate curves and select next tests Fit GLM based on observed data in step I. The contours of quantiles based on GLM (color dash curves) Choose two new tests (black points).

Fit GLM to approximate curves and select next tests Fit GLM based on observed data in step I. The contours of quantiles by GLM (color dash lines) Choose two tests (black points). Run tests at the new locations.

Fit GLM to approximate curves and select next tests Re-fit the GLM based on current data.

Fit GLM to approximate curve and select next tests Re-fit the GLM based on current data. Choose new tests (black).

Fit GLM to approximate curve and select next tests Re-fit the GLM based on current data. Choose new tests. Run tests at the new locations.

An interim summary No information outside current experimental region. GLM over-fits observed data. Tests trapped in local region.

Extend the experimental region Choose four tests (black) outside the current region.

Extend the experimental region Choose tests outside the current region. Run tests at the new locations.

Extend the experimental region Choose tests outside the current region. Run tests at the new locations. Fit GLM again and choose new tests (black).

Extend the experimental region Choose tests outside the current region. Run tests at the new locations. Fit GLM again and choose new tests. Run tests.

Final fit: curve (left) approximation by GLM (true curve on the right)

Comments and further work As far as we know, there is no known procedure for sensitivity testing with two or more stress factors. But the problems are encountered in practice. A good procedure is sorely needed! The ideas in the proposed procedure are still preliminary, need to be fine-tuned and modified. Need a small numerical or simulation study to understand its performance; then do a field test. Extension to 3 or more factors. 41