Mixed Hierarchical Models for the Process Environment
Barry M. Wise, Robert T. Roginski, Neal B. Gallagher and Jeremy M. Shaver
Eigenvector Research Inc., Wenatchee, WA, USA
Abstract
Effective monitoring and control of chemical processes often requires more than single quantitative regression models or qualitative classification models. Multiple models are often needed, and which model to apply can be a function of current process conditions. The "which model" question can be determined by rule-of-thumb heuristics, criteria based on single process variables, or the output of previously applied regression or classification models. Process problems are sometimes best solved with a hierarchical construction of "mixed models," e.g. different types of classification, regression, simple math and logic all in the same structure. Several examples of these mixed hierarchical models are demonstrated and discussed.
Mixed Hierarchical Models
Many classification and regression problems are too complex to be handled with a single model
Rules based on model outputs or single variables can be used to break problems into simple pieces
Example: Classification on ARCH
ARCH: classic data set of Native American artifacts measured by XRF (10 variables)
63 knowns from 4 sources
Classify 12 unknowns
PCA Scores of ARCH
[Figure: Samples/Scores Plot of arch. Scores on PC 1 (53.41%) vs. Scores on PC 2 (21.12%); classes K, BL, SH, AN. Annotation: "Complete separation, use PCA to identify."]
Scores Without ANA
[Figure: Samples/Scores Plot of arch. Scores on PC 1 (50.38%) vs. Scores on PC 2 (19.41%); classes K, BL, SH. Annotations: "Use PLS-DA to split BL from other two classes"; "Then use PLS-DA to split K from SH."]
Hierarchical Model
Hierarchical model in HMB interface
Output on unknowns
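The ARCH hierarchy (PCA to identify AN, PLS-DA to split BL from the other two classes, then PLS-DA to split K from SH) can be sketched as a chain of binary decision nodes. This is only a minimal Python illustration and not the HMB interface: the function name `classify_arch` and the three predicate arguments are hypothetical stand-ins for fitted models, and the toy thresholds are made up.

```python
from typing import Callable, Sequence

Sample = Sequence[float]
Test = Callable[[Sample], bool]


def classify_arch(x: Sample, is_an: Test, is_bl: Test, is_k: Test) -> str:
    """Route one sample through three binary decisions, in order."""
    if is_an(x):   # node 1: stand-in for the PCA-based test for class AN
        return "AN"
    if is_bl(x):   # node 2: stand-in for PLS-DA, BL vs. the other two classes
        return "BL"
    # node 3: stand-in for PLS-DA, K vs. SH
    return "K" if is_k(x) else "SH"


# Toy stand-ins (hypothetical thresholds on made-up features, not fitted models)
toy_an: Test = lambda x: x[0] > 2.0
toy_bl: Test = lambda x: x[1] < -1.0
toy_k: Test = lambda x: x[2] > 0.0

print(classify_arch([0.0, 0.0, 1.0], toy_an, toy_bl, toy_k))
```

Each node only has to solve one easy two-way problem, which is the point of breaking the four-class problem into pieces.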
Nonlinear Dynamic Process
Single input single output (SISO) process
Lab system, intentionally nonlinear
Use past 6 values of input to predict output (Finite Impulse Response)
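A finite impulse response setup of this kind amounts to building a lagged regression matrix from the input series. The sketch below is one possible arrangement, not the authors' code: the name `fir_matrix` is hypothetical, and it assumes that "past 6 values" means strictly earlier inputs u[t-1] through u[t-6], which the slide does not specify.

```python
from typing import List, Sequence, Tuple


def fir_matrix(
    u: Sequence[float], y: Sequence[float], n_lags: int = 6
) -> Tuple[List[List[float]], List[float]]:
    """Build FIR regression data.

    Row for time t holds u[t-1], ..., u[t-n_lags]; the matching target is y[t].
    The first n_lags time points are dropped because their history is incomplete.
    """
    if len(u) != len(y):
        raise ValueError("u and y must have the same length")
    rows: List[List[float]] = []
    targets: List[float] = []
    for t in range(n_lags, len(u)):
        rows.append([u[t - k] for k in range(1, n_lags + 1)])
        targets.append(y[t])
    return rows, targets


# Toy series: 10 time points leave 4 usable rows with 6 lags each
u_demo = [float(i) for i in range(10)]
y_demo = [10.0 * v for v in u_demo]
X_demo, y_target = fir_matrix(u_demo, y_demo)
print(len(X_demo), X_demo[0], y_target[0])
```

The resulting rows can be passed to any regression method (PLS in the slides) as the X-block, with the targets as the Y-block.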
Global PLS Model Results
[Figure: Y CV Predicted 1 level vs. Y Measured 1 level, with Low Range, Mid Range and High Range regions marked.]
2 Latent Variables; R^2 = 0.932; RMSEC = 0.85013; RMSECV = 0.85034; CV Bias = 2.0068e-05
Low-Range Model Results
Mid-Range Model Results
High-Range Model Results
Single Layer Hierarchical Model
If Q is too large, throw error
If Predicted Y1 is > 8, apply "High-Range" model
If Predicted Y1 is < 3.8, apply "Low-Range" model
Otherwise, apply "Mid-Range" model
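The routing rules on this slide translate directly into a short function. The sketch below is illustrative only: `single_layer_predict`, its arguments, and the numeric `q_limit` are hypothetical (the slide says only "too large"), while the cut points 8 and 3.8 come from the slide. The models are passed in as plain callables so the example does not depend on any particular modeling library.

```python
from typing import Callable, Dict, Sequence, Tuple

Sample = Sequence[float]
Model = Callable[[Sample], float]


def single_layer_predict(
    x: Sample,
    global_predict: Model,
    global_q: Model,
    local_models: Dict[str, Model],
    q_limit: float,
) -> Tuple[str, float]:
    """Use the global model to pick a local model, then return its prediction."""
    if global_q(x) > q_limit:  # q_limit is an assumed, user-chosen threshold
        raise ValueError("Q residual too large for the global model")
    y_global = global_predict(x)
    if y_global > 8:
        name = "High-Range"
    elif y_global < 3.8:
        name = "Low-Range"
    else:
        name = "Mid-Range"
    return name, local_models[name](x)


# Toy stand-ins for the fitted models (hypothetical, for illustration only)
toy_locals: Dict[str, Model] = {
    "High-Range": lambda x: x[0] + 0.1,
    "Mid-Range": lambda x: x[0],
    "Low-Range": lambda x: x[0] - 0.1,
}
print(single_layer_predict([9.0], lambda x: x[0], lambda x: 0.0, toy_locals, 1.0))
```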
Hierarchical Model Output
[Figure: Predicted level vs. level; RMSEP = 0.28. Legend: Bad Q (no prediction), High-Range, Low-Range, Mid-Range.]
Hierarchical Model Output
[Figure: Hotelling's T^2 vs. level.]
Hierarchical Model Output
[Figure: Q Residuals vs. level.]
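For reference, the two diagnostics plotted here are conventionally defined as follows for a PCA model; PLS models use analogous expressions on the X-block. The symbols are assumptions introduced for illustration and do not appear on the slides: $\mathbf{x}$ is a preprocessed sample (row vector), $\mathbf{P}$ the matrix of retained loadings with orthonormal columns, $\mathbf{t} = \mathbf{x}\mathbf{P}$ the scores, $A$ the number of retained components, and $\lambda_a$ the variance of the $a$-th score over the calibration set.

```latex
\begin{align}
  \mathbf{e} &= \mathbf{x} - \mathbf{x}\mathbf{P}\mathbf{P}^{\mathsf{T}}, &
  Q &= \mathbf{e}\,\mathbf{e}^{\mathsf{T}} = \sum_{j} e_j^{2}, \\
  \mathbf{t} &= \mathbf{x}\mathbf{P}, &
  T^{2} &= \sum_{a=1}^{A} \frac{t_a^{2}}{\lambda_a}.
\end{align}
```

Q measures how much of a sample the model fails to describe, while T^2 measures how far the sample lies from the calibration center within the model space, which is why a large Q is used as the "do not trust this prediction" signal.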
Add Layer of Output Testing
If Predicted Y1 is > 8, apply "High-Range" local model and test outputs
If Q from "High-Range" model is too high, error
Otherwise, return High-Range prediction
Similar tests on Low-Range and Mid-Range models
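The second layer adds a Q test on whichever local model was selected. A minimal sketch follows, under stated assumptions: the names `two_layer_predict`, `local_q` and `q_limits` are hypothetical, the per-model Q limits are user-chosen values not given on the slide, and the Low-Range cut point of 3.8 is carried over from the single-layer rules because this slide spells out only the High-Range branch.

```python
from typing import Callable, Dict, Sequence, Tuple

Sample = Sequence[float]
Model = Callable[[Sample], float]


def two_layer_predict(
    x: Sample,
    global_predict: Model,
    local_models: Dict[str, Model],
    local_q: Dict[str, Model],
    q_limits: Dict[str, float],
) -> Tuple[str, float]:
    """Layer 1 selects a local model; layer 2 checks that model's own Q residual."""
    y_global = global_predict(x)
    if y_global > 8:
        name = "High-Range"
    elif y_global < 3.8:  # assumed: same cut point as the single-layer model
        name = "Low-Range"
    else:
        name = "Mid-Range"
    # Layer 2: reject the sample if the selected local model fits it poorly
    if local_q[name](x) > q_limits[name]:
        raise ValueError(f"Q residual too high for the {name} model")
    return name, local_models[name](x)


# Toy stand-ins (hypothetical): second element of x plays the role of a Q residual
names = ("High-Range", "Mid-Range", "Low-Range")
toy_models: Dict[str, Model] = {n: (lambda x: x[0]) for n in names}
toy_q: Dict[str, Model] = {n: (lambda x: x[1]) for n in names}
toy_limits: Dict[str, float] = {n: 1.0 for n in names}
print(two_layer_predict([9.0, 0.2], lambda x: x[0], toy_models, toy_q, toy_limits))
```

Testing Q against the local model rather than the global one gives a tighter check, since each local model describes a narrower operating range.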
2-Layer Hierarchical Model Output
[Figure: Predicted level vs. level; RMSEP = 0.28.]
2-Layer Hierarchical Model Output
[Figure: Q Residuals vs. level.]
BPN-ANN Model (1 layer, 2 nodes)
[Figure: Y Predicted 1 level vs. Y Measured 1 level.]
2 Layer 1 Nodes; R^2 = 0.999; RMSEC = 0.13838; RMSECV = 0.14046; RMSEP = 0.2422; Calibration Bias = -0.00010954; CV Bias = 0.00049479; Prediction Bias = 0.10643
Use Global PLS to filter out bad samples
Filtering of Predictions
Apply model
If Q is too high, error
If prediction is > 0, return prediction (re-apply model)
If prediction is <= 0, return zero
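These filtering rules reduce to a Q check followed by clamping at zero. The sketch below is a simplified illustration with hypothetical names (`filtered_prediction`, `q_residual`, `q_limit`); it returns the already computed prediction instead of re-applying the model as the slide describes, which gives the same value for a deterministic model.

```python
from typing import Callable, Sequence

Sample = Sequence[float]
Model = Callable[[Sample], float]


def filtered_prediction(
    x: Sample, predict: Model, q_residual: Model, q_limit: float
) -> float:
    """Reject poorly fit samples, then clamp non-positive predictions to zero."""
    if q_residual(x) > q_limit:  # q_limit is an assumed, user-chosen threshold
        raise ValueError("Q residual too high: no prediction returned")
    y = predict(x)
    return y if y > 0 else 0.0


# Toy stand-ins (hypothetical): x[0] is the raw prediction, x[1] the Q residual
print(filtered_prediction([-0.3, 0.1], lambda x: x[0], lambda x: x[1], 1.0))
```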
Classification of Placebo and Active Formulation

FRESH PERSPECTIVES
Enhanced Classification of Placebo and Active Formulations via Hierarchical Modeling
Michael Dotlich, M.Sc. (1), Richard M. Kattner, M.Sc. (1), Robert Roginski, Ph.D. (2) and Jeremy Shaver, Ph.D. (2)
(1) Eli Lilly and Company; (2) Eigenvector Research, Inc.

Michael Dotlich, M.Sc., is a Research Scientist in analytical research and development at Eli Lilly and Company. He works in the Lilly Research Laboratories validating methods of testing for clinical trial materials release. His active research is focused on the development of spectroscopic methods using different analytical techniques and chemometrics for identification and quantitation of raw materials and drug products. He earned his M.Sc. in applications of Raman spectroscopy from Marquette University, Milwaukee, WI.

Dr. Bob Roginski is a Senior Applications Scientist with Eigenvector Research, Inc., where he provides consulting services, instruction, and software development in the area of chemometrics. Previously, Bob served in engineering roles specializing in process analytical technology at Eli Lilly & Co., Searle/Pharmacia/Pfizer, and Amoco Corporation. Bob received his Ph.D. in Chemical Engineering from the University of Illinois in 1987, and has collaborated on numerous peer-reviewed publications and outside presentations. Bob has special interests in spectroscopy as applied to PAT and using chemometrics to determine the health of continuous processes.

Richard M. Kattner, M.Sc., is Associate Consultant Chemist in the analytical research and development at Eli Lilly and Company. His current work in the Lilly Research Laboratories deals with developing and validating methods for the testing and release of clinical trial materials. His current focus is the development and validation of spectroscopic methods using different analytical techniques and chemometrics for testing active/placebo tablet and solutions. He earned his M.Sc. in Chemistry focusing in Physical Organic Chemistry at the University of North Texas, Denton, TX.

Dr. Jeremy Shaver is currently the Chief of Technology Development at Eigenvector Research, Inc., which he joined in 2001. He received a BA in Chemistry from the College of Wooster in 1991 and a Ph.D. in Analytical Chemistry from Duke University in 1995. He [...]

Introduction
A placebo-controlled study is a means of testing a drug for safety and efficacy in a group of subjects that receive the treatment. Current placebo identity tests typically utilize an HPLC identity method for the active compound to confirm the absence of the active (i.e., negative identity). In this review, the development and application of transmission Raman spectroscopy (TRS) with chemometric modeling for positive placebo identification testing will be applied to drug products, illustrated for several compounds and their respective placebos. Placebo identification tests are a clinical manufacturing requirement, and when implemented in a negative mode, data are evaluated against the specification "There is no active detected." Clinical placebos share the same physical appearance as the active tablets or capsules, as required for blinded studies, hence definitive identification of both the active and placebo (absence of active) are necessary release tests. Spectroscopic test methods utilizing a standard library provide a robust approach for evaluating the chemical identity of both placebos and actives. In addition, spectral testing using chemometric models can compare placebo results against a database of numerous placebos and active drugs instead of assaying for a single active ingredient. Spectroscopic methods utilizing rapid, chemometric-based spectroscopic technologies create efficiencies and minimize workload due to minimal sample preparation and automated data analysis. Finally, chemometric models provide benefits over traditional spectral comparisons such as eliminating a need for storage and maintenance of reference standards (e.g., API, tablets and capsules) and reduce subjectivity in determining if sample data compares favorably to a reference standard or demonstrates "no active XYZ present", especially for complex drug excipient matrices.

Experimental
For the experiments reported here, a Cobalt transmission Raman instrument (TRS100) was used with the following settings: Laser Power: 0.65W Exposure: 0.5 sec. Accumulations: 180 Detector: CCD Read Optics: Small Laser Spot Diameter: 2mm Scan Range: 40 to 2400 cm^-1

Placebo identification tests required
Usually done by HPLC
In this example used transmission Raman
12 unique active drugs and their placebos, 34 strengths, tablets and capsules
Source: American Pharmaceutical Review, Endotoxin Supplement 2013
Scores of Placebos
[Figure, two panels: Samples/Scores Plot of Multiple SPC Files. Left: Scores on PC 2 (24.43%) vs. Scores on PC 3 (14.83%). Right, with GLS preprocessing: Scores on PC 1 (97.25%) vs. Scores on PC 2 (2.06%). Legend: all other placebos, NP 2 placebos, NP 1 placebos, 95% Confidence Level.]
Class Prediction Probabilities
[Figure, two panels: Samples/Scores Plot of Multiple SPC Files, Class Pred Probability vs. Sample. Left: capsule placebos versus all other products. Right: tablet placebos versus remaining products.]
Final Models
[Figure, two panels: Samples/Scores Plot of Multiple SPC Files, Class Pred Probability NP 1 & NP 2 placebos vs. Sample. Legend: NP 1 & NP 2 placebos, NP 2 active. The second panel shows samples 550 to 600 only.]
Final Hierarchical Model
Conclusions
Many classification and even regression problems are too complex to be done with a single model
Easier when broken into smaller pieces
Different types of models may be most suitable for each step