Machine learning, ALAMO, and constrained regression


1 Machine learning, ALAMO, and constrained regression Nick Sahinidis Acknowledgments: Alison Cozad, David Miller, Zach Wilson

2 MACHINE LEARNING PROBLEM Build a model of output variables z as a function of input variables x over a specified interval. Data source: process simulation or experiment. Independent variables: operating conditions, inlet flow properties, unit geometry. Dependent variables: efficiency, outlet flow conditions, conversions, heat flow, etc. 2

3 DESIRED MODEL ATTRIBUTES Accurate We want to reflect the true nature of the system Simple Interpretable Usable for algebraic optimization Generated from a minimal data set Reduce experimental and simulation requirements 3

4 FITTING MODELS TO DATA


7 EXAMPLE ALAMO INPUT FILE 128 alternative models 7

8 ALAMO OUTPUT 8

9 ALAMO: Automated Learning of Algebraic MOdels. Flowchart: Start, initial sampling, build surrogate model, adaptive sampling, model converged? If true, stop; if false, update the training data set and build the surrogate model again. [Figure labels: model error, black-box function, new model, current model] 9
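The control flow of this loop can be sketched in a few lines of Python. This is only an illustration under loud assumptions: a fixed-degree polynomial fit stands in for ALAMO's surrogate builder, a dense candidate grid stands in for its adaptive sampler, and the function name `adaptive_surrogate` and all of its parameters are hypothetical.

```python
import numpy as np

def adaptive_surrogate(black_box, lower, upper, degree=2,
                       n_init=4, tol=1e-3, max_iter=20):
    """Fit, look for the worst point, add it, repeat until converged."""
    x = np.linspace(lower, upper, n_init)        # initial sampling
    z = black_box(x)
    # Assumption: the black box is cheap enough to evaluate on a grid.
    # A real simulation would be far too expensive for this shortcut.
    candidates = np.linspace(lower, upper, 201)
    for _ in range(max_iter):
        coeffs = np.polyfit(x, z, degree)        # build surrogate model
        errors = np.abs(black_box(candidates) - np.polyval(coeffs, candidates))
        worst = int(np.argmax(errors))           # adaptive sampling
        if errors[worst] < tol:                  # model converged?
            return coeffs, x
        x = np.append(x, candidates[worst])      # update training data set
        z = np.append(z, black_box(candidates[worst]))
    # Iteration budget exhausted: refit on everything collected so far.
    return np.polyfit(x, z, degree), x

# Example: a cubic surrogate of sin(x) on [0, 3].
coeffs, x_train = adaptive_surrogate(np.sin, 0.0, 3.0, degree=3)
print(len(x_train), "training points used")
```

The point of the sketch is the ordering of the steps: new data are requested only where the current model is worst, and the loop ends as soon as no candidate exceeds the tolerance.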

10 MODEL COMPLEXITY TRADEOFF [Plot: model accuracy vs. model complexity, marking a preferred region. Methods shown: linear response surface, Kriging [Krige, 63], neural nets [McCulloch-Pitts, 43], radial basis functions [Buhmann, 00]] 10

11 MODEL IDENTIFICATION Identify the functional form and complexity of the surrogate models Seek models that are combinations of basis functions 1. Simple basis functions 2. Radial basis functions for parametric regression 3. User specified basis functions for tailored regression 11
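A model that is a "combination of basis functions" is linear in its coefficients, so once the basis functions are chosen the fit is an ordinary least-squares problem. The sketch below assumes a single input x > 0 and a small hypothetical basis library; the names `BASIS`, `design_matrix`, and `fit_linear_combination` are illustrative and are not ALAMO's interface.

```python
import numpy as np

# Hypothetical library of simple basis functions of one input x > 0.
BASIS = {
    "1": lambda x: np.ones_like(x),
    "x": lambda x: x,
    "x^2": lambda x: x ** 2,
    "exp(x)": lambda x: np.exp(x),
    "log(x)": lambda x: np.log(x),
}

def design_matrix(x, names):
    """Stack the chosen basis functions column by column."""
    return np.column_stack([BASIS[name](x) for name in names])

def fit_linear_combination(x, z, names):
    """Least-squares coefficients for z ~ sum_j beta_j * basis_j(x)."""
    X = design_matrix(x, names)
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    return dict(zip(names, beta))

# Synthetic data from z = 2 + 3*log(x), chosen only for illustration.
x = np.linspace(1.0, 5.0, 20)
z = 2.0 + 3.0 * np.log(x)
coeffs = fit_linear_combination(x, z, ["1", "log(x)"])
print(coeffs)
```

Which basis functions to keep is the harder, combinatorial question; the least-squares step above is only the inner calculation for one candidate set.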

12 OVERFITTING AND TRUE ERROR Step 1: Define a large set of potential basis functions. Step 2: Model reduction. [Plot: error vs. complexity, showing empirical error and true error; the ideal model lies between the underfitting and overfitting regions] 12

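13 MODEL SELECTION CRITERIA Balance fit (sum of square errors, SSE) with model complexity (number of terms in the model, denoted by p). With N data points and an error-variance estimate $\hat{\sigma}^2$, the criteria take their standard forms:
Corrected Akaike Information Criterion: $\mathrm{AIC}_c = N \log\left(\tfrac{1}{N}\,\mathrm{SSE}\right) + 2p + \frac{2p(p+1)}{N-p-1}$
Mallows Cp: $C_p = \frac{\mathrm{SSE}}{\hat{\sigma}^2} + 2p - N$
Hannan-Quinn Information Criterion: $\mathrm{HQC} = N \log\left(\tfrac{1}{N}\,\mathrm{SSE}\right) + 2p \log(\log N)$
Bayes Information Criterion: $\mathrm{BIC} = \frac{\mathrm{SSE}}{\hat{\sigma}^2} + p \log N$
Mean Squared Error: $\mathrm{MSE} = \frac{\mathrm{SSE}}{N-p-1}$ 13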
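To make the trade-off concrete, the sketch below scores every subset of a small basis set and keeps the one with the lowest criterion value. It assumes the standard textbook forms of the five criteria, with SSE the sum of square errors, n the number of data points, p the number of terms, and sigma2 a user-supplied estimate of the error variance. Brute-force enumeration is a stand-in chosen for readability; it is not the mixed-integer formulation ALAMO solves, and the function names are hypothetical.

```python
import itertools
import numpy as np

def criteria(sse, n, p, sigma2):
    """Standard forms of the five criteria for a p-term model on n points."""
    log_term = n * np.log(max(sse, 1e-300) / n)   # guard against log(0)
    return {
        "AICc": log_term + 2 * p + 2 * p * (p + 1) / (n - p - 1),
        "Cp": sse / sigma2 + 2 * p - n,
        "HQC": log_term + 2 * p * np.log(np.log(n)),
        "BIC": sse / sigma2 + p * np.log(n),
        "MSE": sse / (n - p - 1),
    }

def best_subset(X, z, sigma2, criterion="BIC"):
    """Enumerate every non-empty column subset; keep the lowest score."""
    n, k = X.shape
    best = (np.inf, None, None)
    for p in range(1, k + 1):
        for cols in itertools.combinations(range(k), p):
            Xs = X[:, list(cols)]
            beta, *_ = np.linalg.lstsq(Xs, z, rcond=None)
            sse = float(np.sum((z - Xs @ beta) ** 2))
            score = criteria(sse, n, p, sigma2)[criterion]
            if score < best[0]:
                best = (score, cols, beta)
    return best

# Noise-free data from z = 1 + 2*x^2, so the outcome is deterministic.
x = np.linspace(0.5, 3.0, 40)
X = np.column_stack([np.ones_like(x), x, x ** 2, np.exp(x)])
z = 1.0 + 2.0 * x ** 2
score, cols, beta = best_subset(X, z, sigma2=0.01)
print(cols, beta)   # columns 0 and 2 are the constant and x^2 terms
```

Enumeration visits 2^k - 1 subsets, which is why it only works for a handful of basis functions; larger basis sets need an optimization formulation instead.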

14 BRANCH AND BOUND Falk and Soland, 1969; Soland

15 BRANCH AND REDUCE Constraint propagation and duality based bounds tightening Ryoo and Sahinidis, 1995, 1996 Tawarmalani and Sahinidis, 2004 Finite branching rules Shectman and Sahinidis, 1998 Ahmed, Tawarmalani and Sahinidis, 2004 Convexification Tawarmalani and Sahinidis, 2001, 2002, 2004, 2005 Khajavirad and Sahinidis, 2012, 2013, 2014, 2016 Zorn and Sahinidis, 2013, 2013, 2014 Implemented in BARON First deterministic global optimization solver for NLP and MINLP 15

16 GLOBAL MINLP SOLVERS IN GAMS LindoGlobal (2007) If then else SCIP (2012) MIP technology BARON (2002) Couenne (2009) Trigonometrics Antigone (2013) Transformations 16

17 GLOBAL MINLP SOLVERS ON MINLPLIB2 [Chart: CAN_SOLVE for BARON, SCIP, ANTIGONE, LINDOGLOBAL, COUENNE] Con: 1893 (1 to 164,321), Var: 1027 (3 to 107,223), Disc: 137 (1 to 31,824) 17

18 CPU TIME COMPARISON OF METRICS Eight benchmarks from the UCI and CMU data sets. Seventy noisy data sets were generated with multicollinearity and increasing problem size (number of bases). [Plot: CPU time (s) vs. problem size for Cp, BIC, AIC, MSE, HQC] BIC solves more than two orders of magnitude faster than AIC, MSE and HQC. Optimized directly via a single mixed-integer convex quadratic model. 18

19 MODEL QUALITY COMPARISON BIC leads to smaller, more accurate models. Larger penalty for model complexity. [Plots: spurious variables included vs. problem size, and % deviation from true model vs. problem size, for Cp, BIC, AIC, HQC, MSE] 19

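20 ALAMO: Automated Learning of Algebraic MOdels. Flowchart, with the adaptive sampling step labeled as error maximization sampling: Start, initial sampling, build surrogate model, adaptive sampling, model converged? If true, stop; if false, update the training data set and build the surrogate model again. [Figure label: model error] 20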

21 ERROR MAXIMIZATION SAMPLING Search the problem space for areas of model inconsistency or model mismatch. Find points that maximize the model error with respect to the independent variables. Optimized using the derivative-free solver SNOBFIT (Huyer and Neumaier, 2008). SNOBFIT outperforms most derivative-free solvers (Rios and Sahinidis, 2013). [Figure label: surrogate model] 21
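The idea of maximizing model error over the inputs can be sketched with any derivative-free search. The example below uses plain random search as a loudly labeled stand-in for SNOBFIT, and a squared-error objective as an assumed error measure; `max_error_point` and its parameters are hypothetical names. Note that every trial calls the black box once, which in practice means one simulation run.

```python
import random

def max_error_point(surrogate, black_box, lower, upper,
                    n_trials=2000, seed=0):
    """Random search for the input with the largest squared model error.

    Stand-in for a derivative-free solver such as SNOBFIT: only function
    values are used, never derivatives.
    """
    rng = random.Random(seed)
    best_x, best_err = None, -1.0
    for _ in range(n_trials):
        # Draw one candidate uniformly inside the box [lower, upper].
        x = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        err = (black_box(x) - surrogate(x)) ** 2
        if err > best_err:
            best_x, best_err = x, err
    return best_x, best_err

# Illustration: a linear surrogate of a quadratic black box on [0, 2].
# The squared error (x^2 - x)^2 is largest at the upper bound x = 2.
black_box = lambda x: x[0] ** 2
surrogate = lambda x: x[0]
x_new, err_new = max_error_point(surrogate, black_box, [0.0], [2.0])
print(x_new, err_new)
```

The returned point is the one that would be added to the training set before the surrogate is rebuilt.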

22 COMPUTATIONAL RESULTS Goal: compare methods on three target metrics: (1) model accuracy, (2) data efficiency, (3) model simplicity. Modeling methods compared: ALAMO modeler (proposed best subset methodology); the lasso (lasso regularization); ordinary regression (ordinary least squares regression). Sampling methods compared (over the same data set size): ALAMO sampler (proposed error maximization technique); single LH (single Latin hypercube, no feedback). 22

23 (1) Model accuracy, (2) data efficiency, (3) model simplicity. 70% of problems solved exactly. At the marked point (0.005, 0.80), 80% of the problems had 0.5% error or less. [Plots: fraction of problems solved vs. normalized test error for the ALAMO modeler, the lasso, and ordinary regression; one panel with error maximization sampling, one with a single Latin hypercube] 23

24 (1) Model accuracy, (2) data efficiency, (3) model simplicity. [Plot: by modeling type, median complexity beyond what is required, measured as (number of terms in the surrogate model) / (number of terms in the true function)] Results over a test set of 45 known functions treated as black boxes with bases that are available to all modeling methods. 24

25 KEY INGREDIENT: OPTIMIZATION Surrogate model identification: simple, accurate model identification via integer optimization. Error maximization sampling: more information found per simulated data point via derivative-free optimization. [Figure labels: surrogate model, new surrogate model] 25

26 CONSTRAINED REGRESSION Fits the true function better Fits the data better Extrapolation zone Data space Safe extrapolation 26

27 CONSTRAINED REGRESSION Standard regression: easy. Constrained regression: tough, challenging due to the semi-infinite nature of the regression constraints. Use intuitive restrictions among predictor and response variables to infer nonintuitive relationships between regression parameters. [Figure label: surrogate model] 27

28 IMPLIED PARAMETER RESTRICTIONS Step 1: Formulate constraint in z and x space Step 2: Identify a sufficient set of β space constraints 1 parametric constraint 4 β constraints Global optimization problems solved with BARON 28
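A small worked case shows how one constraint in z and x space can turn into a finite set of β-space constraints. This example is an assumption made for illustration, not the one on the slide: take a surrogate that is linear in x, z(x) = β0 + β1 x, and require the response to stay within bounds z^L and z^U for every x in [x^L, x^U]. Because a linear function attains its extremes at the interval endpoints, enforcing the bounds at the two endpoints is both necessary and sufficient.

```latex
\begin{align*}
&\text{Parametric constraint:} && z^L \le \beta_0 + \beta_1 x \le z^U
    \quad \forall\, x \in [x^L, x^U] \\
&\text{Equivalent } \beta\text{-space constraints:}
    && \beta_0 + \beta_1 x^L \ge z^L, \quad \beta_0 + \beta_1 x^L \le z^U, \\
& && \beta_0 + \beta_1 x^U \ge z^L, \quad \beta_0 + \beta_1 x^U \le z^U
\end{align*}
```

With nonlinear basis functions the points where a bound is most violated are not known in advance, which is where a global optimization step is needed to find them.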

29 TYPES OF RESTRICTIONS Response bounds: pressure, temperature, compositions. Individual responses: mass and energy balances, physical limitations. Multiple responses: mass balances, sum-to-one, state variables. Response derivatives: monotonicity, numerical properties, convexity. Alternative domains: safe extrapolation, boundary conditions. Boundary conditions: add no slip (velocity profile model). [Figure labels: extrapolation zone, problem space] 29

30 CARBON CAPTURE SYSTEM DESIGN Surrogate models for each reactor and technology used Discrete decisions: How many units? Parallel trains? What technology used for each reactor? Continuous decisions: Unit geometries Operating conditions: Vessel temperature and pressure, flow rates, compositions 30

31 BUBBLING FLUIDIZED BED [Diagram: bubbling fluidized bed adsorber with streams: outlet gas, solid feed, cooling water, CO2-rich gas, CO2-rich solid outlet] Model inputs (16 total): geometry (3), operating conditions (5), gas mole fractions (2), solid compositions (2), flow rates (4). Model outputs (14 total): geometry required (2), operating condition required (1), gas mole fractions (3), solid compositions (3), flow rates (2), outlet temperatures (3). Model created by Andrew Lee at the National Energy Technology Laboratory 31

32 EXAMPLE MODELS: ADSORBER [Diagram labels: cooling water, CO2-rich gas, solid feed] 32

33 SUPERSTRUCTURE OPTIMIZATION Mixed-integer nonlinear programming model Economic model Process model Material balances Hydrodynamic/Energy balances Reactor surrogate models Link between economic model and process model Binary variable constraints Bounds for variables MINLP solved with BARON 33

34 CONCLUSIONS ALAMO provides algebraic models that are accurate, simple, and generated from a minimal number of data points. ALAMO's constrained regression facility allows modeling of bounds on response variables, convexity/monotonicity of response variables, and variable groups. Built on top of state-of-the-art optimization solvers. ALAMO site: archimedes.cheme.cmu.edu/?q=alamo 34
