An Evaluation Framework for the Comparison of Fine-Grained Predictive Models in Health Care

Ward R.J. van Breda¹, Mark Hoogendoorn¹, A.E. Eiben¹, and Matthias Berking²

¹ VU University Amsterdam, Department of Computer Science, De Boelelaan 1081, 1081 HV Amsterdam, The Netherlands, {w.r.j.van.breda,m.hoogendoorn,a.e.eiben}@vu.nl
² Friedrich-Alexander University Erlangen-Nuremberg, Germany, matthias.berking@fau.de

Abstract. Within the domain of health care, more and more fine-grained models are observed that predict the development of specific health (or disease-related) states over time. This is due to the increased use of sensors, allowing for continuous assessment and leading to a sharp increase of data. These specific models are often much more complex than high-level predictive models that e.g. give a general risk score for a disease, making the evaluation of these models far from trivial. In this paper, we present an evaluation framework which is able to score fine-grained temporal models that aim at predicting multiple health states, considering their capability to describe data, their capability to predict, the quality of the model's parameters, and the model complexity.

1 Introduction

Predictive modeling is of utmost importance in health care and has the potential to be the basis for prevention strategies, especially early and highly personalized interventions. Predictive models can vary greatly in their level of granularity, ranging from relatively coarse-grained models that e.g. provide a general disease risk score, to highly fine-grained ones that predict detailed developments of disease and/or disease-relevant states over time. The latter have received more and more attention due to improvements in measurement capabilities. The evaluation of fine-grained models is unfortunately far from trivial, as such models are often of a temporal nature, predict multiple states, have parameters with which the model can be personalized, et cetera. In order to make informed decisions during model development, or when comparing models, there is a need to understand how model quality can be evaluated in a rigorous way.

In this paper, we present an evaluation framework for fine-grained predictive models. The framework considers the following aspects: (1) the descriptive capability; (2) the predictive capability; (3) parameter sensitivity; and (4) model complexity. The weight of each of these criteria can be set according to the characteristics of the specific disease or health aspects under consideration.

© Springer International Publishing Switzerland 2015. J.H. Holmes et al. (Eds.): AIME 2015, LNAI 9105, pp. 148-152, 2015. DOI: 10.1007/978-3-319-19551-3_18
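As a concrete illustration of how the four criterion scores are meant to be combined (the weighted sum is described in Sect. 3; the function name and example weights below are our own assumptions, not taken from the paper), a minimal sketch in Python:

```python
def overall_score(descriptive, predictive, sensitivity, complexity,
                  weights=(0.4, 0.3, 0.2, 0.1)):
    """Weighted sum of the four criterion scores, each assumed to lie in
    [0, 1]. The weights shown are hypothetical; in the framework they are
    chosen according to the disease or health aspect under consideration."""
    scores = (descriptive, predictive, sensitivity, complexity)
    return sum(w * s for w, s in zip(weights, scores))
```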

2 Evaluation Framework

Scope of Framework. The framework we describe is meant for domains with temporal data over a number of attributes $a_1, \dots, a_m$ for a number of patients $b_1, \dots, b_n$. Attributes represent an aspect of the health state of a patient, and we denote the domain of $a_i$ by $A_i$. At any given time $t$, the state of a patient $b$ is a vector $s(b, t) \in A_1 \times \dots \times A_m$. To designate the specific value of an attribute we use the notation $s(b, t, i) \in A_i$. We assume a dataset that contains a state for each patient for a number of time instances $t_1, \dots, t_{end}$. These measured data are contained in the matrix $Z$, where at any given time $t_j$ ($j = 1, \dots, end$) the observed state of patient $b$ is a vector $z(b, t_j) \in A_1 \times \dots \times A_m$.

Furthermore, we consider models composed from rules for state transition, and assume one rule for each of the given attributes that describes the value of $a_i(t+1)$ based on $a_i(t)$ and possibly some other attributes. Formally, we have $r_i : A_1 \times \dots \times A_m \to A_i$. Obviously, a transition rule for attribute $i$ may not need all other attributes, only a few of them; in the extreme case only $a_i$ itself. A model $M$ is then a composed entity that can predict the consecutive states of any given patient. Thus, $M : A_1 \times \dots \times A_m \to A_1 \times \dots \times A_m$, and given a state $s(b, t)$ of patient $b$ at time $t$ we denote the predicted state at time $t+1$ by $M(s(b, t))$. Each model is equipped with parameters; a model instance is a model for which a value has been assigned to each parameter, denoted as $M_p$ where $p \in P$ and $P$ is the set of possible parameter value vectors.

To employ a model $M$ for predicting a full sequence of states for a patient $b$, it needs to be applied to the first observed state $z(b, t_1)$ and then iteratively to the outcomes. To avoid unnecessarily heavy notation, we define these predicted states iteratively as follows. Given a patient $b$ and a model instance $M_p$, the predicted state at the start is the observed state, $s_{M_p}(b, 1) = z(b, 1)$, and for all $t = 1, \dots, end - 1$:

$$s_{M_p}(b, t+1) = M_p(s_{M_p}(b, t))$$

The goal of the model instance is to minimize the difference between the predicted and observed states. The error of a model instance $M_p$ for a patient $b$ over the full observation period is then

$$e(b, M_p) = \sum_{t=t_1}^{t_{end}} D(z(b, t), s_{M_p}(b, t))$$

where $D$ is some application-specific measure of difference between two states. Furthermore, $e(b, M_p, i)$ denotes the error the model makes on an attribute $i$. Throughout this paper, the Mean Squared Error (MSE) is assumed to be the error measure.
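To make the notation concrete, here is a minimal sketch of the iterative rollout $s_{M_p}$ and the per-attribute MSE $e(b, M_p, i)$, assuming states are NumPy vectors and a model instance is any callable from state to next state; the function names and the decay rule in the example are our own, purely illustrative choices:

```python
import numpy as np

def rollout(model_instance, z, t_end):
    """Predicted states s_{M_p}(b, t): start from the first observed state
    z(b, t_1) and iteratively feed each prediction back into M_p."""
    s = np.empty((t_end, z.shape[1]))
    s[0] = z[0]                          # s_{M_p}(b, 1) = z(b, 1)
    for t in range(t_end - 1):
        s[t + 1] = model_instance(s[t])  # s_{M_p}(b, t+1) = M_p(s_{M_p}(b, t))
    return s

def attribute_errors(z, s):
    """e(b, M_p, i): Mean Squared Error per attribute between the observed
    states z and the predicted states s."""
    return ((z - s) ** 2).mean(axis=0)

# Example: one attribute, one hypothetical transition rule r_1(a_1) = 0.8 * a_1.
z = np.array([[1.0], [0.80], [0.65], [0.50]])
s = rollout(lambda state: 0.8 * state, z, t_end=4)
print(attribute_errors(z, s))  # one MSE value per attribute
```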

Descriptive Capability. To express the quality of a model $M$ for a given patient $b$ we consider the error $e(b, M, a_i)$ on each attribute $a_i$. This implies a multi-objective optimization (MOO) problem, where each objective corresponds to one attribute. In the sequel we assume that we have and use a multi-objective optimization algorithm that searches the space of model instances $M_p$ based on a training set. For examples of MOO algorithms, see [2], [3]. Given a patient $b$, the output of one run of the algorithm is a set of non-dominated model instances, where dominance is defined in the usual manner: $M_p$ dominates $M_{p'}$ if for each attribute $a_i$ the error $e(b, M_p, i)$ is lower than or equal to $e(b, M_{p'}, i)$, and there is at least one attribute $a_j$ where the error $e(b, M_p, j)$ is lower than $e(b, M_{p'}, j)$. Because each model instance corresponds to one point in the space of model parameter vectors, the set of non-dominated model instances forms a Pareto front in this vector space.¹

Due to the typically stochastic nature of the algorithms used to find such assignments, it is assumed that multiple runs of the algorithm are performed per patient. Each run $r$ delivers a Pareto front of $q$ non-dominated model instances $\{M_{p_1,r}, \dots, M_{p_q,r}\}$, and each model instance $M_{p_i,r}$ has a corresponding $m$-dimensional error vector $(e(b, M_{p_i,r}, 1), \dots, e(b, M_{p_i,r}, m))$, where $m$ is the number of attributes. Each of these $q$ vectors can be plotted using an attainment surface [1]. Given such an attainment surface, its dominated hypervolume can be calculated: the volume above the attainment surface, relative to an error reference point, which is set to 1 for each objective based on the assumption that all error values are scaled to the interval [0, 1]. For more details, see [1]. We use the notation $nh_M(b, r)$ to denote the dominated hypervolume for run $r$ of the MOO algorithm optimizing model $M$ for patient $b$. Executing $r_{max}$ runs of the MOO algorithm, we obtain $r_{max}$ attainment surfaces and $r_{max}$ values of $nh_M(b, r)$. Taking all runs into account, we can now define model quality by averaging the values of the non-dominated hypervolumes:

$$nh_M(b) = \frac{\sum_{r=1}^{r_{max}} nh_M(b, r)}{r_{max}}$$

Each patient has its own unique value, and we therefore end up with a vector of $n$ values: $(nh_M(b_1), \dots, nh_M(b_n))$. The final score of the criterion is then defined as the multiplication of the mean $\mu_{nh_M}$ of the values in this vector with 1 minus their standard deviation $\sigma_{nh_M}$, since a high mean is good in combination with a small standard deviation:

$$\text{descriptive score}_M = \mu_{nh_M} (1 - \sigma_{nh_M})$$

¹ The dimensionality of this vector space depends on the number of model parameters.
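A sketch of how the descriptive score could be assembled from error vectors. The Pareto filter implements the dominance test above; a Monte Carlo estimate stands in for the exact dominated-hypervolume computation via attainment surfaces of [1]; all function names and the sample count are our own assumptions:

```python
import numpy as np

def pareto_front(errors):
    """Reduce an array of error vectors (one row per model instance) to the
    non-dominated set, using the dominance definition above."""
    keep = [i for i, e in enumerate(errors)
            if not any(np.all(f <= e) and np.any(f < e)
                       for j, f in enumerate(errors) if j != i)]
    return errors[keep]

def dominated_hypervolume(front, n_samples=100_000, seed=0):
    """Monte Carlo estimate of nh_M(b, r): the fraction of [0, 1]^m dominated
    by a Pareto front of error vectors scaled to [0, 1], relative to the
    reference point (1, ..., 1)."""
    pts = np.random.default_rng(seed).random((n_samples, front.shape[1]))
    covered = np.any(np.all(pts[:, None, :] >= front[None, :, :], axis=2), axis=1)
    return covered.mean()

def descriptive_score(nh):
    """nh: (n_patients, r_max) array of per-run hypervolumes nh_M(b, r)."""
    per_patient = nh.mean(axis=1)                          # nh_M(b) per patient
    return per_patient.mean() * (1.0 - per_patient.std())  # mu * (1 - sigma)
```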

Predictive Capability. Next to a good descriptive capability, we also want the model to perform well on the test set, i.e. to have a good predictive capability. Hereto, we define two measurements: (1) the absolute predictive performance on the test set, and (2) the relationship between how well the model performs on the training set and on the test set; we would prefer these to go hand-in-hand, and call this the relative predictive performance.

Concerning the absolute predictive performance, for a model $M$ we calculate the mean $\mu_{e_M}(j)$ and the standard deviation $\sigma_{e_M}(j)$ over the set of errors belonging to attribute $j$ for the $q$ model instances per run and $r_{max}$ runs, for all patients (i.e. 1 to $n$), resulting in a set of $q \cdot r_{max} \cdot n$ errors. We then take the average of the mean and of the standard deviation over all $m$ attributes, giving $\mu_{e_M}$ and $\sigma_{e_M}$. The absolute predictive performance score then becomes:

$$\text{absolute pred score}_M = (1 - \mu_{e_M})(1 - \sigma_{e_M})$$

To measure the relation between the errors on the training and test set: for the $q$ model instances per run and $r_{max}$ runs, we end up with a total of $q \cdot r_{max}$ model instances with specific parameter vectors per patient $b$. For each attribute $j$ we determine the correlation between the errors of each of the $q \cdot r_{max}$ model instances on the training and the test set for each individual patient (whereby the training set is the first period of data and the test set the second, later, period): $e_{train}(b, M_{p_i,r}, j)$ and $e_{test}(b, M_{p_i,r}, j)$. Hereto, we use the correlation (abbreviating training and test to $tr$ and $te$, respectively) for a specific model $M$:

$$cor_M(b, j) = \frac{\sum_{i=1}^{q} \sum_{r=1}^{r_{max}} \left( e_{tr}(b, M_{p_i,r}, j) - \overline{e_{tr}}(b, M, j) \right) \left( e_{te}(b, M_{p_i,r}, j) - \overline{e_{te}}(b, M, j) \right)}{\sqrt{\sum_{i=1}^{q} \sum_{r=1}^{r_{max}} \left( e_{tr}(b, M_{p_i,r}, j) - \overline{e_{tr}}(b, M, j) \right)^2} \sqrt{\sum_{i=1}^{q} \sum_{r=1}^{r_{max}} \left( e_{te}(b, M_{p_i,r}, j) - \overline{e_{te}}(b, M, j) \right)^2}}$$

Then, we calculate the average $\mu_{cor_M}$ across the set of all correlations of all patients (i.e. 1 to $n$) and attributes (i.e. 1 to $m$), namely $cor_M(b_1, a_1), \dots, cor_M(b_n, a_1), \dots, cor_M(b_n, a_m)$, as well as the standard deviation $\sigma_{cor_M}$:²

$$\text{relative pred score}_M = \max(\mu_{cor_M}, 0)(1 - \sigma_{cor_M})$$

Given weights $w_1$ and $w_2$, a final evaluation score for model $M$ is calculated:

$$\text{predictive score}_M = w_1 \cdot \text{absolute pred score}_M + w_2 \cdot \text{relative pred score}_M$$

Parameter Sensitivity. The parameter sensitivity is the most complex metric. In the current version we keep it as simple as possible. We want to avoid meaningless parameters that do not have any influence on the performance of the model. Therefore we look at the relationship between parameters and the various evaluation objectives. Defining $p_{i,r,b}(u)$ as the value of parameter $u$ for model instance $i$ from run $r$ of patient $b$, we define the correlation between a parameter $k$ and the resulting error on an attribute $j$ for model $M$ as follows:

$$cor_M(b, j, k) = \frac{\sum_{i,r} \left( e_{tr}(b, M_{p_{i,r,b}}, j) - \overline{e_{tr}}(b, M_p, j) \right) \left( p_{i,r,b}(k) - \overline{p_b}(k) \right)}{\sqrt{\sum_{i,r} \left( e_{tr}(b, M_{p_{i,r,b}}, j) - \overline{e_{tr}}(b, M_p, j) \right)^2} \sqrt{\sum_{i,r} \left( p_{i,r,b}(k) - \overline{p_b}(k) \right)^2}}$$

If a parameter always has a correlation close to zero (across the different evaluation criteria and patients), this is considered bad. Thus, we look for the maximum of the absolute value of the correlation found in the set of all patients and criteria, which gives an indication of whether a parameter has a use (i.e. a correlation) in at least one patient/model instance combination. We define a correlation as useless (or weak) if it falls under a boundary of 0.35 [4]:

$$useful_M(p_k) = \begin{cases} 1 & \text{if } \max_{b \in [1,n],\, j \in [1,m]} \left| cor_M(b, j, k) \right| \geq 0.35 \\ 0 & \text{otherwise} \end{cases}$$

² Note that with respect to the mean, a cutoff point of 0 is used (via the max operator), as we consider all correlations below 0 equally bad.
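Minimal sketches of the two scores just defined, the predictive score and the parameter-usefulness test, assuming all errors are pre-scaled to [0, 1] and arrays are laid out as (patients, q * r_max instances, attributes or parameters); these layouts and all names are our own choices, not the paper's:

```python
import numpy as np

def predictive_score(e_train, e_test, w1=0.5, w2=0.5):
    """e_train, e_test: (n, q * r_max, m) arrays of per-attribute errors on
    the training and test period for every model instance of every patient."""
    # Absolute: mean and std per attribute over all q*r_max*n errors,
    # then averaged over the m attributes.
    mu, sigma = e_test.mean(axis=(0, 1)).mean(), e_test.std(axis=(0, 1)).mean()
    absolute = (1 - mu) * (1 - sigma)
    # Relative: per patient b and attribute j, the correlation between the
    # training and test errors of the q*r_max instances (assumed non-constant).
    n, _, m = e_train.shape
    cors = np.array([np.corrcoef(e_train[b, :, j], e_test[b, :, j])[0, 1]
                     for b in range(n) for j in range(m)])
    relative = max(cors.mean(), 0.0) * (1 - cors.std())
    return w1 * absolute + w2 * relative

def useful_parameters(e_train, params, threshold=0.35):
    """params: (n, q * r_max, u) parameter vectors p_{i,r,b}. Parameter k is
    useful if |cor_M(b, j, k)| >= 0.35 for at least one patient/attribute
    combination [4]."""
    n, _, m = e_train.shape
    return np.array([any(abs(np.corrcoef(params[b, :, k],
                                         e_train[b, :, j])[0, 1]) >= threshold
                         for b in range(n) for j in range(m))
                     for k in range(params.shape[2])])
```

The sensitivity score formalized next would then simply be useful_parameters(...).mean(), i.e. the fraction of useful parameters.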

Finally, we calculate the fraction of useful parameters in a model as its score for parameter sensitivity, where $|P|$ is the number of elements in the parameter vector (and thus the highest parameter number of the model):

$$\text{sensitivity score}_M = \frac{\sum_{k=1}^{|P|} useful_M(p_k)}{|P|}$$

Model Complexity. The model complexity counts the number of states and parameters:

$$\text{complexity score}_M = 1 - \frac{|A_M| + |P_M|}{\max_{mo \in Models} (|A_{mo}| + |P_{mo}|)}$$

The score is scaled according to the maximum complexity of the models that are subject to evaluation.

3 Discussion

This paper has introduced a framework that is able to evaluate fine-grained temporal predictive models for health care. Hereto, several criteria have been introduced, which can be combined by taking a weighted sum of the different scores, selecting the weights depending on the importance of each criterion for the case at hand. Initial experiments suggest that the framework is able to generate important insights into the properties of the models. For future work we want to test and refine the framework by further investigating the usefulness and performance of the different criteria and related metrics. Furthermore, we want to use the framework as a fitness function for automatically generating predictive dynamic models using genetic programming techniques.

Acknowledgements. This research has been performed in the context of the EU FP7 project E-COMPARED (project number 603098).

References

1. Deb, K.: Multi-Objective Optimization Using Evolutionary Algorithms, vol. 16. John Wiley & Sons (2001)
2. Deb, K., Pratap, A., Agarwal, S., Meyarivan, T.: A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation 6(2), 182-197 (2002)
3. Mlakar, M., Petelin, D., Tušar, T., Filipič, B.: GP-DEMO: Differential evolution for multiobjective optimization based on Gaussian process models. European Journal of Operational Research (2014)
4. Taylor, R.: Interpretation of the correlation coefficient: a basic review. Journal of Diagnostic Medical Sonography 6(1), 35-39 (1990)