The Genetic Algorithm is Useful to Fitting Input Probability Distributions for Simulation Models

Johann Christoph Strelen
Rheinische Friedrich Wilhelms Universität Bonn
Römerstr. 164, 53117 Bonn, Germany
E-mail: strelen@cs.uni-bonn.de
ASTC2003

- Influences from outside a stochastic discrete-event model are modelled as random variates with a suitable distribution.
- This distribution has to be identified.
- How can the genetic algorithm be used for this purpose?

Problem

Given: Data $x_1, x_2, \dots$, measured for a specific aspect of a real system
Assumption: The data can be modelled as independent realizations of a random variable
Want to know: Which distribution?

Solution in three steps:

| Step | Classical method | Genetic algorithm method |
|---|---|---|
| 1. Which theoretical distribution? | Graphical | Weighted sum of distributions |
| 2. Parameter values? | Maximum likelihood | Genetic algorithm |
| 3. Fits the distribution? | Goodness-of-fit tests | (Objective function) |

Graphical methods: histogram, quantile summaries (box plots)
Theoretical distribution: a family of distributions with parameters, e.g. exponential, Weibull, lognormal.

Outline

- Five purposes, solved with the genetic algorithm (GA)
- Parameter estimation
- Selection of theoretical distributions
- Examples

The genetic algorithm mimics natural biological evolution: the fittest individuals (each corresponding to a tuple of parameter values) survive, which amounts to optimization.

Objective function

$$Z(d) = [\hat{F}(x_1) - F_d(x_1)]^2 + [\hat{F}(x_2) - F_d(x_2)]^2 + \dots$$

- $x_1, x_2, \dots$: data
- $F_d(x)$: selected distribution function
- $d$: parameter tuple
- $\hat{F}(x)$: empirical distribution function

$Z(d)$ measures accuracy: the smaller, the better. The genetic algorithm searches for a minimum in the parameter space.
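As a rough illustration of this objective function, the following Python sketch evaluates $Z(d)$ for a Weibull distribution function. The helper names (`empirical_cdf`, `weibull_cdf`, `objective`), the convention $\hat{F}(x_{(i)}) = i/n$ for the empirical distribution function, and the synthetic sample are assumptions made for this example; the slides do not specify them.

```python
import math
import random

def empirical_cdf(data):
    """Return the sorted data and the empirical distribution function at those points."""
    xs = sorted(data)
    n = len(xs)
    # Assumed convention: F_hat at the i-th smallest observation is i / n.
    return xs, [(i + 1) / n for i in range(n)]

def weibull_cdf(x, shape, scale):
    """Weibull distribution function 1 - exp[-(x/scale)^shape], zero for x <= 0."""
    if x <= 0.0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))

def objective(d, data, cdf=weibull_cdf):
    """Z(d): sum of squared differences between empirical and candidate CDF."""
    xs, f_hat = empirical_cdf(data)
    return sum((fh - cdf(x, *d)) ** 2 for x, fh in zip(xs, f_hat))

rng = random.Random(1)
# random.weibullvariate(alpha, beta) takes the scale first, then the shape.
sample = [rng.weibullvariate(3.0, 2.0) for _ in range(800)]

z_true = objective((2.0, 3.0), sample)    # parameters that generated the sample
z_wrong = objective((5.0, 10.0), sample)  # deliberately poor parameters
print(z_true, z_wrong)
```

The parameter tuple that generated the sample should give a much smaller value than the deliberately poor one, which is the property the genetic algorithm exploits.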

The Genetic Algorithm

- Generations of populations are generated.
- Population: a set of individuals.
- Individual: concatenated encoded parameter values (the chromosome).
- Gray code: binary encoding between a given minimum and maximum, in which adjacent values differ in just one bit.
- Fitness of an individual: a measure of its accuracy (the objective function).
- After a specified number of generations, the individual with the best objective function value is the result.
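The Gray-code encoding of a single parameter can be sketched as follows. This uses the standard reflected binary Gray code; the function names, the fixed bit width, and the linear mapping of the interval onto integer levels are assumptions for illustration and are not taken from the slides.

```python
def binary_to_gray(k):
    """Reflected binary Gray code of a non-negative integer."""
    return k ^ (k >> 1)

def gray_to_binary(g):
    """Inverse of binary_to_gray."""
    k = 0
    while g:
        k ^= g
        g >>= 1
    return k

def encode(value, lo, hi, bits):
    """Map a value in [lo, hi] to a Gray-coded bit string of the given length."""
    levels = (1 << bits) - 1
    k = round((value - lo) / (hi - lo) * levels)
    k = max(0, min(levels, k))  # clamp values outside [lo, hi]
    return format(binary_to_gray(k), f"0{bits}b")

def decode(chromosome, lo, hi):
    """Map a Gray-coded bit string back to a value in [lo, hi]."""
    levels = (1 << len(chromosome)) - 1
    return lo + gray_to_binary(int(chromosome, 2)) * (hi - lo) / levels

chromosome = encode(3.0, 0.0, 10.0, 16)
print(chromosome, decode(chromosome, 0.0, 10.0))
```

A chromosome for several parameters would then simply be the concatenation of such bit strings, one per parameter.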

The Genetic Algorithm (2)

A new generation with new individuals is formed as follows:

- Generation gap: the fittest old individuals remain unchanged.
- Selection: the others are selected randomly, according to their fitness, for breeding offspring.
- Recombination of parts of two chromosomes (crossover).
- Mutation: single bits change their state with some probability.
- Reinsertion: the least fit parents are replaced with offspring.
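The steps above can be sketched as one generation step on bit-string chromosomes. This is a minimal sketch under stated assumptions, not the implementation used for the results in this talk: the rank-based selection weights, one-point crossover, the default values of `gap` and `p_mut`, and the toy objective `count_zeros` are all choices made for illustration.

```python
import random

def next_generation(population, objective, rng, gap=0.8, p_mut=0.02):
    """One GA step on equal-length bit strings; a smaller objective is better."""
    ranked = sorted(population, key=objective)  # best individual first
    n = len(ranked)
    # Generation gap: the fittest n - n_off individuals survive unchanged.
    n_off = min(n - 1, max(1, round(gap * n)))
    survivors = ranked[: n - n_off]
    # Selection: rank-based weights (an assumption), better rank = higher weight.
    weights = [n - rank for rank in range(n)]
    offspring = []
    for _ in range(n_off):
        a, b = rng.choices(ranked, weights=weights, k=2)
        # Recombination: one-point crossover.
        cut = rng.randrange(1, len(a))
        child = a[:cut] + b[cut:]
        # Mutation: flip each bit with probability p_mut.
        child = "".join(
            ("1" if bit == "0" else "0") if rng.random() < p_mut else bit
            for bit in child
        )
        offspring.append(child)
    # Reinsertion: offspring take the places of the least fit parents.
    return survivors + offspring

rng = random.Random(0)
pop = ["".join(rng.choice("01") for _ in range(16)) for _ in range(20)]
count_zeros = lambda c: c.count("0")  # toy objective, minimal for all ones
best_before = min(map(count_zeros, pop))
for _ in range(30):
    pop = next_generation(pop, count_zeros, rng)
best_after = min(map(count_zeros, pop))
print(best_before, best_after)
```

Because the fittest individuals are carried over unchanged, the best objective value in the population cannot get worse from one generation to the next.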

Purpose 1, Parameter Estimation

- Given: data and a family of distributions.
- Calculate parameter values such that the given data are fitted.
- The number of parameters is less critical than it is when nonlinear equations have to be solved.

Example 1, Fitting Data Drawn from a Weibull Distribution

- 800 realizations, distribution function $F(x) = 1 - \exp[-(x/3)^2]$
- Fitted: $F(x) = 1 - \exp[-(x/3.031)^{2.047}]$
- 200 generations, accuracy $Z(2.047, 3.031) = 0.0293$

Figure: smooth curve = fitted theoretical distribution function; jagged curve = empirical distribution function.

Sometimes the distribution function has no closed form, e.g. for the Gamma or lognormal distribution.

Calculate the distribution function $F_d(x)$ at the values $x_i$, $i = 1, \dots, n$, approximately with simple numerical integration from the density $f_d(x)$:

$$\tilde{F}(x_1) = 1/n,$$
$$\tilde{F}(x_i) = \sum_{j=2}^{i} (x_j - x_{j-1})\, f_d(x_{j-1}), \quad i = 2, \dots, n,$$
$$F_d(x_i) \approx \tilde{F}(x_i) / \tilde{F}(x_n), \quad i = 1, \dots, n.$$
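A small Python sketch of this numerical integration follows. It assumes the $x_i$ are sorted in ascending order and follows the rule as stated on the slide (a left-endpoint sum, normalized by its last value). The Gamma density with scale parameter $\beta$, the regular grid standing in for data, and the function names are assumptions for illustration.

```python
import math

def gamma_density(x, alpha, beta):
    """Gamma density with shape alpha and scale beta."""
    return x ** (alpha - 1) * math.exp(-x / beta) / (beta ** alpha * math.gamma(alpha))

def approx_cdf(xs, density):
    """Approximate the distribution function at sorted points xs (len(xs) >= 2)."""
    n = len(xs)
    partial = [1.0 / n]  # value assigned to the first point, as on the slide
    total = 0.0
    for j in range(1, n):
        # Left-endpoint rule on the interval [x_{j-1}, x_j].
        total += (xs[j] - xs[j - 1]) * density(xs[j - 1])
        partial.append(total)
    # Normalize so that the value at the largest point is exactly 1.
    return [v / partial[-1] for v in partial]

# Hypothetical evaluation points: a regular grid instead of measured data.
xs = [0.05 * i for i in range(1, 801)]
values = approx_cdf(xs, lambda x: gamma_density(x, 3.0, 3.0))
print(values[159])  # approximation at x = 8.0
```

With real data the points are irregularly spaced, so the accuracy of the approximation depends on how densely the sample covers the support.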

Example 3, Gamma Distribution with Numerical Integration

- 800 realizations of a Gamma random variable
- Density $f(x) = \beta^{-\alpha} x^{\alpha-1} \exp[-(x/\beta)] / \Gamma(\alpha)$, with $\alpha = 3$ and $\beta = 3$
- After 100 generations: fitted density with $\alpha = 3.01$ and $\beta = 2.94$
- Accuracy 0.031

Purpose 2, Similar to Purpose 1 but with Multi-Mode Distributions

$$F(x) = p_1 F_1(x) + p_2 F_2(x) + \dots, \quad p_1 + p_2 + \dots = 1,$$

where $F_1, F_2, \dots$ belong to the same family of distributions but have different parameter values.
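The weighted sum can be written down directly in code. The sketch below builds a two-component Weibull mixture with the true parameters of the two-mode Weibull example in this talk (weights 0.5 and 0.5, shape 2 with scale 3, shape 5 with scale 17); the function names and the weight check are assumptions for illustration.

```python
import math

def weibull_cdf(x, shape, scale):
    """Weibull distribution function, zero for x <= 0."""
    if x <= 0.0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))

def mixture_cdf(x, weights, components):
    """F(x) = p_1 F_1(x) + p_2 F_2(x) + ..., with weights summing to one."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    return sum(p * F(x) for p, F in zip(weights, components))

def two_mode(x):
    """Two-mode Weibull mixture with the example's true parameters."""
    return mixture_cdf(
        x,
        [0.5, 0.5],
        [lambda t: weibull_cdf(t, 2.0, 3.0), lambda t: weibull_cdf(t, 5.0, 17.0)],
    )

print(two_mode(10.0))
```

For the genetic algorithm, the weights and the parameters of every component would all be encoded in the same chromosome, so the parameter tuple $d$ simply becomes longer.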

Example 2, Two-Mode Weibull Distribution

- 800 realizations, distribution function $F(x) = 1 - 0.5 \exp[-(x/3)^2] - 0.5 \exp[-(x/17)^5]$
- Fitted: $F(x) = 1 - 0.51 \exp[-(x/2.95)^{2.05}] - 0.49 \exp[-(x/16.98)^{5.14}]$
- Accuracy 0.024

Purpose 3, Mixed Different Distributions

Purpose 4, Falsification of a Theoretical Distribution

- One tries to fit a theoretical distribution to the data.
- If the objective function $Z(d)$ remains large, the theoretical distribution is a bad fit.

Example 4, Fitting a Wrong Distribution

- 3200 realizations of a Weibull random variable
- Tried to fit these data with a Gamma distribution
- After 800 generations: accuracy only 0.76, so the Gamma distribution is not suited

Purpose 5, Automatic Selection of a Theoretical Distribution

One tries to fit the data with a mixed distribution

$$F(x) = p_1 F_1(x) + p_2 F_2(x) + \dots, \quad p_1 + p_2 + \dots = 1,$$

where $F_1, F_2, \dots$ are different theoretical distributions.

- If $p_1 \approx 1$, $p_2 \approx 0$, $p_3 \approx 0, \dots$, then $F_1(x)$ is a good fit.
- If instead several $p_i$ are significantly greater than zero, the corresponding mixed distribution is a good fit.

Example 5, Decision between Gamma and Weibull Distribution

- 400 realizations of a Weibull random variable
- Tried to fit with a mixed Weibull and Gamma distribution function: $F(x) = p\,F^{(\mathrm{Weibull})}(x) + (1 - p)\,F^{(\mathrm{Gamma})}(x)$
- After 400 generations: $p = 0.998$, accuracy 0.047, so Weibull is well suited

Comparing the Maximum-Likelihood Method with the Genetic Algorithm Method

| | Genetic algorithm | Maximum-likelihood method |
|---|---|---|
| Selection of a theoretical distribution | Can be applied | Another technique must be used (e.g. graphical) |
| Uniformity | Uniform for all theoretical distributions, only two variants: closed-form distribution function or numerically integrated density | Different nonlinear equations for each theoretical distribution |
| Many parameters | No problem | Difficult |
| Multi-mode and mixed distributions | Hence straightforward | Difficult |

Future work: genetic algorithms for dependent data and stochastic processes.