Parametrized Genetic Algorithms for NP-hard problems on Data Envelopment Analysis
Juan Aparicio (1), Domingo Giménez (2), Martín González (1), José J. López-Espín (1), Jesús T. Pastor (1)
(1) Miguel Hernández University, (2) University of Murcia, Spain
The Santander Chair 2015 International Workshop of Efficiency and Productivity, Alicante, June 12, 2015
Outline
1 Data Envelopment Analysis
2 Valid Solutions
3 Genetic algorithm
4 Hybrid metaheuristics
5 Conclusions and future work
DEA (Data Envelopment Analysis): a non-parametric technique for estimating the efficiency of a set of entities, called DMUs (Decision Making Units), all operating in the same technological environment.
Each DMU j consumes m inputs, denoted (x_1j, ..., x_mj), to produce s outputs, denoted (y_1j, ..., y_sj).
DEA also provides benchmarking information that indicates how inefficiency can be removed.
Objective: estimate the production frontier and the technical efficiency of each DMU (the distance from each interior DMU to the boundary of the technology).
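Mathematical linear programming model (Aparicio et al., 2007)

$$
\begin{aligned}
\max\quad & \beta_k - \frac{1}{m}\sum_{i=1}^{m}\frac{t^-_{ik}}{x_{ik}} \\
\text{s.t.}\quad & \beta_k + \frac{1}{s}\sum_{r=1}^{s}\frac{t^+_{rk}}{y_{rk}} = 1 && \text{(c.1)} \\
& -\beta_k x_{ik} + \sum_{j=1}^{n}\alpha_{jk}x_{ij} + t^-_{ik} = 0 \quad \forall i && \text{(c.2)} \\
& -\beta_k y_{rk} + \sum_{j=1}^{n}\alpha_{jk}y_{rj} - t^+_{rk} = 0 \quad \forall r && \text{(c.3)} \\
& -\sum_{i=1}^{m}\nu_{ik}x_{ij} + \sum_{r=1}^{s}\mu_{rk}y_{rj} + d_{jk} = 0 \quad \forall j && \text{(c.4)} \\
& \nu_{ik} \ge 1 \quad \forall i && \text{(c.5)} \\
& \mu_{rk} \ge 1 \quad \forall r && \text{(c.6)} \\
& d_{jk} \le M b_{jk} \quad \forall j && \text{(c.7)} \\
& \alpha_{jk} \le M(1 - b_{jk}) \quad \forall j && \text{(c.8)} \\
& b_{jk} \in \{0, 1\} && \text{(c.9)} \\
& \beta_k \ge 0 && \text{(c.10)} \\
& t^-_{ik} \ge 0 \quad \forall i && \text{(c.11)} \\
& t^+_{rk} \ge 0 \quad \forall r && \text{(c.12)} \\
& d_{jk} \ge 0 \quad \forall j && \text{(c.13)} \\
& \alpha_{jk} \ge 0 \quad \forall j && \text{(c.14)}
\end{aligned}
$$

It must be solved n times, once for each DMU.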
Approaches to the problem
Problem: a combinatorial NP-hard problem for which the existing methods are unsatisfactory. Exact solutions are obtained only for small problem sizes.
Possible solution: metaheuristic algorithms.
The main obstacle to applying metaheuristics is the difficulty of obtaining solutions that satisfy all the constraints:
In 2014, reduced problems with fewer than 14 constraints.
Now, all the constraints are considered, a Genetic Algorithm generates a higher percentage of valid solutions, and the solutions are improved with hybrid metaheuristics.
Representation of solutions
A solution is represented by a vector of real and binary values satisfying the 14 constraints.
Binary part: b_0k ... b_jk
Real part: β_k, α_0k ... α_jk, t^-_0k ... t^-_ik, t^+_0k ... t^+_rk
Fitness: value returned by the objective function,

$$
\beta_k - \frac{1}{m}\sum_{i=1}^{m}\frac{t^-_{ik}}{x_{ik}}
$$

Initialization: use of heuristics to generate valid solutions.
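A minimal Python sketch of how a candidate solution could be scored and partially checked. It assumes that the objective is β_k minus the mean of t^-_ik / x_ik, and that c.2 and c.3 read β_k x_ik = Σ_j α_jk x_ij + t^-_ik and β_k y_rk = Σ_j α_jk y_rj − t^+_rk (the signs are an assumed reading of the model). The function names, the array layout of X and Y, and the tolerance `tol` are hypothetical and not part of the slides.

```python
import numpy as np


def fitness(beta, t_minus, x_k):
    """Objective value: beta_k - (1/m) * sum_i t_minus[i] / x_k[i] (assumed sign)."""
    return beta - np.mean(t_minus / x_k)


def check_primal(beta, alpha, t_minus, t_plus, X, Y, k, tol=1e-9):
    """Check c.1-c.3 and the sign constraints c.10-c.12, c.14 for DMU k.

    X has shape (m, n) and Y has shape (s, n); column j holds DMU j.
    Only the part of the model involving beta, alpha and the slacks is checked.
    """
    x_k, y_k = X[:, k], Y[:, k]
    c1 = abs(beta + np.mean(t_plus / y_k) - 1.0) <= tol
    c2 = np.all(np.abs(-beta * x_k + X @ alpha + t_minus) <= tol)
    c3 = np.all(np.abs(-beta * y_k + Y @ alpha - t_plus) <= tol)
    signs = (beta >= 0 and np.all(t_minus >= -tol)
             and np.all(t_plus >= -tol) and np.all(alpha >= -tol))
    return bool(c1 and c2 and c3 and signs)


# Toy data (hypothetical): 2 inputs, 1 output, 2 DMUs.
X = np.array([[2.0, 4.0], [3.0, 1.0]])
Y = np.array([[1.0, 2.0]])
# DMU 0 projected onto itself: alpha = e_0, beta = 1, zero slacks.
alpha = np.array([1.0, 0.0])
ok = check_primal(1.0, alpha, np.zeros(2), np.zeros(1), X, Y, k=0)
value = fitness(1.0, np.zeros(2), X[:, 0])
```

In this toy case the trivial point (the DMU evaluated against itself with zero slacks) passes the check and has fitness 1; changing β alone breaks c.1.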
First heuristic
1 Generate b_jk ∀j (c.9). Restriction: the number of b_jk equal to 0 must be > s and < s + m.
2 Calculate the values of α_jk and d_jk ∀j by means of a system of equations.
3 t^+_rk ∀r and β_k are generated to satisfy c.1, with a refinement process:
    Generate ∀r, t^+_rk randomly between 0 and 1; obtain β_k using c.1.
    while β_k ≤ 0 OR β_k ≥ 1 do {Local Search on t^+}
        if β_k < 0 then
            Generate r randomly, and t^+_rk = t^+_rk / (2.0 + random(0, 1, 2))
        else
            Generate r randomly, and t^+_rk = t^+_rk · (2.0 + random(0, 1, 2))
        end if
        Obtain β_k using c.1.
    end while
4 α_jk ∀j are calculated using c.3 by solving the system of equations.
5 t^-_ik calculated using c.2 by solving the system of equations.
6 Finally, ν_ik ∀i generated randomly, µ_rk ∀r obtained from c.4, and the number of d_jk equal to 0 is the same as the number of α_jk ≠ 0.
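The refinement loop of step 3 can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: c.1 is read as β_k = 1 − (1/s) Σ_r t^+_rk / y_rk, the loop is taken to run until 0 < β_k < 1, and the slide's random(0, 1, 2) is replaced by a uniform draw in [0, 1). The function names and the `max_steps` safety cap are hypothetical.

```python
import random


def beta_from_c1(t_plus, y_k):
    """Solve c.1 for beta_k: beta_k = 1 - (1/s) * sum_r t_plus[r] / y_k[r]."""
    s = len(y_k)
    return 1.0 - sum(t / y for t, y in zip(t_plus, y_k)) / s


def refine_t_plus(y_k, rng, max_steps=10_000):
    """Draw t_plus in [0, 1) and rescale single components until 0 < beta < 1."""
    s = len(y_k)
    t_plus = [rng.random() for _ in range(s)]
    beta = beta_from_c1(t_plus, y_k)
    steps = 0
    # max_steps is a safety cap added for this sketch only.
    while (beta <= 0.0 or beta >= 1.0) and steps < max_steps:
        r = rng.randrange(s)           # component chosen at random
        factor = 2.0 + rng.random()    # assumed stand-in for 2.0 + random(0, 1, 2)
        if beta < 0.0:
            t_plus[r] /= factor        # slacks too large: shrink one of them
        else:
            t_plus[r] *= factor        # otherwise enlarge one of them
        beta = beta_from_c1(t_plus, y_k)
        steps += 1
    return beta, t_plus


# Hypothetical outputs of the evaluated DMU; small y values force several refinement steps.
y_k = [0.1, 0.5, 2.0]
beta, t_plus = refine_t_plus(y_k, random.Random(0))
```

Because β_k is always recomputed from c.1 after each rescaling, the returned pair satisfies c.1 by construction, and the slacks stay nonnegative since they are only multiplied or divided by positive factors.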
Second heuristic
Used to recalculate non-valid solutions after the first heuristic.
1 b_jk ∀j generated as in heuristic one; values of α generated randomly.
2 α_jk ∀j modified to satisfy c.1, c.2, c.3, c.11 and c.12. {Local Search on α}
    for i = 1, ..., m do
        if x_ik < Σ_{j=1}^{n} α_jk x_ij then
            j_0 such that

$$
\frac{\frac{1}{m}\sum_{i=1}^{m} x_{ij_0}}{\frac{1}{s}\sum_{r=1}^{s} y_{rj_0}} = \max_{j=1,\ldots,n}\left\{ \frac{\frac{1}{m}\sum_{i=1}^{m} x_{ij}}{\frac{1}{s}\sum_{r=1}^{s} y_{rj}} \right\}
$$

            α_{j_0 k} = α_{j_0 k} · 0.95
        end if
    end for
    for r = 1, ..., s do
        j_0 such that ...
        α_{j_0 k} = α_{j_0 k} · 1.05
    end for
    ∀j adjust α_jk with a similar refinement method.
    Adjust β_k to satisfy c.11 and c.12.
    Obtain t^+_rk ∀r and t^-_ik ∀i using c.2 and c.3.
3 Similar refinement (LS) to make β_k satisfy c.2, c.3, c.11 and c.12.
4 ν_ik ∀i, µ_rk ∀r and d_jk ∀j as in the first method.
Percentage of valid solutions

size         9 constraints - ICCS14          13 constraints - ICAC14         14 constraints
m   n   s    time (sec)      % val.          time (sec)      % val.          time (sec)      % val.
2   15  1    26.42 ± 51.44   82 ± 35.58      33.21 ± 10.82   72 ± 18.12      0.09 ± 0.02     100 ± 0.00
3   25  2    6.72 ± 16.03    90 ± 30.46      72.89 ± 15.56   24 ± 20.97      0.88 ± 0.68     96 ± 2.85
4   30  2    0.22 ± 0.16     100 ± 0.00      89.84 ± 18.63   16 ± 21.13      0.88 ± 1.74     95 ± 1.49
5   40  3    13.13 ± 20.64   74 ± 43.40      116.39 ± 12.86  1.6 ± 2.49      27.22 ± 42.38   92 ± 9.07
6   60  4    2.01 ± 1.13     35 ± 44.07      117.26 ± 14.15  0.06 ± 0.10     93.46 ± 70.08   53 ± 35.57

Now: a higher percentage of valid solutions, with all the constraints considered, so metaheuristics can be applied to improve the solutions.
Initialization: with the heuristics.
End condition: a maximum number of iterations, or a maximum number of iterations without improving the best solution.
Selection: valid solutions are selected for combination. Non-valid solutions are replaced by new valid solutions.
Crossover: individuals have components of six types, and each combination works with one of these types.
1 Only β is considered. The mean of β_1 and β_2 of the two parents is obtained and randomly perturbed. The values of t^-_ik and t^+_rk are recalculated so that constraints c.1, c.2 and c.3 are fulfilled.
2 Values of t^+, t^-, ν, µ or d are crossed. Each combination works only on parameters of one randomly selected type, with middle point combination.
3 Combination of the previous crossovers. All the parameters are candidates, and one is randomly selected.
Mutation: each individual has a 10% probability of being mutated. One parameter is selected randomly, and new values are randomly generated.
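A rough Python sketch of crossover type 2 and of the mutation step. Several points are assumptions made only for illustration: an individual is stored as a dictionary of lists keyed by parameter type, "middle point combination" is read as a one-point crossover cut at the middle of the vector, and mutated values are drawn uniformly from [0, 1). Offspring produced this way may violate constraints (for example c.5 and c.6); as stated above, non-valid solutions would then be replaced.

```python
import copy
import random

# Parameter types crossed in crossover 2 (key names are hypothetical).
CROSSABLE = ("t_plus", "t_minus", "nu", "mu", "d")


def midpoint_crossover(parent1, parent2, rng):
    """Cross one randomly chosen parameter type, cutting at the middle (assumed reading)."""
    child = copy.deepcopy(parent1)
    key = rng.choice(CROSSABLE)
    cut = len(parent1[key]) // 2
    # First half from parent1, second half from parent2.
    child[key] = parent1[key][:cut] + parent2[key][cut:]
    return child, key


def mutate(individual, rng, prob=0.10):
    """With probability prob, regenerate all values of one randomly chosen type."""
    if rng.random() >= prob:
        return individual, None
    mutant = copy.deepcopy(individual)
    key = rng.choice(sorted(mutant))
    mutant[key] = [rng.random() for _ in mutant[key]]  # placeholder sampling range
    return mutant, key


rng = random.Random(1)
p1 = {k: [0.0] * 4 for k in CROSSABLE}
p2 = {k: [1.0] * 4 for k in CROSSABLE}
child, crossed = midpoint_crossover(p1, p2, rng)
mutant, mutated = mutate(p1, rng, prob=1.0)   # prob=1.0 forces a mutation
same, untouched = mutate(p1, rng, prob=0.0)   # prob=0.0 never mutates
```

Both operators copy the parent before modifying it, so the population can keep the original individuals until the new ones have been validated.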
Comparison with CPLEX
[Figure: two panels, "Fitness" and "Time (logarithmic scale)". Fitness versus iterations (0 to 30) for m=4, n=30, s=3; series: CPLEX, crossover 1, crossover 2, crossover 3.]
Small problems: solutions with GA close to those with CPLEX.
Large problems: CPLEX impracticable.
Parameterized scheme
Initialize(S,ParamIni)
while not EndCondition(S,ParamEnd) do
    SS = Select(S,ParamSel)
    SS1 = Combine(SS,ParamCom)
    SS2 = Improve(SS1,ParamImp)
    S = Include(SS2,ParamInc)
end while
Different values of the metaheuristic parameters give different metaheuristics and hybridizations.
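The scheme can be written as a generic loop whose steps are supplied as functions. The Python sketch below is only an illustration on a toy one-dimensional maximization problem, not the DEA model: all function names, parameter keys and values are hypothetical. Two small departures from the pseudocode are made for convenience: the end condition also receives the iteration counter, and Include receives the current set S together with SS2.

```python
import random


def run_scheme(ops, params):
    """Generic loop of the parameterized scheme; ops maps step names to callables."""
    S = ops["initialize"](params["ini"])
    iteration = 0
    while not ops["end"](S, iteration, params["end"]):
        SS = ops["select"](S, params["sel"])
        SS1 = ops["combine"](SS, params["com"])
        SS2 = ops["improve"](SS1, params["imp"])
        S = ops["include"](S, SS2, params["inc"])
        iteration += 1
    return S


def toy_ops(rng, f):
    """Toy operators maximizing f over real numbers."""
    def initialize(p):
        return [rng.uniform(-10.0, 10.0) for _ in range(p["size"])]

    def end(S, iteration, p):
        return iteration >= p["max_iter"]

    def select(S, p):
        return sorted(S, key=f, reverse=True)[:p["n_best"]]

    def combine(SS, p):
        return [(rng.choice(SS) + rng.choice(SS)) / 2.0
                for _ in range(p["n_children"])]

    def improve(SS1, p):
        # ls_steps = 0 disables local search (plain GA-like behaviour).
        out = []
        for x in SS1:
            for _ in range(p["ls_steps"]):
                y = x + rng.uniform(-p["step"], p["step"])
                if f(y) > f(x):
                    x = y
            out.append(x)
        return out

    def include(S, SS2, p):
        # Elitist replacement: keep the best elements of old and new sets.
        return sorted(S + SS2, key=f, reverse=True)[:p["size"]]

    return {"initialize": initialize, "end": end, "select": select,
            "combine": combine, "improve": improve, "include": include}


def toy_fitness(x):
    return -(x - 3.0) ** 2


def demo(max_iter, seed=42):
    rng = random.Random(seed)
    params = {"ini": {"size": 20}, "end": {"max_iter": max_iter},
              "sel": {"n_best": 5}, "com": {"n_children": 10},
              "imp": {"ls_steps": 3, "step": 0.5}, "inc": {"size": 20}}
    return run_scheme(toy_ops(rng, toy_fitness), params)
```

Changing only the entries of `params` (for example setting `ls_steps` to 0, or enlarging `n_best`) changes the behaviour of the loop without touching its code, which is the idea behind generating different metaheuristics and hybridizations from one scheme.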
Applications
Successfully applied to:
Signal filter design.
Electricity consumption in exploitation of water wells.
Estimation of parameters in kinetic reactions.
Maximum diversity problem.
p-hub problem.
Task-to-processor assignment.
Simultaneous Equation Models.
Molecular docking.
Metaheuristics in the experiments
A hyperheuristic is also implemented with the same parameterized scheme; it searches for the best combination of metaheuristic parameters in the scheme.
Comparison of fitness
[Figure: mean fitness (axis from 0 to 0.6) by problem size (m=2 s=1 N=50; m=3 s=2 N=30; m=4 s=2 N=28; m=4 s=3 N=20; m=5 s=3 N=20) for CPLEX, hyperheuristic, SS, GA and GR.]
Conclusions
Genetic algorithms and hybrid metaheuristics are applied to a mathematical programming model for Data Envelopment Analysis.
The results of previous works are improved: all the constraints are considered, and a larger number of valid solutions is generated.
Small problems: metaheuristics give fitness values close to the optimum, and hyperheuristics can be used to obtain satisfactory hybrid metaheuristics.
Metaheuristics can be applied to large problems, for which huge execution times make exact methods impracticable.
Future work
Improvement of the heuristics to generate valid solutions.
Hybridization of metaheuristics and exact methods, and exploration of the use of heuristics in CPLEX.
Improvement of the hyperheuristic.
Parallelism to reduce the high execution time of metaheuristics, and especially of hyperheuristics.