S0214 : GPU Based Stacking Sequence Generation For Composite Skins Using GA

1 S0214: GPU Based Stacking Sequence Generation For Composite Skins Using GA
Date: 16th May 2012, Wed, 3pm to 3.25pm (Adv. Session)
Sathyanarayana K., Manish Banga, and Ravi Kumar G. V. V.
Engineering Services, Infosys Limited, Electronic City, Hosur Road, Bangalore, India

2 Contents: GPU Based Stacking Sequence Generation For Composite Skins Using GA
Part 1: Composite Skin Engineering - Overview
  - Optimization of aircraft composite skins
  - Assumptions
  - Decision variables, objective, constraints
Part 2: Genetic Algorithms Based Composite Stacking Sequence Generation Approach
  - Encoding and decoding, initial solutions generation
  - Improving the solutions: genetic algorithm operators
  - Convergence criteria
Part 3: Speedup using GPU
  - Graphics processing unit (GPU) and data parallel applications
  - Method of using CUDA based GPU and flow chart
  - Speed-up achieved and observations
  - Stacking sequence results for three typical examples
Future work and conclusion

3 Part 1: Composite Skin Engineering

4 Composite Laminate Skins - Overview
- Relevance of composites to the aircraft industry.
- Plies, fiber orientations, laminate, zones.
[Figure: fiber orientations (45°, -45°, 0° and 90° plies), a laminate stacking (45°/-45°/0°/90°), and zones on a plan view of an aircraft wing skin.]

5 Optimization of Aircraft Composite Skins
Optimization of real-life aircraft composite skins is performed in two stages:
1. First stage: a gradient-based optimization technique.
2. Second stage: stacking sequence generation, which is the scope of the current study.
Objectives of the current study:
- demonstrate the utility of genetic algorithms for the problem;
- showcase the performance benefits of parallel computing using GPUs.

6 Problem Formulation
Assumptions: the first-level composite skin optimization has already been performed.
Constraints: the stacking sequence generation is subject to the following stacking rules.

S. No.  Stacking Sequence Rule
1       The laminate should be symmetric.
2       The number of plies of each orientation should remain the same.
3       The laminate stacking sequence should not contain more than 4 consecutive plies in the same orientation.
4       The laminate stacking sequence should not have more than 2 consecutive plies in the same orientation at the top of the laminate.
5       The maximum difference in angle orientation between two consecutive plies must be equal to [value missing in the transcription].
6       At the top of the laminate, the 0° ply should be placed such that there are at least 3 plies between the 0° ply and the outer surface of the laminate.

7 Mathematical Formulation (contd.)
Objective function: minimize the violation of stacking rules, i.e. minimize the penalty
$P = P_1 + P_2 + \cdots + P_n$
where $P_1, P_2, \ldots, P_n$ are the penalties for non-compliance with the 1st, 2nd, ..., n-th stacking sequence rule.
Decision variables: for each layer of the laminate, assign one of the orientations (0°, 90°, +45°, -45°).
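The penalty objective can be illustrated with a minimal Python sketch. It covers only a subset of the stacking rules (symmetry, at most 4 consecutive plies of one orientation, at most 2 at the top, and at least 3 plies above the first 0° ply). The function name `penalty`, the unit penalty per violated rule, and treating the first list element as the top ply are assumptions made for illustration, not the authors' implementation.

```python
from itertools import groupby


def penalty(plies):
    """Count violated stacking rules for a non-empty laminate (top ply first).

    Each violated rule adds 1 to the penalty (an assumed unit weight).
    """
    # Lengths of consecutive runs of identical orientations.
    runs = [len(list(group)) for _, group in groupby(plies)]
    p_symmetry = 0 if plies == plies[::-1] else 1      # rule 1
    p_run = 1 if max(runs) > 4 else 0                  # rule 3
    p_top = 1 if runs[0] > 2 else 0                    # rule 4
    first_zero = plies.index(0) if 0 in plies else len(plies)
    p_zero = 1 if first_zero < 3 else 0                # rule 6
    return p_symmetry + p_run + p_top + p_zero


half = [-45, 45, 90, 0, 0]          # listed half of a symmetric laminate
laminate = half + half[::-1]        # mirror about the mid-plane
print(penalty(laminate))            # 0
```

A laminate that starts with three 0° plies and is not symmetric, such as `[0, 0, 0, 45, 90]`, would collect three penalties under these assumptions.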

8 Part 2: Genetic Algorithms Based Composite Stacking Sequence Generation Approach

9 Genetic Algorithms
- Genetic algorithms are search techniques based on the principles of natural selection.
- They are suitable for combinatorial problems such as stacking sequence generation.
- They can generate good solutions by evaluating only a fraction of all possible solutions.
- They need no gradient information about the objective function, so they are independent of the problem domain.
- The genetic algorithm process starts by randomly assigning the allowable ply orientations 0°, 45°, 90° and -45° to the design variables. The fitness (reciprocal of the number of rule violations) of this solution is then computed.

10 GA Steps
A simple genetic algorithm has the following steps:
Step 1. Encoding, decoding and initial solutions creation: random 1s and 0s.
Step 2. Improving the initial solutions: selection, crossover, mutation.
Step 3. Convergence and stopping criterion: 85-90% similarity in solutions, or reaching a preset maximum number of iterations.
[Figure: solutions 1 to N with their fitness, evolving from the first generation through the second generation to the last generation. Step 1 creates the initial solutions, Step 2 applies selection, crossover and mutation between generations, and Step 3 checks convergence.]

11 GA Step 1: Encoding, decoding and initial solutions creation
- Design variables are encoded into bits of 1s and 0s.
- In the current problem, the design variables are the ply orientations at each level of the laminate.
- Each ply orientation is encoded with two bits. The table of orientations (0°, 45°, 90°, -45°) and their codes is garbled in the transcription; only the last code, 11, is legible.
- The objective function in the current problem, i.e. the number of stacking rule violations, is translated into the fitness of an individual solution.
[Figure: encoding maps the problem space (laminate stacking sequence, e.g. (90/45/90/0/-45/0)) to the solution space (encoded laminate stacking sequences Sol 1, Sol 2, ..., Sol n); decoding maps back so that each solution can be evaluated.]
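A minimal Python sketch of two-bit encoding and decoding follows. The specific bit assignment in `CODES` is an assumption: the transcription confirms only that each orientation uses two bits, not which code belongs to which angle. The names `CODES`, `ANGLES`, `encode` and `decode` are likewise hypothetical.

```python
# Assumed mapping: the source states only that each orientation gets two bits.
CODES = {0: "00", 45: "01", 90: "10", -45: "11"}
ANGLES = {bits: angle for angle, bits in CODES.items()}


def encode(plies):
    """Turn a list of ply angles into a bit string, two bits per ply."""
    return "".join(CODES[angle] for angle in plies)


def decode(bits):
    """Turn a bit string back into the list of ply angles."""
    return [ANGLES[bits[i:i + 2]] for i in range(0, len(bits), 2)]


sequence = [90, 45, 90, 0, -45, 0]   # the laminate used as the slide's example
chromosome = encode(sequence)
print(chromosome)                    # 100110001100
print(decode(chromosome) == sequence)  # True
```

Because all four two-bit patterns are valid orientations, any bit string of even length decodes to a laminate, so crossover and mutation cannot produce an undecodable solution.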

12 GA Step 2: Improving the initial solutions: GA operators
Three operators are used: selection, crossover and mutation.
1. Selection: the selection operator is based on survival of the fittest, i.e. each solution gets a number of copies in the new solution space in proportion to its fitness.
2. Crossover: the crossover operator picks two solutions at random within a generation and swaps bits between them to create two new solutions.
3. Mutation: the mutation operator randomly flips bits in a solution according to a preset probability. It helps the search avoid local minima by maintaining a minimum level of diversity in the population. The probability of mutation is usually low.
[Figure: roulette wheel with segments of 40%, 35%, 12%, 8% and 5%; the fittest solution occupies the largest area on the wheel and the least fit solution the smallest segment; the wheel rotates past a selection point. Crossover of Parent1 and Parent2 at the cross sites produces Offspring1 and Offspring2. Mutation.]
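The three operators can be sketched in Python as below. This is an illustration under stated assumptions rather than the presented implementation: crossover here uses a single cross site (the slides mention cross sites in the plural), solutions are bit strings, and the function names and the mutation probability in the usage lines are hypothetical.

```python
import random


def select(population, fitness, rng):
    """Roulette-wheel selection: draw a new population, weighted by fitness."""
    return rng.choices(population, weights=fitness, k=len(population))


def crossover(parent1, parent2, rng):
    """Single-point crossover: swap the tails after a random cross site."""
    site = rng.randrange(1, len(parent1))
    return parent1[:site] + parent2[site:], parent2[:site] + parent1[site:]


def mutate(bits, probability, rng):
    """Flip each bit independently with the given probability."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < probability else bit
        for bit in bits
    )


rng = random.Random(0)                      # seeded for repeatability
population = ["100110001100", "000000111111", "010101010101"]
fitness = [0.5, 0.25, 1.0]                  # illustrative values only
parents = select(population, fitness, rng)
child1, child2 = crossover(parents[0], parents[1], rng)
child1 = mutate(child1, 0.01, rng)          # 0.01 is an assumed probability
print(child1, child2)
```

Passing the random generator explicitly keeps each operator deterministic for a given seed, which makes a run reproducible when comparing CPU and GPU versions.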

13 GA Step 3: Convergence Criteria
The iterations stop when either of the following is met:
- the number of iterations reaches a preset maximum;
- a preset percentage of similarity (85-90%) among the solutions in the current generation is reached.
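A minimal Python sketch of the two stopping criteria is given below. How similarity is measured is not stated in the slides; the sketch assumes it is the share of the population identical to the most common solution. The function name `converged` and the default `max_generations=500` are hypothetical.

```python
from collections import Counter


def converged(population, generation, max_generations=500, similarity=0.85):
    """Return True when the iteration cap or the similarity threshold is hit."""
    if generation >= max_generations:
        return True
    # Assumed similarity measure: fraction equal to the most common solution.
    most_common_count = Counter(population).most_common(1)[0][1]
    return most_common_count / len(population) >= similarity


print(converged(["0110"] * 9 + ["1001"], generation=12))   # True (90% alike)
print(converged(["00", "01", "10", "11"], generation=12))  # False
```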

14 Scope for Parallelization
- Computations across generations are dependent: they cannot be parallelized.
- Computations within a generation are independent of each other: they can be parallelized.
[Figure: the generations diagram from the GA Steps slide (solutions 1 to N with fitness; Step 1, Step 2 and Step 3 across the first, second and last generation).]
Parallelization can be achieved using:
1. Multiple CPUs: expensive, limited cores.
2. GPGPUs: becoming less expensive, commonly used for graphics processing; considered in this study.
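The independence of computations within a generation can be shown with a small Python sketch that evaluates every solution of one generation through a worker pool. It only illustrates the data-parallel structure: the `fitness` function is a placeholder (not the stacking-rule penalty), the names are hypothetical, and in standard CPython a thread pool will not necessarily speed up CPU-bound work the way a GPU kernel would.

```python
from concurrent.futures import ThreadPoolExecutor


def fitness(bits):
    """Placeholder fitness: fraction of 1-bits (not the stacking-rule penalty)."""
    return bits.count("1") / len(bits)


def evaluate_generation(population, workers=4):
    """Evaluate all solutions of one generation; no solution depends on another."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fitness, population))


print(evaluate_generation(["1100", "1111", "0000"]))   # [0.5, 1.0, 0.0]
```

The next generation can only be built after all of these evaluations finish, which is the cross-generation dependency noted above.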

15 Part 3: Utilizing GPU power

16 GPUs Introduction
- A GPU is an additional computational device that works alongside the CPU to perform faster computations.
- GPUs are designed to run thousands of computations in parallel and are traditionally used for large-scale matrix operations.
- The serial code needs to be parallelized using GPU-specific programming interfaces such as CUDA (NVIDIA's GPU API, used in this study) or OpenCL (platform neutral).

17 Steps in Programming Using a GPU
Step 1. CPU task: declare pointers to host data (host is another name for the CPU). GPU-related task: declare pointers to device data (device is another name for the GPU).
Step 2. CPU task: allocate host pointers with malloc. GPU-related task: allocate device pointers with cudaMalloc.
Step 3. CPU task: populate the input data pointers of the host. GPU-related task: copy data from host to device using cudaMemcpy (with parameter cudaMemcpyHostToDevice).
Step 4. GPU-related task: specify the number of blocks and the number of threads (kernel configuration).
Step 5. GPU-related task: specify the kernel code (the code to be run on the device by each thread) and make a call to the kernel.
Step 6. GPU-related task: perform the processing on the device.
Step 7. GPU-related task: copy the result data from device to host using cudaMemcpy (with parameter cudaMemcpyDeviceToHost).
Step 8. CPU task: if required, post-process the result on the host and present the results to the user.

18 Flow Chart
1. Start.
2. Allocate host and device pointers.
3. Populate the population data and other host data.
4. Copy data from host to device using cudaMemcpy (with argument cudaMemcpyHostToDevice).
5. Launch as many threads on the device as there are individuals in the population, with the number of blocks equal to the number of zones.
6. Call the kernel function with parameters such as pointers to the matrices of the initial population, pointers to the output data, genetic parameters (such as the number of crossover points), and the stiffness information of the plies and their orientations. Each thread computes the fitness of one stacking sequence solution and performs the genetic algorithm iterations.
7. Routines that run on the GPU:
   a. Fitness computation: the penalized objective function is computed based on constraint violations.
   b. Converged? If no, apply selection, crossover and mutation, then return to the fitness computation. If yes, continue with step 8.
8. Copy the computed results from device to host using cudaMemcpy (with argument cudaMemcpyDeviceToHost).
9. Write the solution to an output file.
10. Stop.
The flow chart also marks the positions of the calls to __syncthreads().

19 Stacking Sequence Generation Problems and Results

20 Comparison of Execution Time with CPU Alone and with GPU Programming
CPU used: Intel Xeon processor with 2 GB RAM, 2.53 GHz clock rate.
GPGPU used: NVIDIA Tesla T10 processor (compute capability 1.3) with 4 GB global memory, 30 multiprocessors, 240 cores, 16 KB shared memory per block, warp size 32, 512 threads per block, 1.3 GHz clock rate.
[Table: Problem, Number of Zones, CPU Alone (Seconds), CPU + GPU (Seconds), Performance Benefit (times), with one row per composite skin problem (three rows); the numeric values are missing in the transcription.]
Observations:
1. A speed-up of up to 8 times was observed when GPU computation is used.
2. As the size of the problem increases, the performance benefit is higher because the full power of the GPU is utilized.

21 Composite Skin Problem 1
[Figure: composite skin geometry; only the dimension 150 mm is legible in the transcription.]
This composite skin contains four zones; the number of plies and the initial stacking sequence (thickness law) of each zone are listed below. Each ply is taken to be 0.125 mm thick, which is the same for all the composite skins analyzed in the current work.
[Chart: performance benefit, CPU alone (seconds) versus CPU+GPU (seconds); the values "4" and "3" survive in the transcription and the speed-up factor is missing.]

Initial stacking (the number after an underscore is the ply count, e.g. 0_10 is ten 0° plies):
Zone Number  Number of Plies  Initial Stacking Sequence (un-optimized)
1            40               (0_10/45_10/90_10/-45_10)
2            30               (0_12/45_6/90_6/-45_6)
3            24               (0_8/45_6/90_4/-45_6)
4            10               (0_4/45_2/90_2/-45_2)

Generated stacking sequence:
Zone  Stacking Sequence
1     [-45_2/90/45/0/45/0/90/0/45/-45/45_2/-45/0/90/0/90_2/-45]s
2     [45/90/45/-45/45/-45/0/90/0_2/0_2/90/0/-45]s
3     [90/45/90/-45/0_2/45/-45/0/45/0/-45]s
4     [-45/45/90/0_2]s

Best zone 1 stacking sequences:
Solution  Stacking Sequence
1         [-45_2/90/45/0/45/0/90/0/45/-45/45_2/-45/0/90/0/90_2/-45]s
2         [90_3/-45/0/45/0_2/45/0/90/45/0/45/-45/90/-45/45/-45_2]s
3         [90/45/90/45/-45/0/45/0/45/0/90/45/-45/90/0_2/90/-45_3]s
Rule violated: None
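The laminate shorthand used in these tables can be checked mechanically. The Python sketch below expands a symmetric stacking sequence into its full ply list and counts plies per orientation, which is one way to confirm the rule that the number of plies of each orientation is preserved. Writing the ply count after an underscore (standing in for the subscripts on the slide) and the function name `expand` are assumptions of this sketch.

```python
from collections import Counter


def expand(shorthand):
    """Expand e.g. '[-45/45/90/0_2]s' into a full top-to-bottom ply list."""
    body, _, suffix = shorthand.strip().partition("]")
    half = []
    for token in body.lstrip("[").split("/"):
        angle, _, count = token.strip().partition("_")
        half.extend([int(angle)] * int(count or 1))
    # A trailing 's' marks a symmetric laminate: mirror the listed half.
    return half + half[::-1] if suffix.strip() == "s" else half


zone4 = expand("[-45/45/90/0_2]s")
print(len(zone4))                        # 10
print(Counter(zone4)[0], Counter(zone4)[45],
      Counter(zone4)[90], Counter(zone4)[-45])   # 4 2 2 2
```

For zone 4 the expanded laminate has 10 plies with four 0° plies and two each of 45°, 90° and -45°, matching the initial sequence for that zone.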

22 Composite Skin Problem 2
[Figure: tapered composite skin geometry; only the dimensions 500 mm and 75 mm are legible in the transcription.]
This tapered composite skin has 6 zones, with 2 zones having the same thickness. The thickness law for this composite skin is listed below.
[Chart: performance benefit, CPU alone (seconds) versus CPU+GPU (seconds); the values are missing in the transcription.]

Initial stacking (the number after an underscore is the ply count, e.g. 0_10 is ten 0° plies):
Zone Number  Number of Plies  Thickness Law & Initial Stacking Sequence (un-optimized)
1            40               (0_10/45_10/90_10/-45_10)
2            30               (0_12/45_6/90_6/-45_6)
3            24               (0_8/45_6/90_4/-45_6)
4            20               (0_8/45_4/90_4/-45_4)
5            16               (0_6/45_4/90_2/-45_4)
6            8                (0_2/45_2/90_2/-45_2)

Generated stacking sequence:
Zone  Stacking Sequence
1     [-45_2/90/45/0/45/0/90/0/45/-45/45_2/-45/0/90/0/90_2/-45]s
2     [45/90/45/-45/45/-45/0/90/0_2/0_2/90/0/-45]s
3     [90/45/90/-45/0_2/45/-45/0/45/0/-45]s
4     [90/-45/45/-45/0/90/0/45/0_2]s
5     [45/90/-45/45/0_2/-45/0]s
6     [-45/45/90/0]s

23 Composite Skin Problem 3: 40-zone skin
Columns: Zone Number, Number of Plies, Thickness Law & Initial Stacking Sequence (un-optimized).
The zone numbers and ply counts are missing in the transcription for all rows except the last. The sequences are listed in the order in which they appear; the number after an underscore is the ply count, and "?" marks a count that is missing.
(0_50/45_50/90_50/-45_?)
(0_76/45_38/90_38/-45_?)
(0_62/45_44/90_26/-45_?)
(0_84/45_24/90_34/-45_?)
(0_54/45_54/90_54/-45_?)
(0_82/45_40/90_40/-45_?)
(0_66/45_46/90_30/-45_?)
(0_92/45_28/90_36/-45_?)
(0_54/45_54/90_54/-45_?)
(0_80/45_40/90_40/-45_?)
(0_68/45_48/90_30/-45_?)
(0_90/45_26/90_36/-45_?)
(0_40/45_40/90_40/-45_?)
(0_74/45_36/90_36/-45_?)
(0_78/45_38/90_38/-45_?)
(0_44/45_44/90_44/-45_?)
(0_42/45_42/90_42/-45_?)
(0_36/45_36/90_36/-45_?)
(0_68/45_20/90_28/-45_?)
(0_70/45_20/90_28/-45_?)
(0_24/45_24/90_24/-45_?)
(0_48/45_14/90_20/-45_?)
(0_52/45_16/90_20/-45_?)
(0_42/45_20/90_20/-45_?)
(0_52/45_16/90_20/-45_?)
(0_40/45_20/90_20/-45_?)
(0_20/45_20/90_20/-45_?)
(0_46/45_12/90_20/-45_?)
(0_36/45_18/90_18/-45_?)
(0_48/45_14/90_20/-45_?)
(0_34/45_16/90_16/-45_?)
(0_4/45_4/90_4/-45_4)
(0_14/45_4/90_6/-45_4)
(0_12/45_6/90_6/-45_6)
(0_18/45_4/90_8/-45_6)
(0_14/45_6/90_6/-45_6)
(0_6/45_2/90_4/-45_2)
(0_4/45_2/90_2/-45_2)
(0_6/45_2/90_2/-45_2)
(0_8/45_2/90_4/-45_2)
Zone 40, 8 plies: (0_2/45_2/90_2/-45_2)

24 Composite Skin Problem 3 - Results
The final stacking sequences obtained for a few zones are shown in the table below (the number after an underscore is the ply count).

Zone 1:
[45/90/-45_2/0/90/0_4/90/45/0/45/0/45/0/90/0/90/45/-45/0/90/45/0/90/45/-45/0/45/0/45/0/45/0/45/-45/0/90/45/0/90/45/-45/45/0/90/0/45/-45/0/90/45/0/45/0/45/-45/45/0/90/0/90/45/-45/45/45/-45/0/45/-45/45/-45/45/-45/90/-45_2/90_4/-45/-45/90/-45_3/90/-45_3/90_2/-45/90/-45/90_2]s

Zone 2:
[90_2/-45/45_2/0/45/-45/45/-45/45/-45/0/90/0/90/0/90/45/-45/0/45/-45/0/90/0_2/90/45/0/45/-45/45/-45/45_4/0/45/-45/45/0/90/0_2/45/-45/45/-45/0/-45_2/0/90/0_2/90/-45/-45/0_2/90/0_4/90/0_4/90/0/-45/0/90/0/90/0/90/0/90/0_4/90/0/-45_3/90]s

Zone 3:
[90/-45/90/90/-45/0/45/0/90/45/-45/45_2/-45/0/90/0/90/45/-45/0/45/-45/0/45/-45/0/45/0/45/-45/0/45/0/90/45/0/90/45/-45/0/90/45/-45/45_2/0/90/45/0/45/-45/45/0/90/0/45/-45/45_3/-45/90/0_3/-45/0/90/0_3/0/-45/0/-45/0_3/-45/0/-45_3/0/-45_2/0_2]s

Zone 6:
[-45_2/45/0/90/45/-45/0/90_2/-45/45_4/0/45/0/45/0/45/0/90/45/-45/0/45/0/45/-45/0/45/0/90/0/90/45/0/45/-45/45_3/-45/45/-45/45/-45/90/0/90/-45/0/90/0/90/0/-45_2/0_3/90/0/90/0/90/0/-45/0/90/-45/0/90/0/90/0_3/90/0/90/0/-45/0/90/0_2/90/-45/0_2/-45_2/0_4/-45/0_2]s

[Chart: performance benefit, CPU alone (seconds) versus CPU+GPU (seconds); the values are missing in the transcription.]

25 Conclusions
- A genetic algorithm based stacking sequence generation approach has been presented that can be used to solve large-scale composite skin generation problems in the commercial aircraft industry.
- The approach is scalable and has been successfully demonstrated on large-scale stacking sequence generation problems.
- Three important composite skin stacking sequence generation problems have been solved using the current approach. All the stacking sequence rules are satisfied in the final results.
- The results demonstrate that using a GPGPU gives a speed-up of up to 8 times (in the stacking sequence generation domain) compared with computation using only the CPU.
Future work
- Further investigation is needed on how inter-zonal harmonization can be brought into the genetic algorithm based generation framework.
- There can be more than one ply material and more than four orientations; formulating these in the model will increase its complexity.

26 Discussions

27 THANK YOU
The contents of this document are proprietary and confidential to Infosys Limited and may not be disclosed in whole or in part at any time, to any third party without the prior written consent of Infosys Limited. Infosys Limited. All rights reserved. Copyright in the whole and any part of this document belongs to Infosys Limited. This work may not be used, sold, transferred, adapted, abridged, copied or reproduced in whole or in part, in any manner or form, or in any media, without the prior written consent of Infosys Limited.


More information

COMPARISON OF CPU AND GPU IMPLEMENTATIONS OF THE LATTICE BOLTZMANN METHOD

COMPARISON OF CPU AND GPU IMPLEMENTATIONS OF THE LATTICE BOLTZMANN METHOD XVIII International Conference on Water Resources CMWR 2010 J. Carrera (Ed) c CIMNE, Barcelona, 2010 COMPARISON OF CPU AND GPU IMPLEMENTATIONS OF THE LATTICE BOLTZMANN METHOD James.E. McClure, Jan F. Prins

More information

Faster Kinetics: Accelerate Your Finite-Rate Combustion Simulation with GPUs

Faster Kinetics: Accelerate Your Finite-Rate Combustion Simulation with GPUs Faster Kinetics: Accelerate Your Finite-Rate Combustion Simulation with GPUs Christopher P. Stone, Ph.D. Computational Science and Engineering, LLC Kyle Niemeyer, Ph.D. Oregon State University 2 Outline

More information

GPU Based Parallel Ising Computing for Combinatorial Optimization Problems in VLSI Physical Design

GPU Based Parallel Ising Computing for Combinatorial Optimization Problems in VLSI Physical Design 1 GPU Based Parallel Ising Computing for Combinatorial Optimization Problems in VLSI Physical Design arxiv:1807.10750v1 [physics.comp-ph] 27 Jul 2018 Chase Cook Student Member, IEEE, Hengyang Zhao Student

More information

POLITECNICO DI MILANO DATA PARALLEL OPTIMIZATIONS ON GPU ARCHITECTURES FOR MOLECULAR DYNAMIC SIMULATIONS

POLITECNICO DI MILANO DATA PARALLEL OPTIMIZATIONS ON GPU ARCHITECTURES FOR MOLECULAR DYNAMIC SIMULATIONS POLITECNICO DI MILANO Facoltà di Ingegneria dell Informazione Corso di Laurea in Ingegneria Informatica DATA PARALLEL OPTIMIZATIONS ON GPU ARCHITECTURES FOR MOLECULAR DYNAMIC SIMULATIONS Relatore: Prof.

More information

Lecture 15: Genetic Algorithms

Lecture 15: Genetic Algorithms Lecture 15: Genetic Algorithms Dr Roman V Belavkin BIS3226 Contents 1 Combinatorial Problems 1 2 Natural Selection 2 3 Genetic Algorithms 3 31 Individuals and Population 3 32 Fitness Functions 3 33 Encoding

More information

SP-CNN: A Scalable and Programmable CNN-based Accelerator. Dilan Manatunga Dr. Hyesoon Kim Dr. Saibal Mukhopadhyay

SP-CNN: A Scalable and Programmable CNN-based Accelerator. Dilan Manatunga Dr. Hyesoon Kim Dr. Saibal Mukhopadhyay SP-CNN: A Scalable and Programmable CNN-based Accelerator Dilan Manatunga Dr. Hyesoon Kim Dr. Saibal Mukhopadhyay Motivation Power is a first-order design constraint, especially for embedded devices. Certain

More information

Parallel Transposition of Sparse Data Structures

Parallel Transposition of Sparse Data Structures Parallel Transposition of Sparse Data Structures Hao Wang, Weifeng Liu, Kaixi Hou, Wu-chun Feng Department of Computer Science, Virginia Tech Niels Bohr Institute, University of Copenhagen Scientific Computing

More information

A model leading to self-consistent iteration computation with need for HP LA (e.g, diagonalization and orthogonalization)

A model leading to self-consistent iteration computation with need for HP LA (e.g, diagonalization and orthogonalization) A model leading to self-consistent iteration computation with need for HP LA (e.g, diagonalization and orthogonalization) Schodinger equation: Hψ = Eψ Choose a basis set of wave functions Two cases: Orthonormal

More information
