Beam dynamics calculation

Size: px

Start display at page:

Download "Beam dynamics calculation"

Ethan Townsend
5 years ago
Views:

1 September 6 Beam dynamics calculation S.B. Vorozhtsov, Е.Е. Perepelkin and V.L. Smirnov Dubna, JINR

2 Outline Problem formulation Numerical methods OpenMP and CUDA realization Results CBDA: Cyclotron Beam Dynamic Analysis code

3 Computer model of the cyclotron Injection line Inflector Dee ESD Magnet sectors

4 Inflector

5 Central region Courtesy A.S. Vorozhtsov

6 Regions of the field maps Inflector Electric field Axial channel Magnet field G1 Magnet field

7 Motion equation r r, p, t, i= 1K N ( ) i i d r r p i = F i, i = 1K N dt d r r r r r r r r mi ( γ ivi ) = qi Eext ( i, t) + Es ( i, t) + vi, Bext ( i ) dt 2 vi γi = 1 1 βi, βi = c r r( 0) r r( 0) r r( 0) r r( 0) i = i, vi = vi, или i =, t ti t ti t 0 i vi = v = = = t= 0 i r 1 i N, i V ( )

8 Space charge effect r ρ r r 1 r dive =, rotb = μ J + E s s 0 s 2 s ε 0 c t r r r 1 rotes = Bs, divbs = 0, = c t με PIC метоdays r E s = ϕ ( p) ρ Δ ϕ ( p) =, p Ω ε 0 ϕ ϕ = ϕd, = ψ N, Γ Γ D Γ N =Γ D n ΓN PP метоdays r N r 1 q j r r E ( ) = r r 3 ( ), i= 1K N s i i j 4πε 0 j i ri rj r 1 q r r r r E j i =, < R j ( ) ( ) s 3 i j i j 4πε 0 R

9 Space charge effect ( p) ρ Δ ϕ ( p) =, p Ω ε 0 ϕ Γ = 0 ρ z y x ( nmk) ρ( xi yj zs) ϕ ( i, j, s) ρ x y z obtain from the particle distribution N 1 N 1 N 1 8 πni πmj πks,, =,, sin sin sin NNN x y z s= 1 j= 1 i= 1 N x N y Nz πn πm πk = + + L x L y Lz ( nmk,, ) ρ( nmk,, ) FFT use for Furier set N 1 z 1 N y Nx 1 πni πmj πks ϕ( xi, yj, zs) = ϕ( n, m, k) sin sin sin k= 1 m= 1 n= 1 N x N y Nz 1

10 Bunch area Lz Lx Lz Ly Lx Lx Ly Lz Ly Mesh Nx Ny Nz Lx Ly Lz Mesh steps hx =, hy =, hz = Nx Ny Nz

11 Charge density calculation Cell 7 Cell 8 Cell 6 Cell 3 Node Cell 5 Cell 2 Cell 1

12 Particle losses t n+1 D B A C If point D belongs the ABC triangle then Cross condition S + S + S = S ΔADC ΔADB ΔCDB ΔABC t n S + S + S < S + ε ΔADC ΔADB ΔCDB ΔABC Δ where ε Δ maximal deviation from surface

13 Central region optimization «Ground» φ RF = 15 φ RF = 13 without posts F = ZU RF -W GAP Dee φ RF = 28 φ RF = 10 with posts

14 Choice optimal configuration S0 S1 S2 S3 S4

15 Accelerating field distribution

16 Very time consuming problem About 5 different variants - minimum Many ions species - accelerated Very complicated structure Multi macro particle simulation for SC dominated beams One run requires ~ several days of computer time

17 Numerical methods Runge-Kutta 4 th order ( motion equations ) FFT for Poisson equation ( space charge effect ) Beam losses ( «ray tracing» algorithm )

18 Open Multi-Processing OpenMP

19 Spiral inflector

20 Beam parameters at the inflector entrance

21 Beam parameters at the inflector exit Blue points PIC by FFT ( mesh: 2 5 x 2 5 x 2 5 ) Red points PP

22 Calculation time Method Without OpenMP With OpenMP Computer platform PP 4 h. 53 min. 2 h. 34 min. 4 h. 38min 1 h. 25 min. AMD Turion 64 2, 1.60 GHz Intel Core2 Quad 2.4 GHz PIC ~11 min. ~6 min. 2 5 x 2 5 x min. ~2 min. AMD Turion 64 2, 1.60 GHz Intel Core2 Quad 2.4 GHz 10,000 particles No geometry losses

23 CUDA Compute Unified Device Architecture

24 CUDA kernel-functions Track ( field maps, particle coordinate and velocity ) Losses (setup geometry, particle coordinate ) Rho ( particle coordinate ) FFT ( charge density function ) PoissonSolver ( Furies coefficients ) E_SC ( electric field potentials )

25 global void Track ( ) Function with many parameters. Use variable type constant : device constant float d_float[200]; device constant int d_int[80]; Particle number corresponds: int n = threadidx.x+blockidx.x*blockdim.x; Number of if, goto, for should be dicreased Linear interpolation field maps texture < float, 2, cudareadmodeelementtype > Ex_TexRef; Ex_TexRef.filterMode = cudafiltermodelinear; ( ) ( ) ( ) P = 1 α P1+ αp2 1 β + 1 α P3+ αp4 β 0 α, β 1 v α 1-α P1 P3 P2 P4 β 1-β u

26 global void Losses ( ) Geometry structure consist of triangles. Block threads copies the triangle nodes from global to shared memory. Threads synchronize after copy operation syncthreads() Particle number corresponds to thread number: int n = threadidx.x+blockidx.x*blockdim.x; Check particles and triangles match

global void Rho Calculate charge impact in nodes of the mesh from particle with number n = threadidx.x + blockidx.x*blockdim.x Particle gives impact to 8 nodes.

27 global void Rho Calculate charge impact in nodes of the mesh from particle with number n = threadidx.x + blockidx.x*blockdim.x Particle gives impact to 8 nodes. Therefore, each thread store 16 values: 8 numbers of node and 8 values of charge density. For each node summarize all impacts of charge density. Cell 7 Cell 8 Cell 6 Cell 3 Node Cell 5 Cell 2 Cell 1

28 global FFT ( ) Used real FFT for sin(πn/n) basis function 3D transform consist of three 1D FFT for each axies: X, Y, Z int n = threadidx.x+blockidx.x*blockdim.x; k=(int)(n/(ny+1)); j=n-k*(ny+1); m=j*(nx+1)+k*(nx+1)*(ny+1); FFT_X[i+1]=Rho[i+m]; n = j + k*(ny+1) NY NZ Rho function data ( 3D massive )

29 global PoissonSolver ( ) Thread number int n = threadidx.x+blockidx.x*blockdim.x; Each thread calculate Furier coefficient PhiF of potential function Phi PhiF ind(i,j,k) = -RhoF ind(i,j,k) / ( kx i2 + ky j2 + kz k2 ) in the node: ind(i,j,k)=i+j*(nx+1)+k*(nx+1)*(ny+1), where k=(int)(n/(nx+1)*(ny+1)); j=(int)(n-k*(nx+1)*(ny+1))/(nx+1); i=n-j*(nx+1)-k*(nx+1)*(ny+1); RhoF Furier coefficients of charge density function Rho.

30 global E_SC ( ) Electric field calculation at mesh node int n = threadidx.x+blockidx.x*blockdim.x+st_ind φ n + ( NX + 1 )( NY + 1 ) φ φ n n + ( NX + 1 ) φ n - 1 φ n - ( NX + 1 ) E E E x y z ϕ = ϕ = ϕ = ϕ n+ 1 n 1 2h x ϕ ( 1) ( 1) n+ Nx+ n Nx+ 2h y ϕ ( 1)( 1) ( 1)( 1) n+ Nx+ Ny+ n Nx+ Ny+ 2h z φ n + 1 φ n - ( NX + 1 )( NY + 1 )

31 Particle losses

32 Bunch acceleration

33 GeForce 8800GTX CUDA kernel-function* Time, [ms] CPU ** GPU Rate, [x] Track Losses Rho Poisson/FFT E_SC Total *Mesh size: 2 5 x 2 5 x 2 5. Number of particles: 100,000 triangles: 2054 **CPU 2.4 GHz

34 GeForce 8800GTX Number of particles Time calculation CPU * GPU Rate, [x] 1,000 3 min. 19 sec. 12 sec , min. 14 sec. 42 sec ,000 5h.41 min. 6 min. 56 1,000,000 2 days 8 h. 53 min. 1h. 60 *CPU 2.4 GHz

35 Tesla C1060 Number of particles Time calculation CPU GPU 2.5GHz C 1060 Rate, [x] 1,000 3 min. 12 sec. 11 sec , min. 24 sec. 27 sec ,000 5 h. 14 min. 31 sec. 3 min. 34 sec. 88 1,000,000 2 days 4 h. 25 min. 34 min. 29 sec. 91 without space charge effect

36 Tesla C1060 Number of particles Time calculation CPU GPU 2.5 GHz C 1060 Rate, [x] 10, min. 36 sec. 44 sec ,000 5 h. 28 min. 12 sec. 5 min. 4 sec. 65 1,000,000 2 days 8 h. 27 min. 50 min. 17 sec. 67 with space charge effect

37 Space charge effect I ~ 0 Losses 24% I = 4 mа Losses 94%

38 Conclusions Inexpensive technology. Increased performance times gives a chance to perform the whole cyclotron computer modeling. Very careful programming is required that is the price for the obtained high performance of calculations. Application to other complicated simulation, related to the accelerator physics (beam halo etc), is possible.

39 Education and Research Center «Applied Parallel Compute»

Population annealing study of the frustrated Ising antiferromagnet on the stacked triangular lattice

Population annealing study of the frustrated Ising antiferromagnet on the stacked triangular lattice Michal Borovský Department of Theoretical Physics and Astrophysics, University of P. J. Šafárik in Košice,