Astronomy & Astrophysics manuscript no. ms, © ESO 2013
August 14, 2013

PORTA: A three-dimensional multilevel radiative transfer code for modeling the intensity and polarization of spectral lines with massively parallel computers

Jiří Štěpán (1) and Javier Trujillo Bueno (2,3,4)

(1) Astronomical Institute ASCR, v.v.i., Ondřejov, Czech Republic. e-mail: jiri.stepan@asu.cas.cz
(2) Instituto de Astrofísica de Canarias, Vía Láctea s/n, E-38205 La Laguna, Tenerife, Spain. e-mail: jtb@iac.es
(3) Departamento de Astrofísica, Universidad de La Laguna (ULL), E-38206 La Laguna, Tenerife, Spain
(4) Consejo Superior de Investigaciones Científicas, Spain

arXiv:1307.4217v2 [astro-ph.SR] 13 Aug 2013

Received XXXX; accepted XXXX

ABSTRACT

The interpretation of the intensity and polarization of the spectral line radiation produced in the atmosphere of the Sun and of other stars requires solving a radiative transfer problem that can be very complex, especially when the main interest lies in modeling the spectral line polarization produced by scattering processes and the Hanle and Zeeman effects. One of the difficulties is that the plasma of a stellar atmosphere can be highly inhomogeneous and dynamic, which implies the need to solve the non-equilibrium problem of the generation and transfer of polarized radiation in realistic three-dimensional (3D) stellar atmospheric models. Here we present PORTA, an efficient multilevel radiative transfer code we have developed for the simulation of the spectral line polarization caused by scattering processes and the Hanle and Zeeman effects in 3D models of stellar atmospheres. The numerical method of solution is based on the non-linear multigrid iterative method and on a novel short-characteristics formal solver of the Stokes-vector transfer equation which uses monotonic Bézier interpolation.
Therefore, with PORTA the computing time needed to obtain at each spatial grid point the self-consistent values of the atomic density matrix (which quantifies the excitation state of the atomic system) scales linearly with the total number of grid points. Another crucial feature of PORTA is its parallelization strategy, which allows us to speed up the numerical solution of complicated 3D problems by several orders of magnitude with respect to sequential radiative transfer approaches, given its excellent linear scaling with the number of available processors. The PORTA code can also be conveniently applied to solve the simpler 3D radiative transfer problem of unpolarized radiation in multilevel systems.

Key words. line: formation – magnetic fields – methods: numerical – polarization – radiative transfer

1. Introduction

This paper describes a computer program we have developed for solving, in three-dimensional (3D) models of stellar atmospheres, the problem of the generation and transfer of spectral line polarization taking into account anisotropic radiation pumping and the Hanle and Zeeman effects in multilevel systems. The numerical method of solution is based on a highly convergent iterative method, whose convergence rate is insensitive to the grid size, and on an accurate short-characteristics formal solver of the Stokes-vector transfer equation that uses monotonic Bézier interpolation. A key feature of our multilevel code called PORTA (POlarized Radiative TrAnsfer) is its parallelization strategy, which allows us to speed up the numerical solution of complicated 3D problems by several orders of magnitude with respect to sequential radiative transfer approaches.

The multilevel radiative transfer problem currently solved by PORTA is the so-called non-LTE problem of the 2nd kind (Landi Degl'Innocenti & Landolfi (2004), hereafter LL04; see also Trujillo Bueno (2009)), where the phenomenon of scattering in a spectral line is described as the temporal succession of statistically independent events of absorption and reemission (complete frequency redistribution, or CRD).
This is a formidable numerical problem that implies calculating, at each spatial grid point of the (generally magnetized) 3D stellar atmosphere model under consideration, the values of the multipolar components of the atomic density matrix corresponding to each atomic level of total angular momentum $J$. These $\rho^K_Q(J)$ elements, with $K = 0, \ldots, 2J$ and $Q = -K, \ldots, K$, quantify the overall population of each level $J$ ($\rho^0_0(J)$), the population imbalances between its magnetic sublevels ($\rho^K_0(J)$, with $K > 0$), and the quantum coherence between each pair of them ($\rho^K_Q(J)$, with $K > 0$ and $Q \neq 0$). The values of these density-matrix elements have to be consistent with the intensity, polarization, and symmetry properties of the incident radiation field generated within the medium. Finding these density-matrix values requires solving jointly the radiative transfer (RT) equations for the Stokes parameters ($\mathbf{I}(\nu, \Omega) = (I, Q, U, V)^{\rm T}$, with $\nu$ and $\Omega$ the frequency and direction of propagation of the radiation beam under consideration) and the statistical equilibrium equations (SEE) for the $\rho^K_Q(J)$ elements. These $\rho^K_Q(J)$ elements, at each spatial grid point of the 3D atmospheric model and for each level $J$ of the considered atomic model, provide a complete description of the excitation of each level $J$. As a result, the radiative transfer coefficients (i.e., the emission vector and the propagation matrix of the Stokes-vector transfer equation) corresponding to each line transition depend on the $\rho^K_Q(J)$ values of the upper ($J = J_u$) and lower ($J = J_l$) line levels. Once the self-consistent $\rho^K_Q(J)$ values are obtained at each point within the medium, PORTA solves the Stokes-vector transfer equation to obtain the emergent Stokes profiles for any desired line transition and line of
sight. Obviously, this last step is computationally cheap compared with the time needed to obtain the self-consistent solution for the $\rho^K_Q(J)$ elements.

When polarization phenomena are neglected the only nonzero density-matrix element is $\rho^0_0(J)$ (proportional to the overall population of each $J$ level) and the only nonzero Stokes parameter is the specific intensity $I(\nu, \Omega)$ for each of the radiative transitions in the model atom under consideration. This non-LTE problem of the 1st kind (e.g., Mihalas, 1978) is a particular case of the above-mentioned non-LTE problem of the 2nd kind. In other words, the 3D multilevel radiative transfer code described here can also be applied to solve the standard non-LTE multilevel problem on which much of today's quantitative stellar spectroscopy is based. For this reason, PORTA provides options for spectropolarimetry and for spectroscopy (see Appendix B).

An overview on 3D radiative transfer codes for unpolarized radiation can be found in Carlsson (2008). Information on numerical methods for the transfer of spectral line polarization can be found in some reviews (e.g., Trujillo Bueno, 2003; Nagendra & Sampoorna, 2009) and research papers (e.g., Rees et al., 1989; Paletou & Faurobert-Scholl, 1998; Manso Sainz & Trujillo Bueno, 1999, 2003, 2011; Sampoorna & Trujillo Bueno, 2010; Anusha et al., 2011). To the best of our knowledge this is the first time that a computer program suitable for massively parallel computers has been developed to solve the multilevel problem of the generation and transfer of spectral line polarization resulting from scattering processes and the Hanle and Zeeman effects in 3D stellar atmosphere models. The problem of the generation and transfer of spectral line polarization with partial frequency redistribution (PRD) is being increasingly considered in the literature (e.g., Sampoorna et al., 2010; Belluzzi et al., 2012), but assuming relatively simple model atoms suitable only for some resonance lines.
After presenting in Sect. 2 the formulation of the RT problem, in Sect. 3 we explain our formal solver of the 3D Stokes-vector transfer equation, which is based on Auer's (2003) suggestion of monotonic Bézier interpolation within the framework of the short-characteristics approach. An additional important point is the parallelization strategy we have developed for taking advantage of massively parallel computers, which we detail in Sect. 4 following our explanation of the formal solver. The iterative method we have implemented, explained in Sect. 5, is based on the non-linear multigrid method for radiative transfer applications proposed by Fabiani Bendicho et al. (1997). We present useful benchmarks and comparisons of the multigrid iterative option of our code with another option based on the Jacobian iterative method on which one-dimensional (1D) multilevel codes for solving the non-LTE problem of the 2nd kind are based (e.g., Manso Sainz & Trujillo Bueno, 2003; Štěpán & Trujillo Bueno, 2011). Finally, in Sect. 6 we present our conclusions with a view to future research.

We have already applied PORTA to investigate the intensity and linear polarization of some strong chromospheric lines in a model of the extended solar atmosphere resulting from state-of-the-art 3D magneto-hydrodynamic (MHD) simulations (e.g., Štěpán et al. 2012). However, for the benchmarks presented in this paper, whose aim is a detailed description of PORTA, we have found it more suitable to choose the 3D model atmosphere and the five-level atomic model detailed in Appendix A. The software and hardware tools are summarized in Appendix B.

2. The radiative transfer problem

The multilevel non-LTE problem considered here for the generation and transfer of spectral line polarization is that outlined in Sect.
3 of Štěpán & Trujillo Bueno (2011), where we assumed one-dimensional (1D), plane-parallel atmospheric models (see also Manso Sainz & Trujillo Bueno, 2003), while the aim here is to describe the computer program we have developed for solving in Cartesian coordinates the same multilevel radiative transfer problem but in three-dimensional (3D) stellar atmospheric models. As shown below, the development of a robust multilevel 3D code is not simply an incremental step with respect to the 1D case, given the need to develop and implement an accurate 3D formal solver, a highly convergent iterative scheme based on multiple grids, and a suitable parallelization strategy to take advantage of today's massively parallel computers.

A detailed presentation of all the physics and relevant equations necessary to understand the radiation transfer problem solved in this paper can be found in Chapter 7 of LL04. Our aim here is to solve jointly the Stokes-vector transfer equation (corresponding to each radiative transition in the model atom under consideration) and the statistical equilibrium and conservation equations for the multipolar components of the atomic density matrix (corresponding to each level $J$). We take into account the possibility of quantum coherence (or interference) between pairs of magnetic sublevels pertaining to any given $J$ level, but neglect quantum interference between sublevels pertaining to different $J$ levels. Neglecting $J$-state interference is a very suitable approximation for modeling the line-core polarization, which is where the Hanle effect in most solar spectral lines operates (see Belluzzi & Trujillo Bueno, 2011). In the absence of $J$-state interference, the general number of $\rho^K_Q(J)$ unknowns for each level $J$ is $(2J+1)^2$, at each spatial grid point. We note that in the unpolarized case there is only one unknown associated to each $J$ level (i.e., $\rho^0_0(J)$).

In this paper we focus on the multilevel model atom (see Sects. 7.3 and 7.13c of LL04), in which quantum interference between sublevels pertaining to different $J$ levels is neglected.
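The count of unknowns quoted above can be checked by enumerating the index pairs directly. The following is a minimal Python sketch, not part of PORTA; the function name `density_matrix_indices` is a hypothetical helper introduced only for illustration. It lists the $(K, Q)$ labels with $K = 0, \ldots, 2J$ and $Q = -K, \ldots, K$, whose number is $\sum_{K=0}^{2J}(2K+1) = (2J+1)^2$.

```python
from fractions import Fraction


def density_matrix_indices(J):
    """Return the (K, Q) labels of rho^K_Q(J): K = 0..2J, Q = -K..K.

    Hypothetical helper for illustration; J may be an integer or a
    half-integer (e.g. Fraction(3, 2) or 1.5).
    """
    j = Fraction(J)
    two_j = 2 * j
    if two_j.denominator != 1 or two_j < 0:
        raise ValueError("J must be a non-negative integer or half-integer")
    return [(K, Q) for K in range(int(two_j) + 1) for Q in range(-K, K + 1)]


# Number of unknowns per level in the absence of J-state interference.
for J in (0, Fraction(1, 2), 1, Fraction(3, 2), 2):
    n = len(density_matrix_indices(J))
    print(f"J = {J}: {n} unknowns, (2J+1)^2 = {(2 * J + 1) ** 2}")
```

In the unpolarized case only the $(K, Q) = (0, 0)$ entry of each list is retained.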
However, it is important to note that the same iterative method, formal solver, and the overall logical structure of our code PORTA are very suitable for solving the same type of problem but considering other model atoms and/or magnetic field regimes (see Chapter 7 of LL04).

The emission vector and the propagation matrix of the Stokes-vector transfer equation depend on the local values of the $\rho^K_Q$ elements (with $K = 0, \ldots, 2J$ and $Q = -K, \ldots, K$) of the upper ($i$) and lower ($j$) line levels (see Sect. 7.2.b in LL04). Given an estimation of these $\rho^K_Q(J)$ elements for each $J$ level at all spatial grid points, the formal solution of the Stokes-vector transfer equation for each radiative transition allows us to obtain the ensuing Stokes parameters at each spatial point within the medium, for each discretized frequency and ray direction. After angle and frequency integration one can obtain the radiation field tensors

$$ J^K_Q(ij) = \sum_{p=0}^{3} \oint \frac{d\Omega}{4\pi}\, \mathcal{T}^K_Q(p, \Omega) \int d\nu\, I_p(\nu, \Omega)\, \phi_{ij}(\nu - \nu_{ij}), \qquad (1) $$

where $\phi_{ij}$ is the line absorption profile and $\mathcal{T}^K_Q(p, \Omega)$ the spherical irreducible tensors given in Table 5.6 of LL04, and $(I_0, I_1, I_2, I_3)^{\rm T} \equiv (I, Q, U, V)^{\rm T}$ is the so-called Stokes vector. These radiation field tensors, defined for $K = 0, 1, 2$ and $Q = -K, \ldots, K$, specify the symmetry properties of the radiation field that illuminates each point within the medium. These quantities
are of fundamental importance because they determine the radiative rates that enter the statistical equilibrium equations (see Sect. 7.2.a in LL04). After the discretization of the spatial dependence these equations can be expressed as

$$ \mathcal{L}_l\, \boldsymbol{\rho}_l = \mathbf{f}_l, \qquad (2) $$

where $\mathcal{L}_l$ is a block-diagonal matrix formed by $N_l$ submatrices, $N_l$ being the number of points of the spatial grid of resolution level $l$ (the larger the positive integer number $l$ the finer the grid). $N_L \times N_L$ is the size of each submatrix, $N_L$ being the total number of $\rho^K_Q$ unknowns at each spatial grid point. The length of the vector $\boldsymbol{\rho}_l$ of $\rho^K_Q$ unknowns and of the known vector $\mathbf{f}_l$ is $N_L N_l$. The coefficients of the block-diagonal matrix $\mathcal{L}_l$ depend on the collisional rates, which depend on the local values of the thermodynamical variables, and on the radiative rates, which depend on the radiation field tensors $J^K_Q(ij)$, whose computation requires solving the Stokes-vector transfer equation for each radiative transition $ij$. Since the radiative transfer coefficients depend on the $\rho^K_Q$ unknowns the problem is non-linear, in addition to non-local.

To solve this type of problem we need a fast and accurate formal solver of the Stokes-vector transfer equation and a suitable iterative method capable of finding rapidly the density matrix elements $\boldsymbol{\rho}_l$ such that Eq. (2) is satisfied when the radiation field tensors, which appear in the block-diagonal matrix $\mathcal{L}_l$, are calculated from such $\boldsymbol{\rho}_l$ elements via the solution of the Stokes-vector transfer equation. The 1D multilevel code described in Appendix A of Štěpán & Trujillo Bueno (2011) is based on the DELOPAR formal solver proposed by Trujillo Bueno (2003) and on a Jacobian iterative scheme, similar to that applied by Manso Sainz & Trujillo Bueno (2003), but generalized to the case of overlapping transitions. We turn now to explain the new formal solver we have developed for 3D Cartesian grids that is based on monotonic Bézier interpolation.

3. BESSER: Monotonic Bézier formal solver of the Stokes-vector transfer equation

The transfer equation for the Stokes vector $\mathbf{I} = (I, Q, U, V)^{\rm T}$ can be written (e.g., Rees et al., 1989; Trujillo Bueno, 2003)

$$ \frac{d}{d\tau}\mathbf{I} = \mathbf{I} - \mathbf{S}_{\rm eff}, \qquad (3) $$

where $\tau$ is the optical distance along the ray under consideration ($d\tau = -\eta_I\, ds$, with $s$ the geometrical distance along the ray and $\eta_I$ the diagonal element of the $4 \times 4$ propagation matrix $\mathbf{K}$), $\mathbf{S}_{\rm eff} = \mathbf{S} - \mathbf{K}'\mathbf{I}$, with $\mathbf{K}' = \mathbf{K}/\eta_I - \mathbf{1}$ (where $\mathbf{1}$ is the unit matrix), and $\mathbf{S} = \boldsymbol{\epsilon}/\eta_I$, with $\boldsymbol{\epsilon} = (\epsilon_I, \epsilon_Q, \epsilon_U, \epsilon_V)^{\rm T}$ the emission vector resulting from spontaneous emission events. The formal solution of this equation is

$$ \mathbf{I}_O = \mathbf{I}_M\, e^{-\tau_{MO}} + \int_0^{\tau_{MO}} dt\, \left[ \mathbf{S}(t) - \mathbf{K}'(t)\,\mathbf{I}(t) \right] e^{-t}, \qquad (4) $$

where the ray or pencil of radiation of frequency $\nu$ propagates along the direction $\Omega$, from the upwind point M (where the Stokes vector $\mathbf{I}_M$ is assumed to be known) towards the spatial point O of interest (where the Stokes vector $\mathbf{I}_O$ is sought), and $t$ is the optical path along the ray (measured from O to M; see Fig. 1). The numerical solution of Eq. (4) allows us to obtain, from the current estimates of the emission vector $\boldsymbol{\epsilon}$ and propagation matrix $\mathbf{K}$, the Stokes parameters at each spatial grid point O within the 3D medium, for all discretized radiation frequencies and directions. We note that the unpolarized version of Eq. (4) can be easily obtained by taking $\mathbf{I}_O \to I_O$, $\mathbf{I}_M \to I_M$, $\mathbf{S}(t) \to S(t)$, and $\mathbf{K}'(t) \to 0$.

Fig. 1. Short-characteristics in a three-dimensional Cartesian rectilinear grid.

3.1. The short characteristics method

The short-characteristics (SC) method was proposed by Kunasz & Auer (1988) to solve the unpolarized version of Eq. (4) for the specific intensity (see also Auer & Paletou, 1994; Auer et al., 1994; Fabiani Bendicho & Trujillo Bueno, 1999). Consider three consecutive spatial points M, O, and P along the ray under consideration, with M the upwind point, P the downwind point, and O the point where the Stokes $I$ parameter is being sought, for the given frequency and ray direction (see Fig. 1).
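As a sanity check on the unpolarized limit of Eq. (4), the integral along a single short characteristic can be evaluated by brute-force quadrature. The sketch below is an illustration under stated assumptions, not the PORTA formal solver: the function name `formal_solution_scalar`, the callable `source`, and the use of composite Simpson quadrature are all choices made here for the example. For a constant source function $S$ the result should reduce to $I_O = I_M e^{-\tau_{MO}} + S\,(1 - e^{-\tau_{MO}})$.

```python
import math


def formal_solution_scalar(i_m, source, tau_mo, n=2000):
    """Unpolarized Eq. (4): I_O = I_M e^{-tau_MO} + int_0^{tau_MO} S(t) e^{-t} dt.

    `source(t)` is the source function along the ray, with t the optical
    path measured from O (t = 0) to M (t = tau_MO). The integral is
    evaluated with composite Simpson quadrature on n subintervals
    (an illustrative choice, not the scheme used in the paper).
    """
    if n % 2:
        n += 1  # Simpson's rule needs an even number of subintervals
    h = tau_mo / n

    def integrand(t):
        return source(t) * math.exp(-t)

    total = integrand(0.0) + integrand(tau_mo)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * integrand(k * h)
    return i_m * math.exp(-tau_mo) + total * h / 3.0


# Constant source function: compare with the closed-form result.
i_m, s, tau = 1.0, 2.0, 1.5
numeric = formal_solution_scalar(i_m, lambda t: s, tau)
exact = i_m * math.exp(-tau) + s * (1.0 - math.exp(-tau))
print(numeric, exact)
```

A practical short-characteristics solver avoids such quadrature by assuming an interpolant for $S(t)$ and integrating it analytically, which is the route followed in the remainder of this section.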
The aim of the original SC method is to solve the unpolarized version of Eq. (4) along the MO segment in order to compute the specific intensity $I(\nu, \Omega)$. The original SC method is based on the approximation of parabolic interpolation of the source function $S(t)$ between the upwind point M of the short characteristics, the grid point O, and the downwind point P (see Fig. 1). In 2D and 3D grids, the upwind and downwind points of the SC do not generally coincide with any spatial grid node and the radiation transfer quantities (i.e., the emission and absorption coefficients) have to be interpolated from the nearby 9-point (biquadratic case) or 4-point (bilinear case) stencils of the discrete grid points. In the unpolarized and polarized options of PORTA both biquadratic and bilinear interpolation are implemented. Bilinear interpolation is sufficient in the fine grids of today's MHD models. The upwind specific intensity or the Stokes vector $\mathbf{I}_M$ need to be interpolated from the same grid nodes. Proper topological ordering of the grid points is therefore necessary for every direction of the short characteristics. We note that the intersection points M and P may be located on a vertical plane of the grid instead of a horizontal one.

The DELO method proposed by Rees et al. (1989) can be considered a possible generalization of the scalar SC method to the radiative transfer problem of polarized radiation. That formal solver of the Stokes-vector transfer equation is, however, based on linear interpolation of the source function $\mathbf{S}_{\rm eff}$ between points M and O. Trujillo Bueno (2003) demonstrated that significantly more accurate solutions can be obtained by using instead a formal solver he calls DELOPAR, which is based on the choice in Eq. (4) of parabolic interpolation for $\mathbf{S}$ (between points M, O, and P) and linear interpolation for $\mathbf{K}'\mathbf{I}$ (between points M and O). He showed that with DELOPAR the accuracy of the self-consistent solution rapidly improves as the resolution level of the spatial grid is increased. The first version of
the computer program NICOLE of Socas Navarro et al. (2000), for the synthesis and inversion of Stokes profiles induced by the Zeeman effect, used DELOPAR as the formal solver.

Fig. 2. Parabolic and BESSER interpolation using the three successive points M, O, and P. Dotted line: parabolic interpolation may create a spurious extremum between points M and O. Solid line: interpolation using our BESSER method with continuous derivative at point O. The control points of the intervals, whose y-coordinates are denoted by $c_M$ and $c_P$, define tangents to the Bézier splines in their endpoints. The x-coordinates of the control points are located at the center of the corresponding intervals.

In smooth and/or suitably discretized model atmospheres, the DELOPAR method provides accurate results. However, in the presence of abrupt changes in the physical properties of the atmospheric model, the parabolic interpolation suffers from non-monotonic interpolation between otherwise monotonic sequences of discrete points. Such spurious extrema of the interpolant decrease the accuracy of the solution and can also lead to unrealistic or even unphysical Stokes parameters at the grid point O under consideration (see the dotted line in Fig. 2). In addition, the parabolic interpolation may occasionally lead to the divergence of the whole numerical solution.

To overcome these difficulties, Auer (2003) suggested an interpolation based on the use of monotonic Bézier splines. Some formal solvers based on this idea have already been implemented (Koesterke et al., 2008; Hayek, 2008; Štěpán & Trujillo Bueno, 2012; Holzreuter & Solanki, 2012; de la Cruz Rodríguez & Piskunov, 2013). In this section, we describe in detail the accurate formal solver we have developed, pointing out a significant difference with the original proposal of Auer (2003). We call it BESSER (BEzier Spline SolvER).

3.2. Monotonic spline interpolation with continuous first derivative

Following Auer (2003), our BESSER algorithm is based on the use of piecewise monotonic quadratic Bézier splines. The control points of the splines can be used to preserve monotonicity of the interpolant, because the interpolant is contained in an envelope defined by the tangents of the spline in its endpoints and of the control point which is located in the intersection of these tangents (see Fig. 2). As shown below, we achieve a smooth connection of the Bézier splines in the central point O by imposing a continuous first derivative of the interpolant. This improvement over the original treatment of Auer (2003) leads to a symmetrical interpolation independently of the choice of the interpolation direction (MOP for one direction of the ray propagation or POM for the opposite direction of the ray). An additional attractive feature is that our BESSER method always provides reliable values for the diagonal of the $\Lambda$-operator, i.e., in the interval $[0, 1)$, used in methods based on the Jacobi iteration.

Fig. 3. The treatment of an overshoot in the downwind interval OP by three different methods. Black solid line: BESSER implementation with continuous derivative at point O and the $c_P$ overshoot correction of the control point. Crosses: piecewise monotonic quadratic Bézier spline interpolation. Solid gray line: parabolic interpolation. We note that the piecewise monotonic quadratic Bézier interpolation coincides with the parabolic interpolation in the MO segment because the overshoot in the OP interval does not affect the upwind interpolation between M and O.

Given a quantity $y$ (e.g., the source function) defined at three successive points $x_M$, $x_O$, and $x_P$, we use two quadratic Bézier splines to interpolate $y$ between points M and O and between points O and P (see Fig. 2). First, we look for an optimal interpolation in the interval MO.
For the sake of simplicity, we parametrize the x-coordinate in this interval by a dimensionless parameter $u = (x - x_M)/h_M$, where $h_M = x_O - x_M$. The Bézier spline in the interval MO is a parabola passing through points M and O. The derivatives at such points are defined by the position of the control point whose y-coordinate is $c_M$ (see Fig. 2). The equation for such a spline reads (Auer, 2003)

$$ y(u) = (1 - u)^2\, y_M + 2u(1 - u)\, c_M + u^2\, y_O, \qquad u \in [0, 1]. \qquad (5) $$

Similarly, one can define a Bézier spline between points O and P by doing the formal changes $y_M \to y_O$, $y_O \to y_P$, and $u = (x - x_O)/h_P$, where $h_P = x_P - x_O$; the y-coordinate of the ensuing control point is denoted by $c_P$ (see Fig. 2).

We look for the values of $c_M$ and $c_P$ that satisfy the following conditions: (1) if the sequence $y_M$, $y_O$, $y_P$ is monotonic, then the interpolation is monotonic in the whole interval $[x_M, x_P]$; (2) if the sequence of $y_i$ values is not monotonic, then the interpolant has the only local extremum at O; and (3) the first derivative of the interpolant at point O should be continuous. The ensuing algorithm proceeds as follows:

1. Calculate the quantities $d_M = (y_O - y_M)/h_M$, $d_P = (y_P - y_O)/h_P$.
2. If the sequence $y_M$, $y_O$, $y_P$ is not monotonic (i.e., if $d_M d_P \le 0$), then set $c_M = c_P = y_O$ and exit the algorithm. The derivative of the splines at point O is equal to zero, leading to a local extremum at the central point.
3. Estimate the derivative at point O,

$$ y'_O = \frac{h_M d_P + h_P d_M}{h_M + h_P} \qquad (6) $$

(see Eq. 7 of Auer, 2003, and references therein). This derivative is equal to that provided at the same point by parabolic interpolation among points M, O, and P. Moreover, in contrast with Eq. (12) of Auer (2003), it is an expression that relates $y'$ (e.g., the source function derivative) linearly with the y-values (e.g., with the source function values).
4. Calculate the initial positions of the control points, $c_M = y_O - \frac{h_M}{2}\, y'_O$ and $c_P = y_O + \frac{h_P}{2}\, y'_O$.$^1$
5. Check that $\min(y_M, y_O) \le c_M \le \max(y_M, y_O)$. If the condition is satisfied, then go to step 7, otherwise continue with step 6.
6. If the condition in step 5 is not satisfied, then there is an overshoot of the interpolant in the interval MO. Set $c_M = y_M$, so that the first derivative at M is equal to zero and the overshoot is corrected. Since the value of $c_P$ is not of interest for the formal solution between M and O, exit the algorithm.
7. Check if $\min(y_O, y_P) \le c_P \le \max(y_O, y_P)$. If this condition is not satisfied, then set $c_P = y_P$ so that the overshoot in the interval OP is corrected.
8. Calculate the new derivative at O, $y'_O = (c_P - y_O)/(h_P/2)$, using the corrected value of $c_P$ calculated in step 7.
9. Calculate a new $c_M$ value to keep the derivative at O smooth (i.e., apply the $c_M$ expression of step 4 with the $y'_O$ of step 8). It is easy to realize that this change cannot produce an overshoot in the MO interval, hence the solution remains monotonic with a continuous derivative.

Steps 8 and 9 of the above-mentioned algorithm, dealing with correction of the overshoots in the downwind interval followed by modification of the $c_M$ upwind control point value, are not part of the original algorithm of Auer (2003) in which the derivative at point O can be discontinuous. We have found that it is suitable to guarantee the smoothness of the derivative, and this can be done with only a small increase in the computing time with respect to the DELOPAR method.
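The nine steps above translate almost line by line into code. The following Python sketch is an illustrative transcription, not the PORTA source; the names `besser_control_points` and `bezier` are hypothetical, and scalar inputs are assumed (in the polarized case the same procedure would be applied to each Stokes component of the source function).

```python
def besser_control_points(y_m, y_o, y_p, h_m, h_p):
    """Control-point ordinates (c_M, c_P) following steps 1-9 of Sect. 3.2.

    Illustrative sketch; h_m = x_O - x_M and h_p = x_P - x_O must be > 0.
    """
    # Step 1: one-sided slopes.
    d_m = (y_o - y_m) / h_m
    d_p = (y_p - y_o) / h_p
    # Step 2: non-monotonic data, force a local extremum at O.
    if d_m * d_p <= 0.0:
        return y_o, y_o
    # Step 3: derivative at O, Eq. (6).
    dy_o = (h_m * d_p + h_p * d_m) / (h_m + h_p)
    # Step 4: initial control points (equivalent to the parabola through M, O, P).
    c_m = y_o - 0.5 * h_m * dy_o
    c_p = y_o + 0.5 * h_p * dy_o
    # Steps 5-6: overshoot in MO; c_P is returned unchecked because it is
    # not needed for the formal solution between M and O.
    if not (min(y_m, y_o) <= c_m <= max(y_m, y_o)):
        return y_m, c_p
    # Step 7: overshoot in OP.
    if not (min(y_o, y_p) <= c_p <= max(y_o, y_p)):
        c_p = y_p
        # Steps 8-9: recompute the derivative at O and the upwind control
        # point so that the first derivative stays continuous at O.
        dy_o = (c_p - y_o) / (0.5 * h_p)
        c_m = y_o - 0.5 * h_m * dy_o
    return c_m, c_p


def bezier(u, y_start, c, y_end):
    """Quadratic Bezier spline of Eq. (5) for u in [0, 1]."""
    return (1.0 - u) ** 2 * y_start + 2.0 * u * (1.0 - u) * c + u ** 2 * y_end


# Abrupt but monotonic jump: the parabola through the three points would
# have c_M = -2.4 and undershoot below y_M = 0 in MO; here c_M is clipped.
print(besser_control_points(0.0, 0.1, 10.0, 1.0, 1.0))
# Samples of an exact parabola (y = x^2 at x = 1, 2, 3) are left untouched.
print(besser_control_points(1.0, 4.0, 9.0, 1.0, 1.0))
```

Because steps 8 and 9 only act when the downwind control point has been clipped, data that already lie on a parabola without overshoots are reproduced exactly, consistent with the statement that BESSER and parabolic interpolation coincide in that case.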
Our BESSER interpolation is stable; that is, the interpolant varies smoothly with smooth changes of the M, O, and P points. No abrupt changes of the splines occur that could negatively affect the stability of the iterative method. In contrast to some other formal solvers based on the idea of quadratic Bézier splines (e.g., one of the two Bézier methods discussed by de la Cruz Rodríguez & Piskunov, 2013), our BESSER algorithm guarantees that a monotonic sequence of the MOP points leads to a monotonic interpolant in all situations. This fact is of critical importance in 2D and 3D grids in which $\tau_{MO}$ and $\tau_{OP}$ may differ significantly because of unpredictable intersections of the grid planes, especially if periodic boundary conditions are considered. Such large differences often lead to overshoots, unless treated by BESSER or a similarly suitable strategy.

Formal solvers based on cubic Bézier splines (e.g., de la Cruz Rodríguez & Piskunov, 2013) could be developed to preserve the continuity of $y'_O$, but they may fail to accurately interpolate quadratic functions even in fine grids when using Auer's (2003) Eq. (12) for $y'_O$ (see Sect. 3.4). An alternative formal solver, which uses cubic Hermite splines, has been presented by Ibgui et al. (2013). However, the way of fixing the derivatives at the end points M and P of the SC to the values corresponding to the linear interpolation case may cause loss of accuracy of the formal solver.

$^1$ The control points calculated this way lead to a unique parabolic interpolation among points MOP. If the algorithm is stopped here, the resulting formal solver would be equivalent to the standard parabolic interpolation.

3.3. Formal solution of the vectorial radiative transfer equation with BESSER

The application of the Bézier interpolation for calculating the formal solution of Eq. (4) proceeds as follows. We assume that the Stokes components of the vectorial source function $\mathbf{S}(t)$ vary, between points M and O, according to Eq.
(5) with the control points calculated using the BESSER algorithm described in the previous section. The term $\mathbf{K}'(t)\,\mathbf{I}(t)$ is assumed to change linearly in the same interval, as in the DELOPAR method. The integral in Eq. (4) can then be evaluated analytically and the Stokes parameters at point O can be expressed in the form (see Trujillo Bueno, 2003, for details of an analogous derivation using parabolic interpolation of $\mathbf{S}$)

$$ \boldsymbol{\kappa}^{-1}\, \mathbf{I}_O = \left[ e^{-\tau_{MO}} - \psi_M \mathbf{K}'_M \right] \mathbf{I}_M + \omega_M \mathbf{S}_M + \omega_O \mathbf{S}_O + \omega_C\, \mathbf{c}_M, \qquad (7) $$

where

$$ \boldsymbol{\kappa}^{-1} = \mathbf{1} + \psi_O \mathbf{K}'_O \qquad (8) $$

is a $4 \times 4$ matrix and $\mathbf{1}$ is the unit matrix. Multiplying Eq. (7) by $\boldsymbol{\kappa}$ gives the desired vector of Stokes parameters $\mathbf{I}_O$ at point O. The coefficients $\psi_M$ and $\psi_O$ are the usual coefficients resulting from linear interpolation,

$$ \psi_M = \frac{1 - e^{-\tau_{MO}}\,(1 + \tau_{MO})}{\tau_{MO}}, \qquad (9) $$

$$ \psi_O = \frac{e^{-\tau_{MO}} + \tau_{MO} - 1}{\tau_{MO}}. \qquad (10) $$

Using the substitutions $h_M = \tau_{MO}$ and $u = 1 - t/\tau_{MO}$ in Eq. (5), one obtains for the $\omega_i$ coefficients the expressions

$$ \omega_M = \frac{2 - e^{-\tau_{MO}}\,(\tau_{MO}^2 + 2\tau_{MO} + 2)}{\tau_{MO}^2}, \qquad (11) $$

$$ \omega_O = 1 - 2\,\frac{e^{-\tau_{MO}} + \tau_{MO} - 1}{\tau_{MO}^2}, \qquad (12) $$

$$ \omega_C = 2\,\frac{\tau_{MO} - 2 + e^{-\tau_{MO}}\,(\tau_{MO} + 2)}{\tau_{MO}^2}. \qquad (13) $$

It is important to note that the accuracy of these expressions decreases as $\tau_{MO} \ll 1$, due to the limited precision of the floating point computer arithmetics. Therefore, for small upwind optical paths we use instead the Taylor expansion of such expressions calculated at $\tau_{MO} = 0$ (see Table 1).

An important quantity used in Jacobi-based iterative methods for the solution of non-LTE problems is the diagonal of the monochromatic $\Lambda$ operator at the point O under consideration, $\Lambda_{\Omega\nu}$. It can be easily calculated from Eq. (7) for $\mathbf{I}_O$, by setting $\mathbf{I}_M = \mathbf{S}_M = \mathbf{S}_P = (0, 0, 0, 0)^{\rm T}$ and $\mathbf{S}_O = (1, 0, 0, 0)^{\rm T}$. It follows that $\Lambda_{\Omega\nu} = \omega_O + \omega_C\, c_M$. Given that in this case the source function has a local maximum at point O, we have $c_M = 1$, and we finally arrive at

$$ \Lambda_{\Omega\nu} = \omega_O + \omega_C = 1 + 2\,\frac{e^{-\tau_{MO}}\,(1 + \tau_{MO}) - 1}{\tau_{MO}^2}. \qquad (14) $$

In contrast to the familiar parabolic solvers, no information about the interpolation coefficients in the preceding point is
needed to determine the diagonal of the $\Lambda$ operator at the point O under consideration. It is easy to show that $\Lambda_{\Omega\nu} \in [0, 1)$, which is an important condition for the stability of the iterative method used to solve any non-LTE problem. This is particularly important for solving three-dimensional problems in which the upwind point M does not generally coincide with any grid node.

We have found by numerical experimentation that one Jacobi iteration using our BESSER formal solver is only 1% slower than when using instead DELOPAR, because of the need to determine the $c_M$ and $c_P$ values of the control points following the algorithm described in Sect. 3.2.

Table 1. Taylor expansion of the $\omega_{M,O,C}$ coefficients for small optical path intervals

Coefficient | max t | Expansion
$\omega_M$ | 0.14 | (t(t(t(t(t(t((140 − 18t)t − 945) + 5400) − 25200) + 90720) − 226800) + 302400))/907200
$\omega_O$ | 0.18 | (t(t(t(t(t(t((10 − t)t − 90) + 720) − 5040) + 30240) − 151200) + 604800))/1814400
$\omega_C$ | 0.18 | (t(t(t(t(t(t((35 − 4t)t − 270) + 1800) − 10080) + 45360) − 151200) + 302400))/907200

Notes. Taylor expansion of the interpolation coefficients of the BESSER method for small $\tau_{MO}$ values (for the sake of notational simplicity we use $t \equiv \tau_{MO}$). Column 2 of the table indicates the approximate maximum value of $\tau_{MO}$ for which this expansion is more accurate, using double precision arithmetics, than the expressions given by Eqs. (11)-(13). We use the Horner rule in Col. 3, which provides a better numerical accuracy and also reduces the number of multiplications in comparison to the explicit expansion of the Taylor power series. The third row is the expansion of the control-point coefficient $\omega_C$ of Eq. (13).

Fig. 4. Each pair of curves shows, for a spatial grid of given resolution, the maximum relative change versus the iteration number using the Jacobi method with DELOPAR (dotted lines) and with BESSER (solid lines). We note that the finer the grid the slower the convergence rate, but that for any given spatial resolution both formal solvers give the same convergence rate.
The computation of $\boldsymbol{\kappa}$ and $\boldsymbol{\kappa}^{-1}$ (see Eq. 8), the calculation of the transfer coefficients, and their interpolation in the upwind and downwind points takes most of the computing time per iterative step.

It is also important to note that the convergence rate of multilevel iterative methods using BESSER as formal solver is virtually identical to that achieved using DELOPAR (see an example of the Jacobi method in Fig. 4). If the atmospheric model used is sufficiently smooth and no abrupt changes of the source function are present, the accuracy of BESSER and that of DELOPAR are virtually identical. This is not surprising because both formal solvers produce identical results in the absence of overshoots.

Fig. 5. Variation along the ray direction of the source function (Eq. 15) and of the corresponding specific intensity (Eq. 17) calculated analytically.

3.4. Accuracy of the BESSER formal solver

To demonstrate the accuracy of our BESSER formal solver we consider the RT problem of an arbitrary ray propagating in an infinite medium having constant opacity and a source function variation along the ray direction given by the expression (for the sake of simplicity we consider the unpolarized case)

$$ S(z) = \sigma(2, -5, z)\, \sigma(-2, 5, z), \qquad (15) $$

where the sigmoid function $\sigma$ reads

$$ \sigma(a, d, z) = \frac{1}{1 + e^{-a(z - d)}} \qquad (16) $$

and $z$ is the geometrical distance along the ray, measured in units of the length scale for which $\Delta z = \Delta\tau$, with $\Delta z$ the grid spacing and $\Delta\tau$ the ensuing optical distance. As shown by the solid line in Fig. 5, which corresponds to $d = 5$ in Eq. (16), the source function exponentially rises around $z = -d$, reaches its maximum value around $z = 0$, and then exponentially decreases around $z = d$. Assuming $I_a(-\infty) = 0$, the analytical solution of the radiative transfer equation for the specific intensity propagating towards positive $z$ values is (see the dotted line in Fig. 5)

$$ I_a(z) = \frac{e^{15 - z}}{e^{20} - 1} \left( e^{10} \arctan e^{z - 5} - \arctan e^{z + 5} \right). \qquad (17) $$

We have calculated numerically the specific intensity for the above-mentioned one-ray problem by solving the radiative transfer equation in several spatial grids of increasing resolution and using various formal solvers. Our aim is to compare the accuracy
Fig. 6. Maximum relative true error $E(\Delta\tau)$ calculated as a function of the uniform grid spacing $\Delta\tau$, using different formal solvers. Solid line: our BESSER method. Dotted line: quadratic Bézier with the derivative at point O calculated using the expression given by Fritsch & Butland (1984) (see also Eq. 12 of Auer, 2003). Dashed line: as in the previous method, but applying the $c_M$ overshoot correction (see Eq. 11 of Auer, 2003). Dashed-dotted line: standard SC method with linear interpolation. Three-dotted-dashed line: standard SC method with parabolic interpolation.

of our BESSER formal solver with other short-characteristics methods. To this end, we use the analytical solution given by Eq. (17) to compute the maximum true error

$$ E(\Delta\tau) = \mathrm{Max}\left(\frac{|I(z) - I_a(z)|}{I_a(z)}\right), \tag{18} $$

among all the spatial points along the ray for which the solution has been obtained (i.e., $-12 \le z \le 12$).² The formal solvers we have applied are listed in the caption of Fig. 6, which gives $E(\Delta\tau)$ as a function of the grid spacing $\Delta\tau$. Surprisingly, the worst performance is that corresponding to the Bézier formal solver based on the central-point derivative $y'_O$, calculated using the weighted harmonic mean derivatives of Fritsch & Butland (1984) and ignoring the overshoot test in the upwind interval (Bezier, dotted line). If the correction to the upwind overshoot is applied, the method performs much better, at least in the coarsest grids in Fig. 6 (CBezier, dashed line). In finer grids, however, the accuracy is still lower than that of BESSER and even than that provided by the standard SC method with parabolic interpolation (dashed-three-dotted line). The reason is that the estimation of the central-point derivative $y'_O$ provided by Fritsch & Butland (1984) (see also Eq. 12 of Auer, 2003) generally does not allow the second-order polynomials to be interpolated exactly.³ In the coarsest grids in Fig.
6, the maximum true error depends not only on the grid spacing but also on the particular position of the grid points with respect to the $z = 0$ position of the source function maximum. Consequently, in the region in Fig. 6 corresponding to the coarsest grids we observed an oscillatory behavior of the maximum true error. However, the overall variation of the error with $\Delta\tau$ remains the same independent of the particular location of the grid nodes.

² In the numerical calculation, we use the boundary intensity $I(-12) = I_a(-12) \approx 2.77176 \times 10^{-7}$.
³ We point out, however, that the quadratic Bezier method seems to provide reliable results in some cases of practical interest (de la Cruz Rodríguez & Piskunov, 2013).

Fig. 7. Domain decomposition in the z-axis, with $N_\ell$ denoting the number of discrete heights within domain $D_\ell$. The solid line $z^{\ell}_{N_\ell} = z^{\ell+1}_1$ indicates the boundary layer of the domains $D_\ell$ and $D_{\ell+1}$, while the dashed lines indicate the ghost layers $z^{\ell}_{N_\ell-1}$ and $z^{\ell+1}_2$.

4. Parallelization using the Snake Algorithm

The slowest part in the numerical computations needed for solving a non-LTE radiative transfer problem is the formal solution of the radiative transfer equation because the number of floating point operations needed to compute the radiation field at all the spatial grid points far exceeds the number of operations needed to solve the SEEs. In particular, an accurate modeling of the spectral line polarization produced by anisotropic radiation pumping (scattering line polarization) and its modification by the Hanle effect requires the use of very fine frequency and direction quadratures that increase the computing time of the formal solution. The formal solution for computing a single Stokes parameter at any given grid point for any given frequency and ray direction typically takes about 1 µs on today's workstations.
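The per-point cost just quoted can be scaled up to a full 3D grid by simple multiplication. The sketch below does so for a grid of $500^3$ points, 100 frequencies, and 160 directions, assuming that each of the four Stokes parameters costs the same 1 µs and that nothing else contributes to the run time. The function name and its parameters are illustrative.

```python
def formal_solution_seconds(n_points, n_freq, n_dir, n_stokes=4, t_single=1e-6):
    """Serial wall-clock estimate (in seconds) of one formal solution.

    t_single is the assumed cost of one Stokes parameter at one grid point
    for one frequency and one ray direction.
    """
    return n_points * n_freq * n_dir * n_stokes * t_single


seconds = formal_solution_seconds(n_points=500**3, n_freq=100, n_dir=160)
days_per_formal_solution = seconds / 86400.0
years_for_100_iterations = 100 * days_per_formal_solution / 365.25

if __name__ == "__main__":
    print(f"one formal solution: {days_per_formal_solution:.1f} days")
    print(f"100 Jacobi iterations: {years_for_100_iterations:.1f} years")
```

The estimate grows linearly in every factor, so refining the angular or frequency quadrature is as expensive as enlarging the spatial grid by the same factor.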
It is easy to estimate that one formal solution for computing the four Stokes parameters in a 3D grid with $500^3$ points, 100 radiation frequencies, and 160 discrete directions will take about 90 days on the same computer. The full non-LTE solution requiring one hundred Jacobi iterations would take 25 years. The use of powerful methods for multilevel radiative transfer applications, such as the non-linear multigrid method proposed by Fabiani Bendicho et al. (1997), is necessary but not sufficient for doing multilevel radiative transfer calculations in realistic 3D atmospheric models resulting from state-of-the-art MHD simulations. To this end, we need to use massively parallel computers, which requires a suitable parallelization of the RT code. In this section, we describe a novel algorithm for performing the numerical formal solution using multiple CPU cores. As shown below, simultaneous parallelization via domain decomposition and parallelization in the frequency domain results in a very efficient radiative transfer code that shows an optimal scaling with the number of CPU cores.

4.1. Domain decomposition

The computer memory needed to store large model grids exceeds the capacity of the computing nodes of today's supercomputers

by at least one order of magnitude. The capacity of computers will continue to increase in the future, but the same will happen with the scale and resolution of the MHD models. It is therefore necessary to reduce the memory demands per CPU core. This can be achieved through the technique of domain decomposition, by means of which different parts of the model grid are treated simultaneously (in parallel), each running on a different CPU core. This task is non-trivial in radiative transfer because of the need to use a well-defined sequence of grid points. This complicates the treatment of radiative transfer in generally decomposed grids where different parts are to be solved simultaneously. A possible solution to this problem comes from the fact that the non-LTE problem needs to be solved by iteration and the radiation field at the domain boundaries can be fixed in every iteration using the radiation field calculated in the previous iteration. After a sufficient number of iterations, a self-consistent solution to the problem is eventually found. The disadvantage of this approach is that fixing the boundary radiation field of the domains reduces the information flow between the domains, which leads to a scaling of the algorithm proportional to $P^{2/3}$, with $P$ the number of CPU cores (Hayek, 2008). In Sect. 4.4 we discuss the advantages and disadvantages of our approach.

Given a Cartesian grid with $N_x N_y N_z$ discrete points, we only divide it along the z-axis into a consecutive sequence of $L$ domains $D_\ell$, $\ell = 1, \ldots, L$ (see Fig. 7, which shows a 2D instead of a 3D grid, for simplicity). The horizontal extension of each domain $D_\ell$ is always the same, and identical to that corresponding to a serial solution of the same non-LTE problem. Each of these domains is treated by one or more CPU cores in parallel with others according to the algorithm described in the following section.
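A minimal sketch of such a z-only decomposition is given below. It assumes, as Fig. 7 suggests, that neighbouring domains share their common boundary layer, and that $N_z - 1$ is an integer multiple of $L$; ghost layers are not included. The function name and the zero-based index convention are choices made for this example only.

```python
def decompose_z(n_z, n_domains):
    """Split z-indices 0..n_z-1 into consecutive domains sharing one boundary layer.

    Returns a list of inclusive (first, last) index pairs, one per domain.
    """
    if n_domains < 1 or (n_z - 1) % n_domains != 0:
        raise ValueError("(n_z - 1) must be an integer multiple of n_domains")
    step = (n_z - 1) // n_domains
    return [(k * step, (k + 1) * step) for k in range(n_domains)]


domains = decompose_z(n_z=101, n_domains=20)

if __name__ == "__main__":
    for k, (first, last) in enumerate(domains, start=1):
        print(f"D_{k}: z-indices {first}..{last} ({last - first + 1} layers)")
```

Because the shared layer is counted in both neighbours, each domain holds $(N_z - 1)/L + 1$ layers, while the horizontal extent of every domain remains the full $N_x \times N_y$ plane.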
The boundary layer $z^{\ell}_{N_\ell} = z^{\ell+1}_1$ of the successive domains $D_\ell$ and $D_{\ell+1}$ has to be taken into account in each of the domains. Ghost layers have to be included in both domains if the formal solver of the transfer equation is of parabolic accuracy and/or the multigrid method is used. This ghost layer is needed to calculate the radiation transfer coefficients at the downwind point P (see Fig. 1) when point O is in the boundary layer $z^{\ell}_{N_\ell} = z^{\ell+1}_1$. We note that if the interpolation of the upwind and downwind radiation transfer coefficients is bilinear instead of biquadratic, only one ghost layer is needed for each of these domains. This is usually a good approximation given the spatial fineness of today's MHD models. Given that the boundary layer that is common to each pair of domains has to be treated twice, the number $N_\ell$ of z-points per domain is equal to $(N_z - 1)/L + 1$ (assuming that $L$ is such that $(N_z - 1)/L$ is an integer). As shown below, it is possible to divide the z-axis into a large number of intervals without any serious effect on the efficiency. In the numerical experiments discussed below, we have used values as small as $(N_z - 1)/L + 1 = 6$. For a sufficient number of computing nodes, it follows that the memory requirements per domain scale as $O(N_x N_y)$. Given the large spatial extension of the 3D stellar atmospheric models that result from today's MHD simulations (e.g., Leenaarts et al., 2012), the domain decomposition strategy described above is very suitable for reducing to reasonable values the memory requirements per computing core.

4.2. 3D formal solution in the domain-decomposed models: The Snake Algorithm

In contrast to the usual 3D domain decomposition technique, it is possible to fulfil the requirement of a topologically sorted grid without the need to iterate the boundary conditions. For reasons that will become obvious below, we call the resulting method the Snake Algorithm (SA). It proceeds as follows.
The formal solution of the RT equation allows us to obtain, at each spatial grid point $(i_x, i_y, i_z)$ of domain $D_\ell$, the Stokes parameters for all the discretized directions and radiation frequencies $(\Omega_i, \nu_j)$. The total number of these points is $N_\Omega N_\nu$, where $N_\Omega$ denotes the number of ray directions and $N_\nu$ the number of radiation frequencies. Without loss of generality, let us consider the formal solution of the RT equation in domain $D_\ell$ for the directions $\Omega_i$ having $\Omega_z > 0$ (i.e., for rays propagating along directions of increasing $z$, from the lower boundary of $D_1$ to the upper boundary of $D_L$). For $(\Omega_1, \nu_1)$, we solve the RT equation starting at the lower boundary $z^1_1$ and proceeding upwards to the domain boundary layer $z^1_{N_1}$ (see Fig. 7). If the atmospheric model assumes periodic boundary conditions in the horizontal $(x, y)$ directions, we take them into account following the strategy of Auer et al. (1994). Since the domain is not decomposed in the horizontal directions, in each of the domains our algorithm works exactly as it does in the serial solution. Once the radiation field for $(\Omega_1, \nu_1)$ is known at the last plane $z^1_{N_1}$ of the domain, we start the process responsible for doing the formal solution in the next domain $D_2$. In addition to the Stokes parameters, the $\Lambda_{\Omega_1\nu_1}$ value has to be provided to the next domain. At this point, the process $D_1$ starts solving the radiative transfer equation for $(\Omega_1, \nu_2)$, beginning again at the lower boundary. After reaching the $z^1_{N_1}$ plane, the radiative data are provided to the $D_2$ domain and the solution continues with $(\Omega_1, \nu_3)$. These steps are repeated until the radiation transfer equation is solved for all the discrete frequencies. Then, it continues in an analogous way with $(\Omega_2, \nu_1)$ and for all the directions with $\Omega_z > 0$. The solution in domain $D_2$ proceeds in an exactly analogous way.
After receiving the radiation data $(\Omega_1, \nu_1)$ from domain $D_1$, the RT equation is solved in planes $z^2_2$, $z^2_3$, etc., up to the layer $z^2_{N_2}$, from which the resulting radiation field and $\Lambda_{\Omega_1\nu_1}(x, y, z^2_{N_2})$ are propagated to the grid points $(x, y, z^3_1)$ of domain $D_3$. At a given time, each domain solves the RT equation for different $(\Omega_i, \nu_j)$, such that the difference between two successive processes (domains) is just one step in the discrete space of directions and frequencies. The outgoing radiation from one domain becomes the incoming radiation for the following domain. The resulting snake of length $L$ clambers through the half of the parameter space with directions $\Omega_z > 0$ and, after this is finished, it proceeds back in an analogous way by solving the radiative transfer problem for all $\Omega_z < 0$ directions.

Figure 8 visualizes the whole process using as an example a formal solution with five radiation frequencies $\nu_{j=1,\ldots,5}$ and six directions $\Omega_{i=1,\ldots,6}$, running in a domain-decomposed grid with six domains, each of which is indicated by a numbered rectangle. Each step of the solution in every domain corresponds to a formal solution of the RT equation for one direction and one frequency. In the next step, the snake of processes moves by one $(\Omega_i, \nu_j)$ point and the radiation field data are passed between the successive domains. In this example, every single process solves the RT problem in the dedicated domain which contains $N/6$ grid nodes. These processes parse the discrete space of directions and frequencies in the well-defined order indicated in the figure, until the whole direction-frequency space is passed through by all the processes. At the beginning and at the end of the formal solution, some of the processes are inactive, waiting for other processes to finish their work. The total number of time steps of the formal solution is 35 for the 30 direction-frequency points,

which implies a speedup factor $30 \times 6/35 \approx 5.1$ with respect to the serial solution. This can be easily verified by using Eq. (24) with $N_\Omega = 12$ (see below). If the domains have the same or similar $N_\ell$ values (which is easy to achieve in practice), each process (i.e., each domain) solves only the fraction $1/L$ of the whole radiation transfer problem. This leads, in principle, to an almost linear scaling with the number of spatial domains.

For practical reasons (i.e., optimization in the treatment of line absorption profiles, because it is not practical to store them for every grid point and ray direction), the line absorption profiles are obtained by interpolation, using a pre-calculated database created at the beginning of the non-LTE solution. This implies a significant reduction of memory and computing time. After calculating a line profile from the database (where the profiles are normalized to unit area), one may need to renormalize it using the chosen frequency quadrature (e.g., in the presence of Doppler shift gradients caused by macroscopic plasma motions). This only needs to be done once per direction if the loop over frequencies is the inner loop because the normalization factor only depends on direction. In summary, it is more convenient if the directions are in the outer loop and the frequencies in the inner loop of the algorithm, so that our snake parallelization strategy proceeds row by row as indicated in Fig. 8, instead of column by column.

Fig. 8. Clarification of the Snake Algorithm (SA) using an example of a formal solution with five radiation frequencies $\nu_{j=1,\ldots,5}$ and six directions $\Omega_{i=1,\ldots,6}$, running in a domain-decomposed grid with six domains. We note that only the solution for rays $\Omega_z > 0$ is shown in this figure. See the main text for details.
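The step count of the Fig. 8 example can be reproduced with a toy model of the pipeline. The sketch below assumes that every (domain, direction, frequency) task takes one unit of time, that communication is free, and that a domain may start a task as soon as it has finished its previous task and the domain below has delivered the same $(\Omega_i, \nu_j)$ point. It covers only the upward half ($\Omega_z > 0$). The function names are illustrative, and the closed-form expression is the one the text refers to as Eq. (24).

```python
from itertools import product


def snake_steps(n_domains, n_dir_up, n_freq):
    """Number of unit time steps needed by the upward sweep of the snake."""
    # Directions in the outer loop, frequencies in the inner loop (row by row).
    tasks = list(product(range(n_dir_up), range(n_freq)))
    finish = [[0] * len(tasks) for _ in range(n_domains)]
    for d in range(n_domains):
        for k in range(len(tasks)):
            ready_self = finish[d][k - 1] if k > 0 else 0    # own previous task done
            ready_below = finish[d - 1][k] if d > 0 else 0   # data from lower domain
            finish[d][k] = max(ready_self, ready_below) + 1
    return finish[-1][-1]


def speedup_closed_form(n_domains, n_dir_total, n_freq):
    """S(L) = L / (1 + lambda), with lambda = 2 (L - 1) / (N_Omega N_nu)."""
    lam = 2.0 * (n_domains - 1) / (n_dir_total * n_freq)
    return n_domains / (1.0 + lam)


steps = snake_steps(n_domains=6, n_dir_up=6, n_freq=5)
serial_steps = 6 * 6 * 5          # same work done by one process, in the same units
speedup = serial_steps / steps

if __name__ == "__main__":
    print(steps, speedup, speedup_closed_form(6, 12, 5))
```

In this idealized model the number of steps is simply the number of direction-frequency points plus $L - 1$, which gives 35 steps for the example; the few idle steps at the start and end of the sweep are what keeps the speedup slightly below $L$.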
Concerning the implementation of the algorithm, it is important to note that it is crucial to use non-blocking routines to propagate the radiation data between the successive domains. In other words, the RT calculation in domain $D_\ell$ proceeds by solving the next $(\Omega_i, \nu_j)$ point immediately after the lower-boundary radiation data from domain $D_{\ell-1}$ arrives. It does not have to wait for $D_{\ell+1}$ to retrieve the $(\Omega_i, \nu_{j-1})$ data. Consequently, the snake in Fig. 8 can temporarily become split, with two successive processes $\ell$ and $\ell + 1$ processing non-subsequent points in the discrete $\Omega \times \nu$ space (see Fig. 8). If non-blocking communication is not used, the computing performance can decrease significantly because a significant amount of time is spent waiting for the synchronization of the whole grid.

Fig. 9. Speedup $S(L)$ of the solution of the RT equation due to domain decomposition with the Snake Algorithm. The number of CPU cores on the horizontal axis is equal to the number $L$ of spatial domains. The diagonal dotted line indicates the theoretical curve of linear scaling. The scaling of the algorithm is almost a linear function of the number of domains. The small departure from linearity is mainly due to the cost of the inter-process communication.

4.2.1. Scaling of the algorithm

In the serial solution, the computing time needed for the formal solution of the RT equation is proportional to the total number of spatial grid points $N = N_x N_y N_z$, the number of directions $N_\Omega$, and the number of radiation frequencies $N_\nu$. We shall denote this computing time by

$$ T_1 = \alpha N N_\Omega N_\nu, \tag{19} $$

with $\alpha$ a constant of proportionality. In the domain-decomposed parallel solution (still assuming that the first half of the integrations is performed along the directions having $\Omega_z > 0$), the duration of the full formal solution in the whole grid is equal to the time interval between the first $(\Omega_1, \nu_1)$ and the last $(\Omega_{N_\Omega}, \nu_{N_\nu})$ ray integrations in the domain $D_1$.
Given that the number of grid nodes in domain $D_1$ is equal to $N/L$, the time spent solving the first half of the rays in domain $D_1$ is

$$ t_a = \alpha \frac{N}{L} \frac{N_\Omega}{2} N_\nu. \tag{20} $$

The process responsible for domain $D_1$ then waits for the last upward ray $(\Omega_{N_\Omega/2}, \nu_{N_\nu})$ to propagate through $L - 1$ domains to the upper grid boundary $z_{N_z}$. The duration of this wait equals

$$ t_b = \alpha (L - 1) \frac{N}{L}. \tag{21} $$

The same time $t_b$ is taken by the first downward ray $(\Omega_{N_\Omega/2+1}, \nu_1)$ to propagate from $D_L$ to the upper boundary of $D_1$. The computing time needed for the solution of the $N_\Omega/2$ downward rays in the $D_1$ domain is, again, equal to $t_a$. The duration of the full formal solution in the $L$-decomposed grid is, therefore, equal to $T_L = 2(t_a + t_b)$. It follows from the equations above that

$$ T_L = \alpha \frac{N}{L} \left[ N_\Omega N_\nu + 2(L - 1) \right]. \tag{22} $$

We define the speedup of the parallel solution with respect to the serial solution as

$$ S(L) = \frac{T_1}{T_L}, \tag{23} $$

which, using Eqs. (19) and (22), is equal to

$$ S(L) = L\,\frac{1}{1 + \lambda}, \tag{24} $$

where

$$ \lambda = \frac{2(L - 1)}{N_\Omega N_\nu}. \tag{25} $$

If $\lambda \ll 1$, then the speedup in the formal solution is practically linear with the number of domains $L$. This is equivalent to saying that $N_\Omega N_\nu \gg L$, i.e., to a situation in which the number of direction-frequency points is much larger than the number of domains. Given that in the transfer of polarized radiation the typical orders of magnitude of the relevant quantities are $N_\Omega \sim 10^2$, $N_\nu \sim 10^3$, and $L \sim 10^2$, we obtain $\lambda \sim 10^{-3}$. It is easy to see from Eq. (24) that SA always accelerates the solution if $N_\Omega > 2$ and $L > 1$.

Fig. 10. The Snake Algorithm applied to the problem in Fig. 8 with $L = 3$ and $M = 5$. Here, every process has a single dedicated radiation frequency, i.e., $N_m = 1$. Given the small number of directions and frequencies in this illustrative example, we have $\lambda = 2/3$ and the speedup $S(LM) = 9$. See the text for details.

4.3. Parallelization in the radiation frequencies

The radiation frequencies $\nu_{i=1,\ldots,N_\nu}$ can be grouped into $M$ intervals, each containing the $N_m = N_\nu/M$ discrete frequencies.⁴ The Snake Algorithm can be applied in parallel to each of these frequency blocks.
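One simple way to form such frequency blocks is sketched below. It assumes contiguous blocks of frequency indices and, when $N_\nu/M$ is not an integer, lets the block lengths differ by at most one so that the work per block stays similar. The function name and the zero-based indexing are choices made for this example, not a description of the actual code.

```python
def frequency_blocks(n_freq, n_blocks):
    """Split frequency indices 0..n_freq-1 into n_blocks contiguous, balanced blocks."""
    if not 1 <= n_blocks <= n_freq:
        raise ValueError("need 1 <= n_blocks <= n_freq")
    base, extra = divmod(n_freq, n_blocks)
    blocks, start = [], 0
    for m in range(n_blocks):
        size = base + (1 if m < extra else 0)  # the first `extra` blocks get one more
        blocks.append(range(start, start + size))
        start += size
    return blocks


blocks_even = frequency_blocks(n_freq=5, n_blocks=5)      # one frequency per block
blocks_uneven = frequency_blocks(n_freq=100, n_blocks=8)  # lengths 13 or 12

if __name__ == "__main__":
    print([len(b) for b in blocks_even])
    print([len(b) for b in blocks_uneven])
```

Each block can then be handed to its own snake of $L$ spatial domains, so that the total number of processes is $L \times M$.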
The only difference with respect to the algorithm described in Sect. 4.2 is that the solution in the spatial domains $D_\ell$ is only performed in a sub-space of $(\Omega_i, \nu_j)$ in which $j = j^m_1, \ldots, j^m_{N_m}$. Since this can be done in parallel for all the $M$ blocks, a significant reduction of the solution time can be achieved (see Fig. 10). The domain and frequency decomposition parallelization strategies described above are performed independently of each other, in the sense that there is no need for communication between the processes treating different frequency intervals $m$ during the formal solution. The radiation field tensors $J^K_Q$ and the $\Lambda_{\Omega\nu}$ operator needed for the solution of the statistical equilibrium equations are only partially integrated over the line absorption profiles during the formal solution and, at the end of the whole formal solution process, are summed over the frequency intervals and synchronized among them. The time cost of this operation is negligible with respect to the time demands of the formal solution. Thanks to this orthogonality of the two independent parallelizations, it is possible to achieve a multiplicative effect of both speedup factors.

⁴ At least in the favorable situation in which $N_m$ is an integer number. In general, it is convenient that the individual frequency intervals have similar lengths.

Fig. 11. Speedup $S(M)$ of the formal solution of the RT equation due to parallelization in radiation frequencies in a single spatial domain ($L = 1$).

4.3.1. Scaling of the algorithm

A reduction in the number of frequencies in every domain by a factor of $1/M$ gives a new solution time $T_{ML}$ which is obtained
