Statistical Experimental Design for Quantitative Atomic Resolution Transmission Electron Microscopy

ADVANCES IN IMAGING AND ELECTRON PHYSICS, VOL. 130

S. VAN AERT,1 A. J. DEN DEKKER,2 A. VAN DEN BOS,3 AND D. VAN DYCK1

1 Department of Physics, University of Antwerp, Groenenborgerlaan 171, 2020 Antwerp, Belgium
2 Delft Center for Systems and Control, Delft University of Technology, Mekelweg 2, 2628 CJ Delft, The Netherlands
3 Faculty of Applied Sciences, Delft University of Technology, Lorentzweg 1, 2628 CJ Delft, The Netherlands

Copyright 2004, Elsevier Inc. All rights reserved. ISSN /04

I. Introduction
   A. Qualitative Atomic Resolution Transmission Electron Microscopy
   B. Quantitative Atomic Resolution Transmission Electron Microscopy
   C. Statistical Experimental Design
II. Basic Principles of Statistical Experimental Design
   A. Introduction
   B. Parametric Statistical Models of Observations
   C. Attainable Precision
      1. The Cramér-Rao Lower Bound
      2. Precision Based Optimality Criteria
   D. Maximum Likelihood Estimation
   E. Conclusions
III. Statistical Experimental Design of Atomic Resolution Transmission Electron Microscopy Using Simplified Models
   A. Introduction
   B. Parametric Statistical Models of Observations
      1. One-Dimensional Observations
      2. Two-Dimensional Observations
      3. Three-Dimensional Observations
   C. Approximations of the Cramér-Rao Lower Bound
      1. One-Dimensional Observations
      2. Two-Dimensional Observations
      3. Three-Dimensional Observations
   D. Discussions and Examples
      1. Two-Dimensional Observations
      2. Three-Dimensional Observations
   E. Conclusions
IV. Optimal Statistical Experimental Design of Conventional Transmission Electron Microscopy
   A. Introduction
   B. Parametric Statistical Model of Observations
      1. The Exit Wave
      2. The Image Wave
      3. The Image Intensity Distribution
      4. The Image Recording
      5. The Incorporation of a Monochromator
   C. Statistical Experimental Design
      1. Microscope Settings
      2. Numerical Results
      3. Interpretation of the Results
   D. Conclusions
V. Optimal Statistical Experimental Design of Scanning Transmission Electron Microscopy
   A. Introduction
   B. Parametric Statistical Model of Observations
      1. The Exit Wave
      2. The Image Intensity Distribution
      3. The Image Recording
   C. Statistical Experimental Design
      1. Microscope Parameters
      2. Numerical Results
      3. Interpretation of the Results
   D. Conclusions
VI. Discussion and Conclusions
Appendix A
Appendix B
Appendix C
References

I. Introduction

In materials science, the last decades are characterized by an evolution from macro- to micro- and, more recently, to nanotechnology. In nanotechnology, nanomaterials play an important role. Examples of nanomaterials are nanoparticles, nanotubes, and layered magnetic and superconducting materials (Nalwa, 2002; van Tendeloo et al., 2000). The interesting properties of these materials are related to their structure. Therefore, one of the central issues in materials science is to understand the relations between the properties of a given material on the one hand and its structure on the other hand. A complete understanding of this relation, combined with recent progress in building nanomaterials atom by atom, will enable materials science to evolve into materials design, that is, from describing and understanding towards predicting materials with interesting properties (Browning et al., 2001; Olson, 1997, 2000; Reed and Tour, 2000; Wada,

). In order to understand the properties-structure relation, experimental and theoretical studies are needed (Muller and Mills, 1999; Spence, 1999; Springborg, 2000). Essentially, theoretical studies allow one to calculate the properties of materials with known structure, whereas experimental studies allow one to characterize materials in terms of structure. In practice, however, the combination of both approaches is not yet feasible. One of the reasons is that present experimental characterization methods generally cannot determine atom positions locally with sub-ångstrom precision (Olson, 2000). A precision of the order of 0.01 to 0.1 Å is needed (Muller, 1998, 1999; Kisielowski, Principe, Freitag, and Hubert, 2001).

Various experimental characterization methods exist. However, scanning probe techniques, such as scanning tunnelling and atomic force microscopy, restrict investigations to surface or near-surface regions (Wiesendanger, 1994). Hence, they cannot provide subsurface information. Classical X-ray and neutron diffraction techniques, on the other hand, only provide averaged, instead of local, structure information (Zanchet and Ugarte, 2000). Therefore, they may only be applied successfully to periodic materials, such as crystals, whereas nanomaterials are usually aperiodic. Only atomic resolution transmission electron microscopy (TEM) techniques seem to be appropriate to provide local information down to the atomic scale, since electrons interact sufficiently strongly with materials (Fujita and Sumida, 1994; Spence, 1999). Another advantage of electrons is that they are charged and can therefore give information about the ionization state of atoms. They can also be deflected by lenses, yielding information both in real and Fourier space. Furthermore, as compared to X-rays or neutrons, electrons would even provide more structure information for a given amount of radiation damage (Henderson, 1995).
Figure 1 presents a compact scheme of the collection of electron microscopical observations by means of atomic resolution TEM. The observations are two-dimensional projected images of three-dimensional objects. Obviously, only the position of projected atoms or atom columns may be obtained from a single image. Quantitative atomic resolution TEM allows materials scientists to measure structure parameters, including the positions of projected atoms or atom columns, from the obtained observations. The observations fluctuate about their expectations. The physical model describing these expectations, the expectation model, contains the structure parameters to be measured. Quantitative atomic resolution TEM makes use of such a model combined with statistical parameter estimation techniques in order to measure, or more specifically, to estimate, the positions of projected atoms or atom columns. Subsequently, the positions of the atoms in three-dimensional space may be derived from combining the measurements of a set of projected images. Therefore,

quantitative atomic resolution TEM is probably the most appropriate technique for very precise measurement of atom positions. The precision of the projected atom position or atom column position estimates is limited by the presence of the fluctuations in the observations. It depends on the microscope settings, such as the defocus and the aperture. In the literature, a particular choice of such settings is referred to as experimental design (Fedorov, 1972). The purpose of this article is to optimize the experimental design in terms of the attainable precision, under relevant physical constraints. These constraints are either the radiation sensitivity of the object or the specimen drift. Therefore, either the incident electron dose per square ångstrom (that is, the number of electrons per square ångstrom that interact with the object during the experiment) or the recording time has to be kept subcritical. Of crucial importance in the optimization procedure is that the attainable precision can be adequately quantified (van den Bos, 1982; van den Bos and den Dekker, 2001). It is used as optimality criterion for quantitative evaluation of the effect of microscope settings on the precision. This evaluation procedure, which is called statistical experimental design, allows electron microscopists to derive the optimal statistical experimental design, that is, the experimental design resulting in the highest attainable precision.

Figure 1. Scheme of an atomic resolution TEM experiment. The observations are two-dimensional projected images of three-dimensional objects. The structure parameters of these objects are unknown. Quantitative atomic resolution TEM allows one to estimate these parameters from the observations. The precision of the estimates depends on the microscope settings. The optimal microscope settings result in the highest attainable precision.

Strictly speaking, optimal statistical experimental design refers to the optimization of tunable settings,

such as the defocus, and not to fixed settings, such as the spherical aberration constant, which, in the absence of a so-called spherical aberration corrector, is a fixed property of the microscope. According to van den Bos (2002), the optimization of fixed settings by means of new instrumental developments could be called optimal instrumental design. However, since the optimization procedure is the same, this distinction in terminology will not be made in the remainder of this article.

The application of statistical experimental design to quantitative atomic resolution TEM is, in the authors' opinion, novel. In these considerations, subjective qualities of the electron microscope as an imaging instrument are no longer important. In a sense, it does not matter whether the produced images are good-looking or not. The electron microscope is considered to be a measuring instrument (van Aert, den Dekker, van den Bos, and van Dyck, 2002; van den Bos, 2002a). This means that the structure parameters, the projected atom positions or atom column positions in particular, are quantitatively estimated from the electron microscopical observations, instead of visually determined. For many years, it has been standard practice to interpret images visually or to compare images visually with computer simulations in order to determine the structure of an object. This will be called qualitative atomic resolution TEM. The optimality criteria used to evaluate the accompanying microscope designs are based on classical resolution criteria, such as Rayleigh's. However, these criteria are not suitable for quantitative atomic resolution TEM. Instead, the attainable statistical precision is the criterion of importance.

A. Qualitative Atomic Resolution Transmission Electron Microscopy

Until recently, qualitative atomic resolution TEM was hampered by insufficient resolution of the electron microscope.
A resolution of 1 Å would be required to visualize the individual projected atom columns of materials with columnar structures, such as perfect crystals or crystals containing defects in the structure, viewed along a main zone axis. Over the years, different methods have been developed to obtain 1 Å resolution. Examples of such methods are:

• High-voltage electron microscopy
• Correction of the spherical aberration in the electron microscope
• High-angle annular dark-field scanning transmission electron microscopy
• Focal-series reconstruction
• Off-axis holography
• Correction of the chromatic aberration in the electron microscope

These methods improve the interpretability of the experimental images in terms of the structure. The first three do not require image processing techniques, whereas the last three do.

In high-voltage electron microscopy, the accelerating voltage of the electron microscope is increased up to 1 MV and beyond (Phillipp et al., 1994). It is used in conventional transmission electron microscopy (CTEM). In this mode, the object is illuminated with a parallel incident electron beam. If the object is thin, the directly interpretable resolution for CTEM is given by the so-called point resolution (O'Keefe, 1992; Spence, 1988). In high-voltage electron microscopy, it is about equal to 1 Å. For comparison, in intermediate voltage electron microscopy, operating at an accelerating voltage of about 300 kV, it is 2 Å. However, the disadvantage of high-voltage electron microscopy is the increase of the displacement damage to the object, that is, the displacement of the atoms in the object from their initial positions (Spence, 1999; Williams and Carter, 1996).

Spherical aberration in the electron microscope is a lens defect that, like other aberrations, causes a point object to be imaged as a disk of finite size. By using multipole lenses, Rose (1990) has developed a corrector which cancels out spherical aberration. Correction of the spherical aberration is applied to both CTEM (Haider et al., 1998) and scanning transmission electron microscopy (STEM) (Batson, Dellby, and Krivanek, 2002). In the STEM mode, an electron probe is formed, which scans in a raster over the object. At present, one of the main difficulties of the spherical aberration corrector is the complicated procedure for the alignment of the large number of electrostatic and magnetic optical elements (Spence, 1999).

In high-angle annular dark-field scanning transmission electron microscopy (HAADF STEM), one of the STEM variants, mainly inelastically scattered electrons are detected. The elastically scattered electrons are eliminated from detection. Here, the directly interpretable resolution is enhanced (Nellist and Pennycook, 2000), although at the expense of a significant loss of imaging electrons.

The last three possibilities, focal-series reconstruction, off-axis holography, and correction of the chromatic aberration in the electron microscope, are used in CTEM mode. In CTEM, one has, apart from the point resolution, another resolution measure, the so-called information limit. The information limit represents the smallest detail that can be resolved by using image processing techniques. It is inversely proportional to the highest spatial frequency that is still transferred with appreciable intensity from the exit plane of the object to the image plane (de Jong and van Dyck, 1993; O'Keefe, 1992). In intermediate voltage electron microscopy, the information limit is usually smaller than the point resolution.

Focal-series reconstruction and off-axis holography push the directly interpretable resolution down to the information limit. This is done by retrieving the exit wave, that is, the complex electron wave function at the exit plane of the object. Ideally, the exit wave is free from any imaging artifacts, which means that the visual interpretability of the reconstruction is enhanced considerably for thin objects when compared to the original experimental images. Today, the information limit of CTEM is slightly below 1 Å for electron microscopes equipped with a field emission gun as electron source (Spence, 1999; Kisielowski, Hetherington, Wang, Kilaas, O'Keefe, and Thust, 2001; O'Keefe et al., 2001). The focal-series reconstruction method reconstructs the exit wave from a series of images collected at different defocus values (Coene et al., 1996; Kirkland, 1984; Saxton, 1978; Schiske, 1973; Thust, Coene, Op de Beeck, and van Dyck, 1996; Thust, Overwijk, Coene, and Lentzen, 1996; van Dyck and Coene, 1987; van Dyck, Op de Beeck, and Coene, 1993). Off-axis holography (Lichte, 1991) is based on the original idea of Gabor (1948), where the exit wave is retrieved from the interference between the object wave and a reference wave.

The dominant factor governing the information limit is generally the chromatic aberration. It results from a spread in defocus values, arising from fluctuations in accelerating voltage, lens current, and thermal energy of the electron. By use of a chromatic aberration corrector (Reimer, 1984; Weißbäcker and Rose, 2001, 2002) or a monochromator (Mook and Kruit, 1999), the effect of the chromatic aberration is reduced and hence the information limit is improved. The chromatic aberration corrector is at the conceptual stage, while the monochromator is already used in practice. However, with a monochromator the improvement of the information limit comes at the expense of a loss of incident electron dose.
The qualitative methods presented in this section nowadays result in a resolution of 1 Å. Other methods to obtain this resolution exist as well, but they will not be treated in this article.

B. Quantitative Atomic Resolution Transmission Electron Microscopy

One ångstrom resolution is convenient for atomic resolution, but insufficient for materials science of the future, which will require precision rather than resolution (Cahn, 2001). One is inclined to think that a precision of 0.01 Å requires a resolution of 0.01 Å, which is far beyond the present possibilities. However, resolution and precision are quite different things. On the one hand, resolution expresses the ability to visualize separately adjacent atom columns in an image. On the other hand, precision corresponds to the

variance, or to its square root, the standard deviation, with which structure parameters can be estimated. In this study, the most important parameters are the projected atom column positions, since nanomaterials are usually crystals containing defects in their columnar structure.

In order to attain 0.01 to 0.1 Å precision, quantitative atomic resolution TEM is needed. Its goal is to estimate structure parameters of an object as precisely as possible from the observations. Estimation of the structure parameters requires an expectation model of the observations. In quantitative atomic resolution TEM, the expectation model represents the expected number of electron counts detected, for example, with a charge-coupled device (CCD) camera. It describes, for instance, the expected number of electrons per pixel in the two-dimensional projected image of Figure 1. The expectation model is given by a function that describes the electron-object interaction, the transfer in the microscope, and the image detection. Nowadays, these processes are sufficiently well understood to make the derivation of an expectation model possible, and several commercial software packages for atomic resolution TEM image simulations are available (Kilaas and Gronsky, 1983; Stadelmann, 1987).

The parameters of the expectation model are structure parameters as well as microscope settings, characterizing the object under study and the microscope, respectively. In the derivation of this model, the object is described by the assembly of electrostatic potentials of the constituting atoms. Since the electrostatic potential is known for each atom type, the structure parameters reduce to atom numbers, atom positions, object thickness, orientation of the object with respect to the incident electron beam, and the Debye-Waller factor, which accounts for vibrations of the atoms at a given temperature (Wang, 2001).

Then, the exit wave, resulting from the electron-object interaction, can be derived. An all-embracing solution for this exit wave has not yet been found. Different routes to achieve this goal are currently being investigated. Proposed solutions are given by, for example, the weak phase object (Buseck, Cowley and Eyring, 1988), the multislice (Cowley and Moodie, 1957), and the Bloch wave theory (Hirsch et al., 1965; Howie, 1970; Kambe, Lehmpfuhl, and Fujimoto, 1974). A remarkable solution is given by the channelling theory (Geuens and van Dyck, 2002; Howie, 1966; Op de Beeck and van Dyck, 1996; Pennycook and Jesson, 1991; Sinkler and Marks, 1999; van Dyck and Chen, 1999a; van Dyck et al., 1989). It requires advanced knowledge of quantum mechanics. The channelling theory proposes a solution for the exit wave that is simple, albeit approximate, and in closed analytical form, so that the projected structure of the object may be obtained from it relatively easily. The theory is applicable if the object is oriented along a main zone axis. In this orientation, the atoms

are superimposed along a column, hence the name atom column. It can then be shown that the electrons are trapped in the positive potential of these columns. Each atom column, in a sense, acts as a channel for the electrons. If the distance between adjacent columns is not too small, a one-to-one correspondence between the exit wave and the object structure is established. From the channelling theory an analytical expression for the exit wave can be derived, which is parametric in the projected atom column positions, the atom numbers of the atoms along a column, the distance between successive atoms along a column, and the Debye-Waller factor (van Dyck and Chen, 1999a). As already mentioned, one may expect to obtain projected information only. Ambiguity about the types and distance of atoms along a column may only be removed by combining information from different zone axis orientations (van Dyck and Chen, 1999b).

Furthermore, the transfer in the microscope and the image detection, which are also described by the expectation model, are characterized by a collection of microscope settings, such as the defocus value, the spherical aberration constant of the objective lens, the accelerating voltage, and the pixel size of the camera.

The structure parameters or microscope settings of the expectation model are either known beforehand with sufficient accuracy and precision or not, in which case they have to be estimated from the experiment by means of statistical parameter estimation techniques. This is done by adapting the expectation model to the experimentally obtained observations with respect to the unknown parameters using a criterion of goodness of fit, such as the least squares sum or the likelihood function (Saxton, 1997). The set of parameters for which this criterion is optimum corresponds to the estimates.
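The fitting step described above can be sketched in a few lines. The following is only an illustration under assumptions that are not taken from the text: a one-dimensional Gaussian peak as expectation model, independent Poisson-distributed pixel counts, a single unknown parameter (the peak position `beta`), and a golden-section search as optimizer. All names and numerical values are hypothetical. With one parameter the criterion has a single optimum on the search interval, so the local-optimum problem of high-dimensional fits does not show up here.

```python
import math
import random

def expectation(x, beta, n_counts=2000.0, sigma=0.5, pixel=0.1):
    """Expected counts in a pixel centred at x for a Gaussian peak at beta.
    n_counts, sigma and pixel are assumed known (hypothetical values, in angstrom)."""
    norm = n_counts * pixel / (math.sqrt(2.0 * math.pi) * sigma)
    return norm * math.exp(-0.5 * ((x - beta) / sigma) ** 2)

def poisson_sample(lam, rng):
    """Draw one Poisson variate with mean lam (Knuth's multiplication method)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def neg_log_likelihood(beta, xs, counts):
    """Poisson negative log-likelihood, omitting terms that do not depend on beta."""
    total = 0.0
    for x, w in zip(xs, counts):
        lam = expectation(x, beta)
        total += lam - w * math.log(lam)
    return total

def ml_estimate(xs, counts, lo=-1.0, hi=1.0, tol=1e-6):
    """Golden-section search for the beta in [lo, hi] minimizing the criterion."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if neg_log_likelihood(c, xs, counts) < neg_log_likelihood(d, xs, counts):
            b = d
        else:
            a = c
    return 0.5 * (a + b)

# Simulate one noisy "image" (a row of 60 pixels) and estimate the position.
rng = random.Random(0)
true_beta = 0.2
xs = [-3.0 + 0.1 * (i + 0.5) for i in range(60)]   # pixel centres
counts = [poisson_sample(expectation(x, true_beta), rng) for x in xs]
beta_hat = ml_estimate(xs, counts)
centroid = sum(w * x for x, w in zip(xs, counts)) / sum(counts)
print(f"true position {true_beta:.3f}, ML estimate {beta_hat:.3f}, centroid {centroid:.3f}")
```

For this particular model the likelihood criterion is very nearly quadratic in `beta`, which is why the estimate practically coincides with the centroid of the counts. Physics-based expectation models have no such shortcut and need the iterative search.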
In a sense, in quantitative atomic resolution TEM, one is looking for the optimal value of a criterion in a parameter space whose dimension is equal to the number of parameters to be estimated. This search for the global optimum of the criterion of goodness of fit is an iterative numerical optimization method (Möbus et al., 1997). An overview of such methods can be found in Murray (1972) and van den Bos (1982). Generally, the dimension of the parameter space is high. Consequently, it is quite possible that the optimization procedure ends up at a local optimum instead of at the global optimum of the criterion of goodness of fit, so that the wrong structure is suggested. To solve this dimensionality problem, that is, to find a pathway to the global optimum in the parameter space, a good starting structure is required (van Dyck et al., 2003). Finding such a starting structure is not trivial, since due to two scrambling processes, details in the images do not necessarily correspond to features in the atomic structure. The first scrambling process is the dynamic scattering of the electrons on their way through the object. The second scrambling process is the transfer

in the electron microscope. Imaging lenses are not perfect, but have aberrations, such as spherical and chromatic aberration. As a consequence, the structure information of the object may be strongly delocalized. Additionally, the images are always disturbed by noise, that is, fluctuations in the observations, which further complicates direct interpretation. However, it has been shown that good starting structures can be found by using the qualitative methods described before. For example, focal-series reconstruction methods in a sense invert, or equivalently, undo, the effect of lens aberrations. Consequently, the thus obtained exit wave is much more related to the object structure, providing a directly interpretable resolution close to the information limit, which just surpasses the limit beyond which individual atom columns can be discriminated (Kisielowski, Hetherington, Wang, Kilaas, O'Keefe and Thust, 2001; Thust and Jia, 2000; Zandbergen and van Dyck, 2000). Focal-series reconstruction methods thus yield an approximate structure that may be used as a starting point in a final numerical optimization procedure by adapting the expectation model to the original observations.

The starting structure obtained may still be insufficiently close to the global optimum of the criterion of goodness of fit to guarantee convergence. In order to find a better starting structure, one also has to undo the first scrambling process mentioned, that is, the dynamic scattering of the electrons on their way through the object. Undoing the dynamic scattering is possible by means of the channelling theory. Adapting the analytical expression for the exit wave to the reconstructed exit wave with respect to the structure parameters provides the experimenter with an approximate structure that can then be used as an improved starting point for a final numerical optimization procedure by adapting the expectation model to the original images.

C. Statistical Experimental Design

As mentioned before, with the resolution becoming sufficient to discriminate individual atom columns, a structure is characterized completely by the atom column positions, the atom numbers, the distance between successive atoms along a column, the object thickness, and orientation. Then, quantitative structure determination by means of quantitative atomic resolution TEM is a statistical parameter estimation problem, the image pixel values being the observations from which the parameters of interest have to be estimated. The precision with which these parameters can be estimated is only limited by the presence of noise. In this article, it will be shown that estimation of the unknown parameters may result in higher precisions if it is accompanied by statistical experimental design.

The procedure to derive the optimal statistical experimental design is as follows. Due to the inevitable presence of noise, the observations will always fluctuate randomly and are therefore modelled as stochastic variables. By definition, a stochastic variable is characterized by its probability density function, while a set of stochastic variables has a joint probability density function. The joint probability density function defines the expectations, that is, the mean value of each observation, as well as the fluctuations of the observations about these expectations. The expectations are described by the expectation model, which is parametric in the quantities to be estimated. Given the joint probability density function, use of the concept of Fisher information allows one to determine the attainable precision, that is, the lowest possible variance, with which a parameter can be estimated unbiasedly from a set of observations assumed to obey a certain distribution (Frieden, 1998; van den Bos, 1982; van den Bos and den Dekker, 2001). Thus, it is possible to derive an expression for the lower bound on the variance with which the atom column positions can be estimated from a quantitative atomic resolution TEM experiment. This lower bound, which is called the Cramér-Rao Lower Bound (CRLB), is independent of the estimation method used, and therefore represents the intrinsic limit on precision. Moreover, it is a function of the microscope settings. This means that the CRLB varies with the microscope settings, of which at least some are adjustable. The optimal statistical experimental design of an atomic resolution TEM experiment is then given by the microscope settings that correspond to the lowest CRLB (van Aert, den Dekker, van den Bos and van Dyck, 2002b; van Dyck et al., 2002).
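For reference, the quantities named in this paragraph can be written compactly. The notation below is an assumption made for this sketch and is not taken from the text above; the last line is the special case of independent, Poisson-distributed counts, which is one common noise model for electron counting and is assumed here only for illustration.

```latex
% Assumed notation (hypothetical, for illustration only):
%   w = (w_1, ..., w_K)^T   observations, e.g. pixel counts
%   \theta                   vector of unknown parameters
%   P(w;\theta)              joint probability (density) function of the observations
%   \lambda_k(\theta)        expectation model for observation k
\begin{align}
F(\theta) &= -\,\mathbb{E}\!\left[
  \frac{\partial^{2} \ln P(w;\theta)}{\partial\theta\,\partial\theta^{T}}
\right]
&& \text{(Fisher information matrix)} \\
\operatorname{cov}\bigl(\hat{\theta}\bigr) &\succeq F^{-1}(\theta)
&& \text{(CRLB, for any unbiased estimator } \hat{\theta}\text{)} \\
F_{rs}(\theta) &= \sum_{k=1}^{K} \frac{1}{\lambda_k(\theta)}\,
  \frac{\partial \lambda_k(\theta)}{\partial \theta_r}\,
  \frac{\partial \lambda_k(\theta)}{\partial \theta_s}
&& \text{(independent Poisson counts)}
\end{align}
```

Because the expectation model depends on the microscope settings as well as on the parameters, so do the Fisher information matrix and its inverse, which is what makes the bound usable as a design criterion.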
The optimal design is found by minimizing the CRLB with respect to the microscope settings, under the existing physical constraints, which are the radiation sensitivity of the object or the specimen drift. Notice that the optimal statistical experimental design may be different for different objects under investigation.

In this article, the use of statistical experimental design for quantitative atomic resolution TEM is demonstrated. To begin with, it is applied to CTEM, STEM, and electron tomography experiments, all described in a simplified way (van Aert, den Dekker, van Dyck and van den Bos, 2002a). The attainable precision with which position and distance parameters of one or two components can be estimated has been investigated. For CTEM and STEM, the components are two-dimensional and the observations are counting events in a two-dimensional pixel array, whereas for electron tomography, the components are three-dimensional and the observations are counting events in a set of two-dimensional pixel arrays, which is obtained by rotating these components about a rotation axis. The expectation models of the observations are assumed to be Gaussian peaks, although they are of a higher complexity in practice. Under this assumption,

the CRLB on the variance with which position and distance parameters of one or two components can be estimated, which is usually calculated numerically, may be given in closed analytical form. Although a simplified model has been used for the derivation of these expressions, they are very useful as rules of thumb to give insight into statistical experimental design for quantitative atomic resolution TEM. The rules of thumb clearly show the dependence of the attainable precision on the width of the point spread function, the width of the components, and the number of detected counts. For electron tomography, the attainable precision also depends on the orientation of the components with respect to the rotation axis. Generally, the precision improves by increasing the number of detected counts or by narrowing the point spread function. However, below a certain width of the point spread function, the precision is limited by the intrinsic width of the components. Then, further narrowing of the point spread function is useless. This result is meaningful in practice. For example, in STEM experiments, further narrowing of the probe, which represents the point spread function, is not so beneficial in terms of precision since the width of the probe is currently almost equal to the width of an atom (Krivanek, Dellby, and Nellist, 2002). Moreover, as in STEM, if a narrower probe may be accompanied by a decrease of the number of detected electrons, both effects have to be weighed against each other under the existing physical constraints.

The optimal statistical experimental designs of CTEM and STEM experiments, assuming more complicated, physics based expectation models, instead of Gaussian peaks, are derived as well. These results are derived from the numerical minimization of the CRLB with respect to the microscope settings.
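The saturation effect described above can be reproduced numerically with a small sketch. It rests on assumptions that are not taken from the text: a one-dimensional Gaussian component of width `sigma_comp` blurred by a Gaussian point spread function of width `sigma_psf` (so the peak has variance `sigma_psf**2 + sigma_comp**2`), independent Poisson counts, and position as the only unknown. Under these assumptions the bound on the position variance is approximately the peak variance divided by the number of counts. This is offered as a plausible form of such a rule of thumb, not as the expression derived in the article; all function names and values are hypothetical.

```python
import math

def crlb_position(n_counts, sigma_psf, sigma_comp, pixel=0.02, half_width=8.0):
    """Numerical CRLB (a variance) on the position of a 1D Gaussian peak.

    Assumes independent Poisson counts per pixel and a peak centred at 0;
    lengths are in angstrom. The Fisher information for the position is
    sum_k (d lam_k / d beta)^2 / lam_k.
    """
    s2 = sigma_psf ** 2 + sigma_comp ** 2          # variance of the blurred peak
    s = math.sqrt(s2)
    n_pix = int(round(2.0 * half_width * s / pixel))
    fisher = 0.0
    for k in range(n_pix):
        x = -half_width * s + (k + 0.5) * pixel    # pixel centre
        lam = (n_counts * pixel / (math.sqrt(2.0 * math.pi) * s)
               * math.exp(-0.5 * x * x / s2))      # expected counts in this pixel
        # d lam / d beta = lam * x / s2, so (d lam / d beta)^2 / lam = lam * (x / s2)^2
        fisher += lam * (x / s2) ** 2
    return 1.0 / fisher

n_counts = 1000.0      # hypothetical number of detected counts
sigma_comp = 0.3       # hypothetical intrinsic width of the component
psf_widths = [0.6, 0.3, 0.1, 0.03]

stds = [math.sqrt(crlb_position(n_counts, w, sigma_comp)) for w in psf_widths]
for w, sd in zip(psf_widths, stds):
    rule = math.sqrt((w ** 2 + sigma_comp ** 2) / n_counts)
    print(f"PSF width {w:4.2f} A: numerical {sd:.5f} A, rule of thumb {rule:.5f} A")

floor = sigma_comp / math.sqrt(n_counts)
print(f"floor set by the component width: {floor:.5f} A")
```

Narrowing the point spread function from 0.6 to 0.3 Å gives a clear gain in this toy setting, whereas going from 0.1 to 0.03 Å barely changes the bound, because the component width then dominates the peak width. Increasing `n_counts` improves the bound in every case.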
The numerically obtained results are interpreted intuitively using the rules of thumb for the CRLB, which are derived from Gaussian peaked expectation models. First, for CTEM operating at an intermediate accelerating voltage of about 300 kV, it is shown that a spherical or a chromatic aberration corrector may improve the attainable precision. However, the gain, which depends on the object under study, usually turns out to be disappointing. Furthermore, a monochromator usually does not improve the attainable precision if the experiment is limited by specimen drift (den Dekker et al., 2001), whereas it may slightly improve the precision if the experiment is limited by the radiation sensitivity of the object. For CTEM operating at a low accelerating voltage of about 50 kV, the attainable precision improves substantially by using both a spherical aberration corrector and either a chromatic aberration corrector or a monochromator. Next, for STEM, it is shown that the optimal probe is not the narrowest possible and that its optimal width strongly depends on the object under study. Moreover, an

annular detector usually results in a higher attainable precision than an axial one. Furthermore, as for CTEM, the precision that is gained using a spherical aberration corrector depends on the object under study, but this gain is generally only marginal (den Dekker, van Aert, van Dyck, and van den Bos, 2000; van Aert and van Dyck, 2001; van Aert, den Dekker, van Dyck, and van den Bos, 2000, 2002b). Also, it is shown that for both CTEM and STEM, the reduced brightness of the electron source is preferably as high as possible and the specimen holder as stable as possible, especially if the experiment is limited by specimen drift.

The outline of the article is as follows. Section II introduces the basic principles of statistical experimental design. The attainable precision is proposed as quantitative performance measure. It allows one to evaluate, to optimize, and to compare different experimental settings. In Section III, this process is illustrated for the estimation of position and distance parameters from CTEM, STEM, and electron tomography experiments, which are all described by simplified expectation models. In Section IV, the optimal statistical experimental design of CTEM experiments is derived from more complicated, physics based expectation models. Special attention is paid to the spherical aberration corrector, the chromatic aberration corrector, and the monochromator. In Section V, the optimal statistical experimental design of STEM experiments is discussed. In particular, the optimal probe and detector configuration are determined. In Section VI, conclusions are drawn.

II. Basic Principles of Statistical Experimental Design

A. Introduction

In this section, the basic principles of statistical experimental design will be introduced. These principles may be applied to set up experiments in many branches of science, from elementary particle physics to astronomy.
In these experiments, the measurement of any unknown parameter, such as the position of a star, the concentration of chemical elements, or the decay constant in a radioactive decay process, always takes place in the presence of fluctuations in the observations. As a result of these fluctuations, the precision with which the parameters can be measured is limited. The purpose of statistical experimental design is to derive the experimental design, that is, a particular choice of experimental settings, resulting in the highest precision. This so-called optimal statistical experimental design can be derived by applying the apparatus of mathematical statistics straightforwardly. Hence, statistical experimental design is a powerful method, which can replace the conventional methods that were, or still are, used to optimize the experimental design. These conventional methods are based on the intuition of the experimenter. However, intuition might be very misleading, especially in combination with the increasing complexity of today's experiments. Instead, statistical experimental design is needed. In the remainder of this article, it will be used to optimize the experimental design of quantitative atomic resolution TEM experiments in terms of the precision with which the atom positions can be estimated.

To begin with, a simple definition of an experiment must be given. In principle, it can be defined as the way of collecting and analyzing a set of observations for a given purpose. From a statistician's point of view, this purpose is to measure unknown parameters as precisely as possible. This allows the experimenter to draw reliable conclusions from his or her experiment. The vital importance of precise measurement as a path to understanding was already recognized in 1883 by William Thomson, Lord Kelvin, the famous Scottish physicist. One of his much-quoted utterances in a lecture to civil engineers in London is the following one (Cahn, 2001):

"I often say that when you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely, in your own thoughts, advanced to the state of science." (William Thomson, Lord Kelvin)

In order to be able to measure unknown parameters as precisely as possible, the analysis of the observations is based on the use of parameter estimation techniques. In other words, an estimator, which estimates the parameters from the observations, is chosen.
Since an estimator is a function of the observations, the precision of the chosen estimator depends on the way the observations are collected. Often, these observations may be collected under a large variety of experimental designs. Given the purpose of an experiment, the optimal experimental design is given by the experimental settings resulting in the highest precision of the unknown parameters.

The definition of an experiment may be illustrated by means of an example. A quantitative atomic resolution TEM experiment may be regarded as a set of observations, that is, electron counting results made, for example, with a CCD camera, from which the structure of the object under study, the atom positions in particular, has to be estimated as precisely as possible. In TEM, these observations may be collected by choosing, for example, defocus and aperture, and by choosing between different imaging modes, such as CTEM and STEM. This offers electron microscopists the possibility to choose the electron microscope settings in accordance with the optimal experimental design so as to estimate unknown parameters as precisely as possible.

The optimization of the experimental design consists of different steps. First, a parametric statistical model of the observations has to be chosen. Since the observations fluctuate randomly about their expectations, due to the inevitable presence of noise, they are modelled as stochastic variables. By definition, a stochastic variable is characterized by its probability density function, while a set of stochastic variables has a joint probability density function. The joint probability density function of the observations defines the expectations of the observations as well as the fluctuations of the observations about these expectations. The expectations are described by the expectation model, that is, a physical model containing the parameters to be estimated. For example, in a radioactive decay process, the expectation model is a multi-exponential function, where the parameters are the decay constants. In quantitative atomic resolution TEM, the expectation model represents the expected number of electron counts. It is given by a function that describes the electron-object interaction, the transfer in the microscope, and the image detection. The parameters of the expectation model are, for example, the projected atom or atom column positions, the object thickness, and the atom numbers. Usually, parameters of this kind have a clear physical meaning. Hence, the specification of the parametric statistical model of the observations needs a solid physical base.

Second, the optimality criterion that will be used to optimize the experimental design has to be specified. The choice of this criterion depends on the purpose of the experiment, which is to estimate the unknown parameters of the expectation model as precisely as possible.
Hence, the optimality criterion to be preferred is the precision of the parameter estimates. Therefore, the precision has to be adequately quantified. This can be done using statistical parameter estimation theory. From the parametric statistical model of the observations, the attainable statistical precision can be determined, that is, the lower bound on the variance with which the parameters can be estimated without bias from the observations (van den Bos, 1982; van den Bos and den Dekker, 2001). The meaning of this so-called CRLB is as follows. Generally, one may use different estimators in order to estimate parameters. An estimator is a function of the observations that is used to compute the parameters. Thus, an estimator is, like the observations, a stochastic variable. It is said to be unbiased if its expectation is equal to the true value of the parameter. Stated differently, an unbiased estimator has no systematic error. Moreover, different estimators will have different precisions. The precision of an estimator is represented by its variance or by its standard deviation, which is the square root of the variance. It can be shown that the variance of unbiased estimators will never be lower than the CRLB. There exists a class of estimators, including the maximum likelihood estimator, that achieves the CRLB asymptotically, that is, for an increasing number of observations. The existence of the maximum likelihood estimator justifies the choice of the CRLB as optimality criterion.

The CRLB is a function of the experimental settings. Thus, the lower bound on the variance of each individual unknown parameter of the expectation model could be computed and minimized as a function of the experimental settings. However, simultaneous minimization of the set of lower bounds corresponding to the entire set of unknown parameters is usually impossible. Therefore, statistical parameter estimation theory provides different optimality criteria, which are functions of the set of lower bounds. These are scalar measures, and the experimenter has to choose one of them or has to construct a criterion of his or her own, reflecting his or her specific purpose. For an electron microscopist, a specific purpose might be to measure the atom column positions as precisely as possible, irrespective of the precision of the object thickness or of the atom numbers. Thus, a possible optimality criterion is the sum of the lower bounds on the variance of the position coordinates. Generally, the choice of the optimality criterion requires detailed knowledge from experts in the scientific field.

Finally, the optimality criterion chosen has to be optimized with respect to the experimental settings. This produces the optimal statistical experimental design. Usually, this is a nonlinear optimization problem for which the optimal value of the criterion has to be found numerically. This optimization is subject to the relevant physical constraints. For atomic resolution TEM, these constraints are the radiation sensitivity of the object under study or the specimen drift.
Therefore, the incident electron dose per square ångström or the recording time has to be kept within the constraints.

This concludes the introduction to the basic principles of statistical experimental design. For an extended introduction to statistical experimental design and the different steps encountered in the optimization, the reader is referred to Fedorov (1972) and Pázman (1986).

The section is organized as follows. In Section II.B, parametric statistical models of observations will be discussed. In Section II.C, it will be shown how an adequate expression for the attainable statistical precision of the parameter estimates, that is, the CRLB, can be derived from such a parametric statistical model. The presented optimality criteria are functions of the attainable precisions. In Section II.D, the maximum likelihood estimator of the parameters will be derived from the parametric statistical model of observations. This estimator attains the CRLB asymptotically and, hence, justifies the choice of the optimality criteria. Section II.E consists of conclusions.

B. Parametric Statistical Models of Observations

In this section, parametric statistical models of observations will be introduced. Specifically, they will be used to model electron microscopical observations.

Any experimenter will readily admit that his or her observations contain errors. With a view to statistical experimental design, these errors must be specified. Generally, due to the inevitable presence of noise, sets of observations made under the same conditions nevertheless differ from experiment to experiment. The usual way to describe this behaviour is to model the observations as stochastic variables. The reason is that there is no viable alternative and that it has been found to work (van den Bos, 1999; van den Bos and den Dekker, 2001). By definition, a stochastic variable is characterized by its probability density function, while a set of stochastic variables has a joint probability density function.

Consider a set of stochastic observations $w_m$, $m = 1, \ldots, M$, made at the measurement points $x_1, \ldots, x_M$. These measurement points are assumed to be exactly known. In CTEM, the observations are, for example, electron counting results made at the pixels of a CCD camera, where $M$ represents the total number of pixels. Then, the $M \times 1$ vector $w$ defined as
$$w = (w_1 \ldots w_M)^T \tag{1}$$
is the column vector of these observations. It represents a point in the Euclidean $M$ space having $w_1, \ldots, w_M$ as coordinates. This will be called the space of observations (van den Bos and den Dekker, 2001). The expectations of the observations, that is, the mean values of the observations, are defined by their probability density function. The vector of expectations
$$E[w] = (E[w_1] \ldots E[w_M])^T \tag{2}$$
is also a point in the space of observations and the observations are distributed about this point. The symbol $E[\cdot]$ denotes the expectation operator.
In this article, the expectations of the observations are described by the expectation model, that is, a physical model, which contains the unknown parameters to be estimated, such as the position coordinates of the projected atoms or atom columns. The unknown parameters are represented by the $T \times 1$ parameter vector $\theta = (\theta_1 \ldots \theta_T)^T$. Thus, it is supposed that the expectation of the $m$th observation is described by
$$E[w_m] = f_m(\theta) = f(x_m; \theta), \tag{3}$$
where $f_m(\theta)$ represents the expectation model, which is evaluated at the measurement point $x_m$ and which depends on the parameter vector $\theta$. Apart from the unknown parameters $\theta$, the expectation model contains known parameters and experimental settings as well. For example, in quantitative atomic resolution TEM, the expectation model is sometimes described as
$$f_{kl}(\theta) = \frac{N}{I_{norm}} \left| \psi(r_{kl}; \theta) * t(r_{kl}; \varepsilon, C_s) \right|^2, \tag{4}$$
where $*$ denotes convolution, $N$ represents the total number of detected electrons in an image, the function $\psi(r_{kl}; \theta)$ describes an object consisting of $n_c$ atom columns with $r_{kl} = (x_k \; y_l)^T$ the position of the pixel $(k, l)$ and with the parameter vector $\theta = (\beta_{x1} \ldots \beta_{xn_c} \; \beta_{y1} \ldots \beta_{yn_c})^T$ containing the positions of the atom columns, $t(r; \varepsilon, C_s)$ represents the point spread function of the electron microscope depending on microscope settings such as the spherical aberration constant $C_s$ and the defocus $\varepsilon$, and $I_{norm}$ represents a normalization factor so that the integral of the function $|\psi(r_{kl}; \theta) * t(r_{kl}; \varepsilon, C_s)|^2 / I_{norm}$ is equal to one. Models like Eq. (4) will be derived and explained in detail in the remainder of this article.

Electron microscopical observations are electron counting results detected, for example, with a CCD camera. Under the assumption that the quantum efficiency of this detector is large enough to detect single electrons, these observations are binomially distributed. This means that the probability that the observation $w_m$ is equal to $\omega_m$ is given by (Papoulis, 1965)
$$\binom{N}{\omega_m} p_m^{\omega_m} (1 - p_m)^{N - \omega_m} \tag{5}$$
with $N$ the total number of detected electrons, $p_m$ the probability that a single electron hits the pixel at the position $x_m$, and
$$\binom{N}{\omega_m} = \frac{N!}{\omega_m! \, (N - \omega_m)!}. \tag{6}$$
For large $N$ and $p_m \ll 1$, which is a useful approximation for electron microscopical observations, the binomial distribution tends to a Poisson distribution (Bevington, 1969). Therefore, the probability that the observation $w_m$ is equal to $\omega_m$ is given by (Papoulis, 1965)
$$\frac{\lambda_m^{\omega_m}}{\omega_m!} \exp(-\lambda_m), \tag{7}$$
where the parameter $\lambda_m = N p_m$ is equal to the expectation of the observation $w_m$, which, in its turn, is described by the expectation model given by Eq. (3):
$$E[w_m] = \lambda_m = f_m(\theta). \tag{8}$$
The assumption that the observations are Poisson distributed is usually made in electron microscopy (see, for example, Herrmann, 1997). A property of the Poisson distribution is that the variance of the observation $w_m$ is equal to $\lambda_m$:
$$\mathrm{var}(w_m) = \lambda_m. \tag{9}$$
Moreover, electron microscopical observations may be assumed to be statistically independent. Therefore, the probability $P(\omega; \theta)$ that a set of observations $w = (w_1 \ldots w_M)^T$ is equal to $\omega = (\omega_1 \ldots \omega_M)^T$ is equal to the product of all probabilities described by Eq. (7):
$$P(\omega; \theta) = \prod_{m=1}^{M} \frac{\lambda_m^{\omega_m}}{\omega_m!} \exp(-\lambda_m). \tag{10}$$
This function is called the joint probability density function of the observations. It represents the parametric statistical model of the observations. The parameters $\theta$ to be estimated enter $P(\omega; \theta)$ via $\lambda_m$.

In Section II.C, the parameterized joint probability density function will be used to derive the CRLB, that is, an expression for the attainable precision with which the unknown parameters can be estimated unbiasedly from the observations. The presented optimality criteria, which may be used for the optimization of the experimental design, are functions of the attainable precisions. In Section II.D, the maximum likelihood estimator of the parameters is derived from the joint probability density function. This estimator actually achieves the CRLB asymptotically, that is, for the number of observations going to infinity.

C. Attainable Precision

In this section, it will first be shown how the joint probability density function can be used to determine the attainable precision, that is, the CRLB, which is a lower bound on the variance of any unbiased estimator. The CRLB is independent of any particular method of estimation. Next, optimality criteria, which are functions of the CRLB, are given. The CRLB depends on the experimental settings, that is, on the design.
Hence, functions of the CRLB, such as the optimality criteria, also depend on the experimental settings. This means that they vary with the experimental settings, of which at least some are adjustable. The experimenter has to choose one of these criteria, depending on his or her purpose, and optimize it to find the corresponding optimal design.
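As a numerical aside on the Poisson model of Section II.B, on which the bounds discussed here are built, the following Python sketch compares the binomial probability of Eq. (5) with its Poisson limit of Eq. (7) and evaluates the logarithm of the joint probability of Eq. (10). The function names and the values chosen for N and p_m are illustrative assumptions and do not come from this article.

```python
import math

def log_binomial_pmf(k, n, p):
    """Logarithm of Eq. (5): k counts in one pixel out of n detected electrons."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log1p(-p)

def log_poisson_pmf(k, lam):
    """Logarithm of Eq. (7): k counts for a Poisson expectation lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def log_joint_probability(counts, expectations):
    """Logarithm of Eq. (10): independent Poisson counts, one per pixel."""
    return sum(log_poisson_pmf(k, lam) for k, lam in zip(counts, expectations))

# Hypothetical numbers: N detected electrons, probability p_m of hitting pixel m.
N = 1_000_000
p_m = 2.0e-5
lam_m = N * p_m  # expectation of the observation, Eq. (8)

for k in (10, 20, 30):
    binom = math.exp(log_binomial_pmf(k, N, p_m))
    poiss = math.exp(log_poisson_pmf(k, lam_m))
    print(k, binom, poiss)
```

For large N and small p_m, the two probabilities printed on each line should agree closely, which is the approximation used to pass from Eq. (5) to Eq. (7).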

1. The Cramér-Rao Lower Bound

In this section, the parameterized probability density function of the observations, which is derived in Section II.B, will be used to define the Fisher information matrix and to compute the CRLB on the variance of unbiased estimators of the parameters of the expectation model. The CRLB will also be extended to include unbiased estimators of vectors of functions of these parameters. The reader is referred to Frieden (1998), van den Bos (1982), and van den Bos and den Dekker (2001) for the details of the CRLB.

First, the Fisher information matrix $F$ with respect to the elements of the $T \times 1$ parameter vector $\theta = (\theta_1 \ldots \theta_T)^T$ is introduced. It is defined as the $T \times T$ matrix
$$F = -E\left[ \frac{\partial^2 \ln P(\omega; \theta)}{\partial \theta \, \partial \theta^T} \right], \tag{11}$$
where $P(\omega; \theta)$ is the joint probability density function of the observations $w = (w_1 \ldots w_M)^T$. The expression between square brackets represents the Hessian matrix of $\ln P$, for which the $(r, s)$th element is defined by $\partial^2 \ln P(\omega; \theta) / \partial \theta_r \partial \theta_s$. For electron microscopical observations, where $P(\omega; \theta)$ is given by Eq. (10), it follows from Eqs. (8), (10), and (11) that the $(r, s)$th element of $F$ is equal to:
$$F_{rs} = \sum_{m=1}^{M} \frac{1}{\lambda_m} \frac{\partial \lambda_m}{\partial \theta_r} \frac{\partial \lambda_m}{\partial \theta_s}. \tag{12}$$
Next, it can be shown that the covariance matrix $\mathrm{cov}(\hat{\theta})$ of any unbiased estimator $\hat{\theta}$ of $\theta$ satisfies:
$$\mathrm{cov}(\hat{\theta}) \geq F^{-1}. \tag{13}$$
This inequality expresses that the difference of the matrices $\mathrm{cov}(\hat{\theta})$ and $F^{-1}$ is positive semidefinite. Since the diagonal elements of $\mathrm{cov}(\hat{\theta})$ represent the variances of $\hat{\theta}_1, \ldots, \hat{\theta}_T$ and since the diagonal elements of a positive semidefinite matrix are nonnegative, these variances are larger than or equal to the corresponding diagonal elements of $F^{-1}$:
$$\mathrm{var}(\hat{\theta}_r) \geq [F^{-1}]_{rr}, \tag{14}$$
where $r = 1, \ldots, T$ and $[F^{-1}]_{rr}$ is the $(r, r)$th element of the inverse of the Fisher information matrix. In this sense, $F^{-1}$ represents a lower bound to the variances of all unbiased $\hat{\theta}$. The matrix $F^{-1}$ is called the CRLB on the variance of $\hat{\theta}$.

Finally, the CRLB can be extended to include unbiased estimators of vectors of functions of the parameters instead of the parameters proper. Let $g(\theta) = (g_1(\theta) \ldots g_C(\theta))^T$ be such a vector and let $\hat{g}$ be an unbiased estimator of $g(\theta)$. Then, it can be shown that
$$\mathrm{cov}(\hat{g}) \geq \frac{\partial g}{\partial \theta^T} F^{-1} \frac{\partial g^T}{\partial \theta}, \tag{15}$$
where $\partial g / \partial \theta^T$ is the $C \times T$ Jacobian matrix defined by its $(r, s)$th element $\partial g_r / \partial \theta_s$ (van den Bos, 1982). The right-hand member of this inequality is the CRLB on the variance of $\hat{g}$.

It should be noticed that the CRLB may only be computed if the probability density function of the observations is known. At first sight, this seems to be a problem since the true parameters of the probability density function are unknown. Nevertheless, even if the CRLB is a function of the unknown parameters, it remains an extremely useful tool. For nominal values of the unknown parameters it enables one to quantify variances that might be achieved, to detect possibly strong covariances between parameter estimates and, as will be shown in this article, to optimize the experimental design (van den Bos, 1982). Moreover, the estimates obtained using an estimator that achieves the CRLB may be substituted for the true parameters in the expression for the CRLB so as to get a level of confidence to be attached to these estimates (den Dekker and van Aert, 2002).

In this section, it has been shown how the elements of the Fisher information matrix may be calculated explicitly from the joint probability density function described in Section II.B. From the Fisher information matrix, the CRLB on the variance of the parameters of the expectation model and on the variance of functions of these parameters may be computed from the right-hand members of Eqs. (13) and (15), respectively. The diagonal elements of the CRLB give a lower bound on the variance of any unbiased estimator of the parameters. Since the joint probability density function is a function of the experimental settings, the CRLB is a function of these settings as well.
Therefore, the CRLB may be used to evaluate and to optimize the experimental design in terms of the precision. However, simultaneous minimization of the diagonal elements of the CRLB, that is, the right-hand members of Eq. (14), is usually impossible. Therefore, statistical parameter estimation theory provides different optimality criteria, which are functions of the elements of the CRLB. These are scalar measures. The experimenter may choose one of these provided criteria or may construct a criterion of his or her own, reflecting his or her purpose. A selection of criteria provided in the literature is given in the following section.
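The computation of the Fisher information matrix of Eq. (12) and of the lower bounds of Eq. (14) can be sketched in a few lines of Python. The expectation model used here, two Gaussian peaks of known width on a one-dimensional pixel array, as well as all numerical values and function names, are assumptions chosen for illustration only; they are not the models derived later in this article.

```python
import math

def peak(x, beta, rho):
    """Normalized Gaussian peak of width rho centred at position beta."""
    return math.exp(-0.5 * ((x - beta) / rho) ** 2) / (math.sqrt(2.0 * math.pi) * rho)

def fisher_matrix(xs, betas, rho, n_counts, dx):
    """Eq. (12) for expected counts lam_m = amp * sum_i peak(x_m, beta_i, rho)."""
    amp = n_counts * dx / len(betas)  # counts are shared equally by the peaks
    size = len(betas)
    F = [[0.0] * size for _ in range(size)]
    for x in xs:
        lam = amp * sum(peak(x, b, rho) for b in betas)
        # Analytical derivative of lam with respect to each peak position.
        grad = [amp * peak(x, b, rho) * (x - b) / rho ** 2 for b in betas]
        for r in range(size):
            for s in range(size):
                F[r][s] += grad[r] * grad[s] / lam
    return F

def invert_2x2(m):
    """Inverse of a 2 x 2 matrix, sufficient for two unknown parameters."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical experiment: 201 pixels of size dx, two peaks at -2 and +2.
dx = 0.1
xs = [-10.0 + dx * i for i in range(201)]
rho, n_counts = 1.0, 10000.0
F = fisher_matrix(xs, (-2.0, 2.0), rho, n_counts, dx)
crlb = invert_2x2(F)
print("lower bounds on the variances:", crlb[0][0], crlb[1][1])
print("lower bounds on the standard deviations:",
      math.sqrt(crlb[0][0]), math.sqrt(crlb[1][1]))
```

The diagonal elements of `crlb` are the right-hand members of Eq. (14) for the two position parameters, evaluated at the nominal parameter values assumed above.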

2. Precision Based Optimality Criteria

In this section, optimality criteria that may be used for the evaluation and optimization of the experimental design are discussed. These criteria are functions of the CRLB and depend, like the CRLB, on the experimental settings. Several criteria are found in the literature (Fedorov, 1972; Pázman, 1986). A selection of them is discussed here. A distinction between global and partial, or, equivalently, truncated, optimality criteria is made. Global criteria are used when all parameters, represented by the elements of the parameter vector $\theta$, are important. Partial or truncated criteria are used when only some parameters or some functions of the parameters are important. For atomic resolution TEM, partial criteria are needed if the electron microscopist is only interested in, for example, the positions of atom columns, the positions of light or heavy atom columns, the distance between particular atom columns, or the positions of the atoms of a certain atom type, whereas he or she is not so interested in the object thickness or the atom numbers. Examples of both types of criteria are given below.

a. Global Optimality Criteria.

• A-optimality criterion. The A-optimality criterion is defined by the sum of the diagonal elements of the CRLB, that is, the trace of the CRLB:
$$\mathrm{tr}\, F^{-1}. \tag{16}$$
This criterion may be interpreted under the assumption that there exists an estimator with covariance matrix equal to the CRLB. Then, minimizing the A-optimality criterion corresponds to minimizing the sum of the variances of the estimates $\hat{\theta}_1, \ldots, \hat{\theta}_T$ of the parameters $\theta_1, \ldots, \theta_T$, without taking the correlation between these estimates into account. A geometric interpretation of this criterion may be given by considering the ellipsoid of concentration, which is a measure of the concentration of the distribution of the estimates about the true parameters.
It is defined by the ellipsoid enclosing the true parameters $\theta$ such that a uniform distribution over the area bounded by the ellipsoid will have the same expectation and covariance matrix as the distribution of the estimates (Cramér, 1999; Mood, Graybill, and Boes, 1974). In Figure 2, the square root of the A-optimality criterion, $(\mathrm{tr}\, F^{-1})^{1/2}$, is shown on the ellipsoid of concentration for the special case of two unknown parameters. This figure is based on Fedorov (1972).

• D-optimality criterion. The D-optimality criterion is defined by the determinant of the CRLB:
$$\det F^{-1}. \tag{17}$$
A statistical interpretation of the D-optimality criterion may be given for the hypothetical estimator discussed before. Then, minimizing the D-optimality criterion corresponds to minimizing the volume of the ellipsoid of concentration, which is shown in Figure 2 for the special case of two parameters. The drawback of minimizing the D-optimality criterion is that in some cases the volume of the ellipsoid of concentration is small because it is narrow but long. This means that there is a linear combination of the parameters which is estimated with a very large variance under the corresponding optimal design.

• Minimax criterion in space of parameters. The minimax criterion in the space of parameters is defined by the maximum value of the diagonal elements of the CRLB:
$$\max_r \, [F^{-1}]_{rr}. \tag{18}$$
Minimizing this criterion corresponds to minimizing the largest of the lower bounds on the variances of the parameter estimates. For example, in Figure 2, the square root of the criterion given by Eq. (18) corresponds to $[F^{-1}]_{11}^{1/2}$.

Figure 2. Ellipsoid of concentration for two parameters. The geometric interpretation of the square root of the A-optimality criterion, minimax criterion in space of parameters, and E-optimality criterion is represented by $(\mathrm{tr}\, F^{-1})^{1/2}$, $[F^{-1}]_{11}^{1/2}$, and $1/\lambda_{\min}^{1/2}$, respectively. Minimizing the D-optimality criterion corresponds to minimizing the volume (the area in this example) of the ellipsoid of concentration. This figure is based on Fedorov (1972).
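To make the criteria of Eqs. (16), (17), and (18) concrete, the following Python sketch evaluates them for two competing designs with two unknown parameters. The two CRLB matrices are invented numbers, labelled design X and design Y purely for illustration, and the function names are assumptions as well.

```python
def a_criterion(crlb):
    """Eq. (16): trace of the CRLB."""
    return sum(crlb[r][r] for r in range(len(crlb)))

def d_criterion_2x2(crlb):
    """Eq. (17): determinant of the CRLB (two-parameter case only)."""
    (a, b), (c, d) = crlb
    return a * d - b * c

def minimax_criterion(crlb):
    """Eq. (18): largest diagonal element of the CRLB."""
    return max(crlb[r][r] for r in range(len(crlb)))

# Hypothetical CRLB matrices of two candidate experimental designs.
design_x = [[4.0, 0.0], [0.0, 1.0]]
design_y = [[3.0, 0.0], [0.0, 3.0]]

for name, crlb in (("X", design_x), ("Y", design_y)):
    print(name, a_criterion(crlb), d_criterion_2x2(crlb), minimax_criterion(crlb))
```

In this invented example, the A-optimality and D-optimality criteria are smaller for design X, whereas the minimax criterion is smaller for design Y, so the preferred design depends on the criterion chosen.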

• E-optimality criterion. The E-optimality criterion is defined by the inverse of the minimum eigenvalue $\lambda_{\min}$ of the Fisher information matrix:
$$\frac{1}{\lambda_{\min}}. \tag{19}$$
In Figure 2, the square root of the E-optimality criterion is shown on the ellipsoid of concentration for the special case of two parameters.

• Linear optimality criteria. Linear optimality criteria are defined by criteria functions of the form
$$\mathrm{tr}\, W F^{-1}, \tag{20}$$
where $W$ is a positive definite $T \times T$ matrix. The A-optimality criterion corresponds to the particular case where $W$ is equal to the identity matrix. If $W$ is a diagonal matrix, Eq. (20) is equal to:
$$\sum_{r=1}^{T} W_{rr} \, [F^{-1}]_{rr}, \tag{21}$$
that is, a weighted sum of the variances.

b. Partial or Truncated Optimality Criteria.

In principle, partial or truncated optimality criteria are analogous to global optimality criteria, but instead of the full CRLB, that is, $F^{-1}$, only a submatrix $F_S^{-1}$ of $F^{-1}$ is used. If only $\theta_1, \ldots, \theta_S$ of the entire collection of $T$ unknown parameters $\theta_1, \ldots, \theta_T$ are important, the submatrix to be used is defined as:
$$F_S^{-1} = \begin{pmatrix} [F^{-1}]_{11} & [F^{-1}]_{12} & \cdots & [F^{-1}]_{1S} \\ [F^{-1}]_{21} & [F^{-1}]_{22} & \cdots & [F^{-1}]_{2S} \\ \vdots & \vdots & \ddots & \vdots \\ [F^{-1}]_{S1} & [F^{-1}]_{S2} & \cdots & [F^{-1}]_{SS} \end{pmatrix}. \tag{22}$$
Then, for example, the partial D-optimality criterion is defined by the determinant of $F_S^{-1}$. Moreover, if only some functions of the parameters are important, the inverse of the right-hand member of inequality (15) has to be used.

The optimality criteria presented in this section are functions of the elements of the CRLB. Minimization of these criteria as a function of the experimental settings, under the relevant physical constraints, produces the optimal statistical experimental design. However, different optimality criteria will generally produce different optimal designs. The experimenter has to choose one of them or has to construct a criterion of his or her own, depending on his or her purpose.

D. Maximum Likelihood Estimation

In this section, it is discussed how the maximum likelihood estimator of the parameters may be derived from the parameterized probability density function, which is discussed in Section II.B. This estimator is very important since it achieves the CRLB asymptotically, that is, for the number of observations going to infinity. Thus, it is asymptotically most precise and is therefore often used in practice. A clear discussion of the maximum likelihood estimator is given by van den Bos and den Dekker (2001). A summary is given here.

The maximum likelihood method for estimation of the parameters consists of three steps:

1. The available observations $w = (w_1 \ldots w_M)^T$ are substituted for the corresponding independent variables $\omega = (\omega_1 \ldots \omega_M)^T$ in the probability density function, for example, in Eq. (10). Since the observations are numbers, the resulting expression depends only on the elements of the parameter vector $\theta = (\theta_1 \ldots \theta_T)^T$.
2. The elements of $\theta = (\theta_1 \ldots \theta_T)^T$, which are the hypothetical true parameters, are considered to be variables. To express this, they are replaced by $t = (t_1 \ldots t_T)^T$. The logarithm of the resulting function, $\ln P(w; t)$, is called the log-likelihood function of the parameters $t$ for the observations $w$, which is denoted as $q(w; t)$.
3. The maximum likelihood estimates $\hat{\theta}_{ML}$ of the parameters $\theta$ are defined by the values of the elements of $t$ that maximize $q(w; t)$, or
$$\hat{\theta}_{ML} = \arg\max_t \, q(w; t). \tag{23}$$

The most important properties of the maximum likelihood estimator are the following ones:

• Consistency. Generally, an estimator is said to be consistent if the probability that an estimate deviates more than a specified amount from the true value of the parameter can be made arbitrarily small by increasing the number of observations used.
• Asymptotic normality. If the number of observations increases, the probability density function of a maximum likelihood estimator tends to a normal distribution.

• Asymptotic efficiency. The asymptotic covariance matrix of a maximum likelihood estimator is equal to the CRLB. In this sense, the maximum likelihood estimator is most precise.
• Invariance property. The maximum likelihood estimates $\hat{g}_{ML}$ of a vector of functions of the parameters $\theta$, that is, $g(\theta) = (g_1(\theta) \ldots g_C(\theta))^T$, are equal to $g(\hat{\theta}_{ML}) = (g_1(\hat{\theta}_{ML}) \ldots g_C(\hat{\theta}_{ML}))^T$ (Mood, Graybill, and Boes, 1974).

In the remainder of this article, it will be checked if the maximum likelihood estimator attains the CRLB for atomic resolution TEM experiments. If so, the use of the optimality criteria given in Section II.C.2, which are functions of the elements of the CRLB, is justified.

E. Conclusions

In this section, it has been shown how to evaluate and to optimize the experimental design in terms of the precision with which unknown parameters can be estimated. The optimization consists of different steps, which may be summarized as follows:

1. The parametric statistical model of the observations is derived. This model defines the expectations of the observations as well as the fluctuations of the observations about these expectations. The specification of this model requires a solid physical base.
2. The CRLB, which is a theoretical lower bound on the variance of the parameter estimates, is computed from the parametric statistical model of the observations. This lower bound represents the highest attainable precision. Since the parametric statistical model of the observations is a function of the experimental settings, the CRLB is a function of these settings as well.
3. An optimality criterion is chosen, reflecting the purpose of the experimenter. This criterion is a function of the elements of the CRLB, which, like the CRLB, depends on the experimental settings. Generally, different optimality criteria will produce different optimal experimental designs.
4. The criterion chosen is optimized with respect to the experimental settings. The settings corresponding to the optimum are suggested as the optimal statistical experimental design. This optimization procedure is subject to the physical constraints.

In the remainder of this article, this procedure will be applied to set up quantitative atomic resolution TEM experiments.
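The four steps summarized above can be illustrated with a deliberately small Python sketch. It is a toy problem under stated assumptions, not one of the microscope models of this article: a single Gaussian peak of unknown position is recorded on pixels of unit size with Poisson distributed counts and a constant background, the only adjustable setting is a hypothetical peak width `rho`, the criterion is the CRLB on the position variance, and the optimization is a plain search over a list of candidate widths. All names and numbers are invented for illustration.

```python
import math

def gauss_pdf(u, rho):
    return math.exp(-0.5 * (u / rho) ** 2) / (math.sqrt(2.0 * math.pi) * rho)

def gauss_cdf(u, rho):
    return 0.5 * (1.0 + math.erf(u / (rho * math.sqrt(2.0))))

def crlb_position(rho, beta=0.3, n_counts=1000.0, background=1.0, half_width=10):
    """Steps 1 to 3: Poisson model of pixel-integrated counts, Fisher
    information of Eq. (12) for the single parameter beta, and its inverse."""
    fisher = 0.0
    for k in range(-half_width, half_width + 1):
        lo = k - 0.5 - beta  # left pixel edge relative to the peak position
        hi = k + 0.5 - beta  # right pixel edge relative to the peak position
        lam = n_counts * (gauss_cdf(hi, rho) - gauss_cdf(lo, rho)) + background
        dlam = -n_counts * (gauss_pdf(hi, rho) - gauss_pdf(lo, rho))
        fisher += dlam * dlam / lam
    return 1.0 / fisher

# Step 4: search the hypothetical setting rho (in units of the pixel size).
candidates = [0.05, 0.1, 0.2, 0.4, 0.8, 1.6, 3.2]
best_rho = min(candidates, key=crlb_position)
for rho in candidates:
    print(rho, crlb_position(rho))
print("selected width:", best_rho)
```

With these assumed values, the search selects neither the narrowest nor the widest candidate: a very narrow peak falls almost entirely within one pixel, so the counts hardly change with its position, whereas a very wide peak spreads the same number of counts over a flat profile. The nominal position `beta` has to be supplied because the CRLB depends on the unknown parameters.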

III. Statistical Experimental Design of Atomic Resolution Transmission Electron Microscopy Using Simplified Models

A. Introduction

In this section, the attainable precision with which position and distance parameters of one or two components can be estimated is computed for atomic resolution TEM experiments described by simplified models. In other words, an expression for the CRLB on the variance of position and distance estimates, which has been introduced in Section II, is derived for one-, two-, and three-dimensional components. Such expressions may be used to evaluate and optimize the experimental designs. For one- and two-dimensional components, the observations consist of counting events in a one- and two-dimensional pixel array, respectively. For three-dimensional components, they consist of counting events in a set of two-dimensional pixel arrays, which is obtained by rotating these components about a rotation axis. In principle, these examples may be considered as simulations of a wide variety of experiments. However, in the remainder of this article, the two-dimensional example will be regarded as a simplified simulation of a high-resolution CTEM or STEM experiment, whereas the three-dimensional example will be regarded as a simplified simulation of an electron tomography experiment.

Usually, the performance of such experiments is discussed in terms of two-point resolution, expressing the possibility of perceiving separately the components of a two-point image. One of the earliest and most famous criteria for two-point resolution is that of Rayleigh (1902). Criteria such as Rayleigh's are suitable to set up qualitative atomic resolution TEM experiments.
However, as already mentioned in Section I, a diverent optimality criterion is needed in the framework of quantitative atomic resolution TEM, where one has prior knowledge about the observations in the form of a parametric statistical model, describing the expectations of the observations as well as the fluctuations of the observations about these expectations. Then, an obvious alternative to two-point resolution is the attainable precision with which position or distance parameters can be measured. In this section, the model describing the expectations of the observations, the expectation model, is assumed to consist of Gaussian peaks with unknown position. Under this assumption, it will be shown that the CRLB, which is usually calculated numerically, may be approximated by a simple rule of thumb in closed analytical form. Although the expectation models of images obtained in practice are usually of a higher complexity than a Gaussian peak, the rules of thumb are suitable to give insight into statistical experimental design for quantitative atomic resolution TEM. This will be

shown in the remainder of this article, where more complicated, physics-based expectation models will be considered and where, consequently, the CRLB has to be calculated numerically. In the absence of rules of thumb for the attainable precision, it would be difficult, if not impossible, to understand these numerical results. In the authors' opinion, whenever possible, every numerical analysis should be preceded by a simplified analysis. This will provide a check of the numerical results.

In Section III.B, parametric statistical models of the observations are described. In Section III.C, the approximations of the CRLB, that is, the rules of thumb for the CRLB, are derived from these models. Section III.D consists of discussions and examples. In Section III.E, conclusions are drawn. Part of the results of this section has earlier been published in (van Aert, den Dekker, van Dyck, and van den Bos, 2002a).

B. Parametric Statistical Models of Observations

In this section, the pertinent parametric statistical models of the observations are described. In the remainder of this section, these models will be used for the derivation of expressions for the CRLB with which the position of one component or the distance between two components can be measured. The purpose is to find rules of thumb for the CRLB, that is, expressions that are easy to calculate and to interpret. In order to accomplish this, it will be assumed that the expectation models underlying the observations consist of Gaussian peaks with unknown position and known amplitude and width. In Sections III.B.1, 2, and 3, the expectation model is described for one-, two-, and three-dimensional observations, respectively.

1. One-Dimensional Observations

For one-dimensional observations, the normalized image intensity distribution is assumed to be given by:

$$f(x; \beta) = \frac{1}{n_c} \sum_{n=1}^{n_c} F(x - \beta_{xn}), \tag{24}$$

where $n_c$ is the total number of components, $\beta_{xn}$ is the position of the $n$th component, and

$$F(x) = \frac{1}{\sqrt{2\pi}\,\rho} \exp\left(-\frac{x^2}{2\rho^2}\right) \tag{25}$$

with $\rho$ the width of the Gaussian peak, to which both the width of the component and the two-point resolution of the imaging instrument

contribute. The $n_c$-dimensional parameter vector $\beta$ is equal to $(\beta_1 \ldots \beta_{n_c})^T = (\beta_{x1} \ldots \beta_{xn_c})^T$. Suppose that the observations $w_k$, $k = 1, \ldots, K$ are made at equidistant pixels of size $\Delta x$ at the measurement points $x_k$. If $\Delta x$ is small compared to the width $\rho$ of the Gaussian peak, the probability $p_k(\beta)$ that an electron hits the pixel at the position $x_k$ is approximately given by:

$$p_k(\beta) = p(x_k; \beta) = \int_{x_k - \Delta x/2}^{x_k + \Delta x/2} f(x; \beta)\,dx \approx f(x_k; \beta)\,\Delta x. \tag{26}$$

This means that the number of electrons expected to be found at this pixel is given by:

$$\lambda_k = n_c N_p\, p_k(\beta), \tag{27}$$

where $N_p$ is the total number of electrons in each Gaussian peak. Therefore, Eq. (27) describes the expectation model, which contains the parameters $\beta$.

2. Two-Dimensional Observations

For two-dimensional observations, two distinct expectation models are assumed, corresponding to the so-called dark-field and bright-field imaging mode in TEM. In dark-field imaging, the noninteracting electrons are eliminated from detection, whereas in bright-field imaging, these electrons contribute to the background intensity in the image. The expectation models for dark-field and bright-field imaging are approximated by a model consisting of Gaussian peaks without and with background, respectively, although they are of a higher complexity in practice.

a. Dark-Field Imaging. For dark-field imaging, the normalized image intensity distribution of the two-dimensional object is assumed to be given by:

$$g_{DF}(x, y; \beta) = \frac{1}{n_c} \sum_{n=1}^{n_c} G(x - \beta_{xn},\, y - \beta_{yn}), \tag{28}$$

where $\beta_{xn}$ and $\beta_{yn}$ are the x- and y-coordinate of the position of the $n$th component, respectively, and

$$G(x, y) = \frac{1}{2\pi\rho^2} \exp\left(-\frac{x^2 + y^2}{2\rho^2}\right), \tag{29}$$

with $\rho$ the width of the Gaussian peak. The $2n_c$-dimensional parameter vector $\beta$ is equal to $(\beta_1 \ldots \beta_{2n_c})^T = (\beta_{x1} \ldots \beta_{xn_c}\ \beta_{y1} \ldots \beta_{yn_c})^T$. For a two-dimensional object, the components are, for example, atoms or atom

columns in projection. In fact, Eq. (28) results from a two-dimensional convolution between an object function and the point spread function of the electron microscope. The intensity distribution of the identical components of the object as well as the point spread function $t(x, y)$ are assumed to be Gaussian with corresponding widths $\rho_C$ and $\rho_{EM}$, respectively. In this case

$$\rho^2 = \rho_C^2 + \rho_{EM}^2. \tag{30}$$

The observations $w_{kl}$, $k = 1, \ldots, K$, $l = 1, \ldots, L$ are made at equidistant pixels of area $\Delta x \times \Delta y$ at the measurement points $(x_k\ y_l)^T$. The field of view (FOV), that is, the total area of detection, is equal to $K\Delta x \times L\Delta y$. If $\Delta x$ and $\Delta y$ are small compared to the width $\rho$ of the Gaussian peak, the probability $p_{kl}(\beta)$ that an electron hits the pixel at the position $(x_k\ y_l)^T$ is approximately given by:

$$p_{kl}(\beta) = p(x_k, y_l; \beta) = \int_{x_k - \Delta x/2}^{x_k + \Delta x/2} \int_{y_l - \Delta y/2}^{y_l + \Delta y/2} g_{DF}(x, y; \beta)\,dx\,dy \approx g_{DF}(x_k, y_l; \beta)\,\Delta x\,\Delta y. \tag{31}$$

For a given total number of electrons $N_p$ in each Gaussian peak, the number of electrons expected to be found at this pixel is given by:

$$\lambda_{kl} = n_c N_p\, p_{kl}(\beta). \tag{32}$$

This equation describes the expectation model containing the parameters $\beta$.

b. Bright-Field Imaging. For bright-field imaging, the normalized image intensity distribution of the two-dimensional object is assumed to be given by:

$$g_{BF}(x, y; \beta) = \frac{1 - n_c \Omega\, g_{DF}(x, y; \beta)}{\mathrm{FOV} - n_c \Omega}, \tag{33}$$

where $\Omega$ is a constant, representing the strength of the interaction of the electrons with one component, $g_{DF}(x, y; \beta)$ is described by Eq. (28), and FOV is the field of view. The term 1 represents a constant background, corresponding to the noninteracting electrons, and the denominator $\mathrm{FOV} - n_c \Omega$ is a normalization constant. In what follows, the term $n_c \Omega\, g_{DF}(x, y; \beta)$ is assumed to be much smaller than the term 1, which means that the number of interacting electrons is small compared to the number of noninteracting electrons.
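As an illustration, the dark-field and bright-field models of Eqs. (28)–(33) can be evaluated on a pixel grid. The following Python sketch is not part of the original analysis. The grid, the component positions, and the values of $\rho$, $N_p$, and $\Omega$ are hypothetical choices made only for this example, and the function names are introduced here for convenience.

```python
import numpy as np

def gaussian_2d(x, y, rho):
    """Normalized two-dimensional Gaussian peak G(x, y) of width rho, Eq. (29)."""
    return np.exp(-(x**2 + y**2) / (2.0 * rho**2)) / (2.0 * np.pi * rho**2)

def g_dark_field(xx, yy, beta_x, beta_y, rho):
    """Normalized dark-field intensity distribution of Eq. (28)."""
    peaks = [gaussian_2d(xx - bx, yy - by, rho) for bx, by in zip(beta_x, beta_y)]
    return sum(peaks) / len(peaks)

def g_bright_field(xx, yy, beta_x, beta_y, rho, omega, fov):
    """Normalized bright-field intensity distribution of Eq. (33)."""
    n_c = len(beta_x)
    g_df = g_dark_field(xx, yy, beta_x, beta_y, rho)
    return (1.0 - n_c * omega * g_df) / (fov - n_c * omega)

# Hypothetical example values (lengths in angstrom); none are taken from the text.
x = np.linspace(-4.95, 4.95, 100)      # pixel centres x_k, K = 100
y = np.linspace(-4.95, 4.95, 100)      # pixel centres y_l, L = 100
dx, dy = x[1] - x[0], y[1] - y[0]
fov = x.size * dx * y.size * dy        # K*dx times L*dy
xx, yy = np.meshgrid(x, y, indexing="ij")
beta_x, beta_y = [-1.0, 1.0], [0.0, 0.0]
rho, n_p, omega = 0.5, 1000.0, 0.05

# Eqs. (31) and (32): expected dark-field counts per pixel.
lam_df = len(beta_x) * n_p * g_dark_field(xx, yy, beta_x, beta_y, rho) * dx * dy
# Bright-field pixel probabilities in the same small-pixel approximation.
p_bf = g_bright_field(xx, yy, beta_x, beta_y, rho, omega, fov) * dx * dy
```

As long as the peaks lie well inside the field of view, the dark-field counts sum to approximately $n_c N_p$ and the bright-field probabilities sum to approximately one, which is a quick consistency check on the normalization constants.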
In analogy with dark-field imaging, the probability $p_{kl}(\beta)$ that an electron hits the pixel at the position $(x_k\ y_l)^T$ is approximately given by:

$$p_{kl}(\beta) \approx g_{BF}(x_k, y_l; \beta)\,\Delta x\,\Delta y. \tag{34}$$

For a given total number of electrons $N$, the number of electrons expected to be found at the pixel at the position $(x_k\ y_l)^T$ is given by:

$$\lambda_{kl} = N\, p_{kl}(\beta). \tag{35}$$

This result defines the expectation model for bright-field imaging containing the parameters $\beta$.

3. Three-Dimensional Observations

The three-dimensional observations made of the three-dimensional object consist of a single-axis tilt series of two-dimensional projections recorded in an electron tomography experiment. These projections are obtained by recording two-dimensional images while tilting the object about a fixed axis. Other data collection geometries in electron tomography exist as well, such as conical and random-conical tilting (Frank, 1992). However, only single-axis tilting is considered here. It is assumed that the three-dimensional density distribution of the object is given by:

$$d(x, y, z; \beta) = \frac{1}{n_c} \sum_{n=1}^{n_c} D(x - \beta_{xn},\, y - \beta_{yn},\, z - \beta_{zn}), \tag{36}$$

where $\beta_{xn}$, $\beta_{yn}$, and $\beta_{zn}$ are the x-, y-, and z-coordinate, respectively, of the position of the $n$th component with respect to a reference coordinate system and

$$D(x, y, z) = \frac{1}{(2\pi)^{3/2} \rho_C^3} \exp\left(-\frac{x^2 + y^2 + z^2}{2\rho_C^2}\right), \tag{37}$$

with $\rho_C$ the width of the identical components. The $3n_c$-dimensional parameter vector $\beta$ is equal to $(\beta_1 \ldots \beta_{3n_c})^T = (\beta_{x1} \ldots \beta_{xn_c}\ \beta_{y1} \ldots \beta_{yn_c}\ \beta_{z1} \ldots \beta_{zn_c})^T$. The components are, for example, atoms. Figure 3 shows the surface of the three-dimensional density distribution and the positions of two components. It will be assumed that the y-axis is the rotation axis and the z-axis is the axis parallel to the illuminating electron beam. In the derivation of a rule of thumb for the attainable precision, it will be assumed that the tilt angles $\theta^j$, $j = 1, \ldots, J$ are equidistantly located on the interval $(-\pi/2, \pi/2)$. Although such a full angular range is rather unrealistic, it will be shown in Section III.D that the derived rules of thumb still provide insight for a limited angular range. At each tilt angle $\theta^j$, the position coordinates of the components $\beta^j = (\beta^j_{x1} \ldots \beta^j_{xn_c}\ \beta^j_{y1} \ldots \beta^j_{yn_c}\ \beta^j_{z1} \ldots \beta^j_{zn_c})^T$ with respect to the reference coordinate system are given by:

$$\beta^j_{xn} = \beta_{xn}\cos\theta^j + \beta_{zn}\sin\theta^j, \qquad \beta^j_{yn} = \beta_{yn}, \qquad \beta^j_{zn} = \beta_{zn}\cos\theta^j - \beta_{xn}\sin\theta^j, \tag{38}$$

for $n = 1, \ldots, n_c$.

Figure 3. Surface of the three-dimensional density distribution of an object consisting of two components. The position coordinates of the two components are represented by the elements of the parameter vector $\beta$. It has been assumed that the y-axis is the rotation axis and the z-axis is the axis parallel to the illuminating electron beam. Furthermore, $d$ is the distance between the two components, $d'$ is the distance between the components projected onto the (x, z)-plane, and $\phi$ is the angle between the rotation axis and the axis connecting both components. It should be mentioned that this is not the tilt angle.

The normalized image intensity distribution of a two-dimensional projection is equal to:

$$h^j(x, y; \beta) = \left[\int d(x, y, z; \beta^j)\,dz\right] * t(x, y) = g_{DF}(x, y; \varepsilon^j), \tag{39}$$

that is, the convolution of the projected density distribution and the point spread function of the electron microscope. The parameters $\varepsilon^j = (\beta^j_{x1} \ldots \beta^j_{xn_c}\ \beta^j_{y1} \ldots \beta^j_{yn_c})^T$ are the position coordinates of the components in this projection and the function $g_{DF}$ is given by Eq. (28). It follows from Eq. (39) that each projected image is assumed to be a two-dimensional

dark-field imaging experiment. However, for future research, it would be interesting to consider a bright-field imaging experiment as well, since this imaging mode is often used in practice. The observations $w^j_{kl}$, $k = 1, \ldots, K$, $l = 1, \ldots, L$, $j = 1, \ldots, J$ are made at equidistant pixels of area $\Delta x \times \Delta y$ at the measurement points $(x_k\ y_l)^T$ at the tilt angles $\theta^j$. The FOV of each projection is equal to $K\Delta x \times L\Delta y$. If $\Delta x$ and $\Delta y$ are small compared to the width $\rho$ of the projected Gaussian peak, which is defined by Eq. (29), the probability $p^j_{kl}(\beta)$ that an electron hits the pixel at the position $(x_k\ y_l)^T$ at the tilt angle $\theta^j$ is approximately given by:

$$p^j_{kl}(\beta) = p^j(x_k, y_l; \beta) = \int_{x_k - \Delta x/2}^{x_k + \Delta x/2} \int_{y_l - \Delta y/2}^{y_l + \Delta y/2} h^j(x, y; \beta)\,dx\,dy \approx h^j(x_k, y_l; \beta)\,\Delta x\,\Delta y. \tag{40}$$

It will be assumed that the total number of electrons $n_c N_p$ is equally distributed over the two-dimensional projections. In this case, the number of electrons at each projection is equal to $n_c N_p / J$, where $N_p / J$ represents the number of electrons in each projected Gaussian peak. Then, the number of electrons expected to be found at the pixel at the position $(x_k\ y_l)^T$ at the tilt angle $\theta^j$ is given by:

$$\lambda^j_{kl} = \frac{n_c N_p}{J}\, p^j_{kl}(\beta). \tag{41}$$

This result describes the expectation model containing the parameters $\beta$.

In Sections III.B.1, 2, and 3, expectation models have been given for one-, two-, and three-dimensional observations, respectively. These models describe the expected numbers of detected electron counts, that is, the expectations. Notice that for each expectation model, the components have been assumed to be identical. In future research, this may be extended to nonidentical components, representing, for example, objects consisting of different types of atoms. Moreover, it will be supposed that the observations, which fluctuate about the expectations, are statistically independent and have a Poisson distribution.
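The tilt-series model of Eqs. (38)–(41), together with the Poisson assumption, can be sketched in a few lines of Python. This is an illustrative simulation only, not taken from the original work. The component positions, the grid, and the values of $\rho$, $N_p$, and $J$ are hypothetical, and placing the tilt angles at the midpoints of $J$ equal subintervals of $(-\pi/2, \pi/2)$ is an assumption made here to obtain equidistant angles.

```python
import numpy as np

def project_positions(beta, theta):
    """Eq. (38): rotate positions (rows of x, y, z) about the y-axis by the tilt angle."""
    bx, by, bz = beta[:, 0], beta[:, 1], beta[:, 2]
    return np.column_stack((bx * np.cos(theta) + bz * np.sin(theta),
                            by,
                            bz * np.cos(theta) - bx * np.sin(theta)))

def tilt_series_counts(beta, rho, n_p, thetas, x, y):
    """Expected counts of Eq. (41) for every tilt angle, shape (J, K, L)."""
    n_c, n_tilts = beta.shape[0], len(thetas)
    dx, dy = x[1] - x[0], y[1] - y[0]
    xx, yy = np.meshgrid(x, y, indexing="ij")
    lam = np.empty((n_tilts, x.size, y.size))
    for j, theta in enumerate(thetas):
        proj = project_positions(beta, theta)
        # Projected image: dark-field model at the projected (x, y) positions, Eq. (39).
        h = sum(np.exp(-((xx - px)**2 + (yy - py)**2) / (2.0 * rho**2))
                for px, py, _ in proj)
        h /= n_c * 2.0 * np.pi * rho**2
        lam[j] = n_c * n_p / n_tilts * h * dx * dy    # Eqs. (40) and (41)
    return lam

# Hypothetical example values; none are taken from the text.
rng = np.random.default_rng(0)
beta = np.array([[-1.0, 0.5, 0.3], [1.0, -0.5, -0.3]])   # two components (x, y, z)
n_tilts = 18
step = np.pi / n_tilts
thetas = -np.pi / 2 + step * (np.arange(n_tilts) + 0.5)  # midpoints, equidistant
x = np.linspace(-4.95, 4.95, 100)
y = np.linspace(-4.95, 4.95, 100)

lam = tilt_series_counts(beta, rho=0.5, n_p=1000.0, thetas=thetas, x=x, y=y)
counts = rng.poisson(lam)     # one simulated set of independent Poisson observations
```

Each projection carries approximately $n_c N_p / J$ expected electrons, so the full series carries approximately $n_c N_p$.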
Therefore, the joint probability density function of the observations $P(\omega; \beta)$, representing the parametric statistical model of the observations, is given by Eq. (10), where the total number of observations $M$ is equal to $K$, $K \times L$, and $K \times L \times J$ for one-, two-, and three-dimensional observations, respectively. In Section III.C, the CRLB on the variance with which position and distance parameters can be estimated will be derived from the obtained parametric statistical models of the observations. Notice that for three-dimensional objects, the estimation of the position and distance parameters has to be interpreted as follows. The parameter estimates are obtained by adapting

the assembly of projected models, given by Eq. (41), to the experimental projected images with respect to the unknown parameters. This procedure is considered rather than adapting the three-dimensional model, such as that given by Eq. (36), to a three-dimensional reconstruction, which may be obtained by combining the projected images using the so-called weighted back-projection method (Frank, 1992). This alternative procedure is not considered because the joint probability density function of the three-dimensional reconstruction is unknown. If the joint probability density function is unknown, the CRLB cannot be computed.

C. Approximations of the Cramér-Rao Lower Bound

In this section, rules of thumb will be derived for the highest attainable precision with which the position coordinates of an isolated component and the distance between two components can be measured. In other words, the exact expressions for the CRLB, following from Section II.C.1, will be approximated. This will be done for one-, two-, and three-dimensional objects, for which the parametric statistical models of the observations are described in Section III.B. Throughout this section, the terms "isolated component" and "two components" should not be interpreted in their strict sense. Expressed in a simplified way, it means that neighboring components may be present as long as these components do not overlap with the one or two components considered. Expressed in a correct way, it means that the elements of the Fisher information matrix associated with a position coordinate of the one or two components considered and a position coordinate of a neighboring component are equal to zero. Hence, the Fisher information matrix and its inverse, the CRLB, are block diagonal. In the derivation of the approximations of the CRLB, only their submatrices need to be considered. An interpretation of a block diagonal CRLB may easily be given for a (hypothetical) estimator with covariance matrix equal to the CRLB. Then, the zero-elements of the CRLB associated with two different position coordinates indicate that the estimates of these position coordinates are uncorrelated.

1. One-Dimensional Observations

For a one-dimensional object, the approximations of the CRLB on the variance $\sigma^2_{\beta_x}$ of the position $\beta_x$ of an isolated component and on the variance $\sigma^2_d$ of the distance $d$ between two components may be directly obtained from the results presented in (Bettens, van Dyck, den Dekker, Sijbers, and van den Bos, 1999). In that paper, the same expectation model as the one

described in Section III.B.1 was used, but the observations were assumed to be multinomially distributed instead of Poisson distributed. However, it can be shown that the expressions for the elements of the Fisher information matrix are equal under both assumptions and given by Eq. (12). Therefore, also the approximations of the CRLB are equal. For an isolated component, the CRLB on the variance $\sigma^2_{\beta_x}$ of the position $\beta_x$ is approximated by:

$$\sigma^2_{\beta_x} \approx \frac{\rho^2}{N_p}, \tag{42}$$

where $\rho$ is the width of the Gaussian peak, which is defined by Eq. (25), and $N_p$ is the total number of electrons in this peak. The conditions for the validity of this approximation are that the pixel size $\Delta x$ is small compared to the width of the Gaussian peak and that the component is located for the most part within the FOV. The approximation is not valid if, for instance, only one half of an object is detected. For two components, under the same conditions, the CRLB on the variance $\sigma^2_d$ of the distance $d$ between these components is approximated by:

$$\sigma^2_d \approx \frac{4\rho^4}{N_p d^2} \quad \text{if } d \le \sqrt{2}\,\rho \tag{43}$$

and

$$\sigma^2_d \approx \frac{2\rho^2}{N_p} \quad \text{if } d \ge \sqrt{2}\,\rho. \tag{44}$$

If $d$ is equal to $\sqrt{2}\,\rho$, both approximations are equal to one another. From the comparison of Eqs. (42) and (44), it follows that, if the distance between two components is larger than $\sqrt{2}\,\rho$, $\sigma^2_d$ is twice as large as $\sigma^2_{\beta_x}$. This expresses the fact that for distances larger than $\sqrt{2}\,\rho$, a (hypothetical) estimator attaining the CRLB will provide uncorrelated estimates of the coordinates of neighboring components.

From Eqs. (42)–(44), it follows that the precision with which the position or the distance can be measured is a function of the total number of electrons $N_p$ in each component and the width $\rho$ of the Gaussian peaks. The precision may be improved, that is, $\sigma^2_{\beta_x}$ or $\sigma^2_d$ may be decreased, by increasing the number of electrons. Also, the precision will improve if the peaks are narrower. Moreover, if the distance becomes smaller than $\sqrt{2}\,\rho$, the lower bound on the standard deviation $\sigma_d$ of the distance increases inversely proportionally to the distance. In (Bettens, van Dyck, den Dekker, Sijbers, and van den Bos, 1999), it has been shown that the approximations given in this section are useful rules of thumb.
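The rules of thumb of Eqs. (42)–(44) may be compared with a numerically evaluated CRLB. The Python sketch below is an illustration added here, not the calculation of the original work. It assumes the standard Fisher information for independent Poisson counts, $F_{rs} = \sum_k \lambda_k^{-1}\,(\partial\lambda_k/\partial\beta_r)(\partial\lambda_k/\partial\beta_s)$, and uses hypothetical values for $\rho$, $N_p$, the pixel grid, and the distances.

```python
import numpy as np

def fisher_information_1d(beta, rho, n_p, x):
    """Poisson Fisher matrix for the one-dimensional model of Eq. (27)."""
    dx = x[1] - x[0]
    diff = x[:, None] - beta[None, :]                       # x_k - beta_xn
    peaks = np.exp(-diff**2 / (2.0 * rho**2)) / (np.sqrt(2.0 * np.pi) * rho)
    lam = n_p * peaks.sum(axis=1) * dx                      # expected counts per pixel
    dlam = n_p * peaks * diff / rho**2 * dx                 # d(lambda_k)/d(beta_xn)
    return (dlam / lam[:, None]).T @ dlam

def crlb_distance_1d(d, rho, n_p, x):
    """Numerical CRLB on the variance of the distance d = beta_x2 - beta_x1."""
    beta = np.array([-d / 2.0, d / 2.0])
    fisher = fisher_information_1d(beta, rho, n_p, x)
    jac = np.array([-1.0, 1.0])                             # derivative of d w.r.t. beta
    return jac @ np.linalg.solve(fisher, jac)

def rule_of_thumb_distance(d, rho, n_p):
    """Eqs. (43) and (44)."""
    if d <= np.sqrt(2.0) * rho:
        return 4.0 * rho**4 / (n_p * d**2)
    return 2.0 * rho**2 / n_p

# Hypothetical example values; none are taken from the text.
x = np.linspace(-6.0, 6.0, 1201)       # pixel size 0.01, small compared to rho
rho, n_p = 0.5, 1000.0

crlb_pos = 1.0 / fisher_information_1d(np.array([0.0]), rho, n_p, x)[0, 0]
crlb_far = crlb_distance_1d(4.0, rho, n_p, x)     # d well above sqrt(2) * rho
crlb_near = crlb_distance_1d(0.1, rho, n_p, x)    # d well below sqrt(2) * rho
```

For these settings the numerical bounds follow the rules of thumb closely at both extremes. Near $d = \sqrt{2}\,\rho$ the two branches are only approximations, so larger deviations should be expected there.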

2. Two-Dimensional Observations

For a two-dimensional object, the approximations of the CRLB on the variance $\sigma^2_{\beta_x}$ or $\sigma^2_{\beta_y}$ of the position coordinates $\beta_x$ or $\beta_y$, respectively, of an isolated component and on the variance $\sigma^2_d$ of the distance $d$ between two components will be derived for both dark-field and bright-field imaging, for which the expectation models are described in Section III.B.2. The derivations of these lower bounds are similar to those of the one-dimensional object.

First, an isolated component is considered, whose position coordinates are represented by the elements of the parameter vector $\beta = (\beta_x\ \beta_y)^T$. It will be assumed that this component is located for the most part within the FOV, which means that detection of only one half of an object is not considered. Moreover, the pixel sizes $\Delta x$ and $\Delta y$ are assumed to be small compared to the width $\rho$ of the Gaussian peak, which is defined by Eq. (29). Under these assumptions, the (1, 1)th element of the Fisher information matrix $F$ associated with the position coordinates $\beta$ is approximately equal to its (2, 2)th element, that is, $F_{11} \approx F_{22}$. The reason for this is that the component has rotational symmetry. Furthermore, it follows from Eq. (12) that the Fisher information matrix $F$ is symmetric. Therefore, $F$ simplifies into:

$$F \approx \begin{pmatrix} F_{11} & F_{12} \\ F_{12} & F_{11} \end{pmatrix}. \tag{45}$$

From Eq. (14), it follows that the CRLB on the variance $\sigma^2_{\beta_x}$ or $\sigma^2_{\beta_y}$ is given by the corresponding diagonal element of $F^{-1}$:

$$\sigma^2_{\beta_x} = \sigma^2_{\beta_y} = \left[F^{-1}\right]_{11}. \tag{46}$$

The right-hand member of this equation will be calculated explicitly for dark-field as well as for bright-field imaging, resulting in Eqs. (65) and (68), respectively.

Second, two components are considered, whose position coordinates are represented by the elements of the parameter vector $\beta = (\beta_{x1}\ \beta_{x2}\ \beta_{y1}\ \beta_{y2})^T$. It will be assumed that the components are located for the most part within the FOV and that the pixel sizes $\Delta x$ and $\Delta y$ are small compared to the width $\rho$ of the Gaussian peak. Under these assumptions, it can be shown for the elements of the Fisher information matrix $F$ associated with the position coordinates $\beta$ that $F_{11} \approx F_{22}$, $F_{33} \approx F_{44}$, $F_{24} \approx F_{13}$, and $F_{23} \approx F_{14}$. The reason for this is that the components are assumed to be identical and hence interchangeable. Furthermore, $F$ is symmetric. Therefore, $F$ simplifies into:

$$F \approx \begin{pmatrix} F_{11} & F_{12} & F_{13} & F_{14} \\ F_{12} & F_{11} & F_{14} & F_{13} \\ F_{13} & F_{14} & F_{33} & F_{34} \\ F_{14} & F_{13} & F_{34} & F_{33} \end{pmatrix}. \tag{47}$$

The purpose is to find an expression for the CRLB on the variance $\sigma^2_d$ of the distance between two components. For a two-dimensional object, the distance is defined as:

$$d = \sqrt{(\beta_{x1} - \beta_{x2})^2 + (\beta_{y1} - \beta_{y2})^2}. \tag{48}$$

Since $d$ is a function of the elements of the parameter vector $\beta$, an expression for $\sigma^2_d$ follows directly from the right-hand member of inequality (15):

$$\sigma^2_d \ge \frac{\partial d}{\partial \beta^T}\, F^{-1} \left(\frac{\partial d}{\partial \beta^T}\right)^T, \tag{49}$$

where the Jacobian $\partial d / \partial \beta^T$ is

$$\frac{\partial d}{\partial \beta^T} = \frac{1}{d} \begin{pmatrix} \beta_{x1} - \beta_{x2} & \beta_{x2} - \beta_{x1} & \beta_{y1} - \beta_{y2} & \beta_{y2} - \beta_{y1} \end{pmatrix}. \tag{50}$$

Equation (49) may be written as:

$$\sigma^2_d \ge \frac{2}{d^2} \begin{pmatrix} \beta_{x1} - \beta_{x2} & \beta_{y1} - \beta_{y2} \end{pmatrix} \begin{pmatrix} F_{11} - F_{12} & F_{13} - F_{14} \\ F_{13} - F_{14} & F_{33} - F_{34} \end{pmatrix}^{-1} \begin{pmatrix} \beta_{x1} - \beta_{x2} \\ \beta_{y1} - \beta_{y2} \end{pmatrix}. \tag{51}$$

The derivation of Eq. (51) is given below. It should be noted that this derivation may be skipped during a first reading without losing the thread of this article.

Derivation of Equation (51). The derivation of Eq. (51) is based on the fact that the $T \times T$ Fisher information matrix $F$ may easily be transformed into a block diagonal matrix $F_D$ if $F$ is invariant under a transformation of the parameters $\beta$ to $M\beta$, where the $T \times T$ matrix $M$ represents a symmetry operation. This supposition will first be proven. The condition that $F$ is invariant under a symmetry operation $M$ is mathematically written as:

$$F = M^T F M, \tag{52}$$

where the matrix $M$ has the property

$$M^n = I \tag{53}$$

with $n$ an integer and $I$ the identity matrix. Next, suppose that the eigenvectors and eigenvalues of $M$ are represented by the columns $\upsilon_i$, $i = 1, \ldots, T$ of the $T \times T$ matrix $V$ and the elements $\lambda_i$, $i = 1, \ldots, T$ of the $T \times T$ diagonal matrix $\Lambda$, respectively, or equivalently, in symbols:

$$MV = V\Lambda. \tag{54}$$

Then, it follows from Eq. (54) that

$$M^n V = V\Lambda^n. \tag{55}$$

Furthermore, it follows from Eq. (53) that

$$M^n V = V. \tag{56}$$

Combining Eqs. (55) and (56) results in:

$$\Lambda^n = I. \tag{57}$$

This means that the eigenvalues of $M$ are equal to $\exp(i 2\pi r / n)$, with $r = 0, 1, \ldots, n - 1$. Since the dimension $T$ of the Fisher information matrix $F$ is usually larger than $n$, these eigenvalues are degenerate. From Eqs. (52) and (54), it follows that:

$$V^T F V = \Lambda^T V^T F V \Lambda. \tag{58}$$

The notation $F_D$ will be used to indicate the matrix $V^T F V$. It will now be shown that $F_D$ is block diagonal. The $(i, j)$th element of $F_D$, represented by $\upsilon_i^T F \upsilon_j$, is calculated by subsequent use of Eqs. (52) and (54) as follows:

$$(F_D)_{ij} = \upsilon_i^T F \upsilon_j = \upsilon_i^T M^T F M \upsilon_j = \lambda_i^* \lambda_j\, \upsilon_i^T F \upsilon_j, \tag{59}$$

where the symbol $*$ denotes the complex conjugate. Thus, $\upsilon_i^T F \upsilon_j$ is equal to $\lambda_i^* \lambda_j\, \upsilon_i^T F \upsilon_j$. This relation is trivial if $\upsilon_i$ and $\upsilon_j$ have the same eigenvalue, since then $\lambda_i^* \lambda_j = 1$. If, on the other hand, $\upsilon_i$ and $\upsilon_j$ have different eigenvalues, $\upsilon_i^T F \upsilon_j$ has to be equal to 0, since $\lambda_i^* \lambda_j \ne 1$. Therefore, the matrix $F_D = V^T F V$ is block diagonal, which proves the supposition.

The supposition, which is discussed above, will now be used to derive Eq. (51). Since the two components are assumed to be identical, the Fisher information matrix $F$, given by Eq. (47), is invariant under interchanging the components. Thus, the matrix $M$, which represents this symmetry operation, is given by:

$$M = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}. \tag{60}$$
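The block diagonalization can be checked numerically. In the Python sketch below, which is an illustration added here, the entries of $F$ are hypothetical numbers arranged in the pattern of Eq. (47), the component positions are likewise hypothetical, and $V$ is one orthonormal choice of eigenvectors of the interchange matrix $M$ (symmetric combinations first, antisymmetric combinations last).

```python
import numpy as np

# Hypothetical Fisher matrix entries arranged in the pattern of Eq. (47).
f11, f12, f13, f14, f33, f34 = 5.0, 1.0, 0.7, 0.2, 4.0, 0.5
F = np.array([[f11, f12, f13, f14],
              [f12, f11, f14, f13],
              [f13, f14, f33, f34],
              [f14, f13, f34, f33]])

# Interchange of the two components: (bx1, bx2, by1, by2) -> (bx2, bx1, by2, by1).
M = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# One orthonormal set of eigenvectors of M: eigenvalue +1 (first two columns)
# and eigenvalue -1 (last two columns).
V = np.array([[1, 0, 1, 0],
              [1, 0, -1, 0],
              [0, 1, 0, 1],
              [0, 1, 0, -1]]) / np.sqrt(2.0)

F_D = V.T @ F @ V                       # block diagonal if F = M^T F M

# Bound on the distance variance: full 4 x 4 inversion versus the 2 x 2 form of Eq. (51).
beta = np.array([0.0, 0.8, 0.0, 0.6])   # hypothetical (bx1, bx2, by1, by2)
a, b = beta[0] - beta[1], beta[2] - beta[3]
d = np.hypot(a, b)
jac = np.array([a, -a, b, -b]) / d      # Jacobian of Eq. (50)
bound_full = jac @ np.linalg.solve(F, jac)
u = np.array([a, b])
bound_reduced = 2.0 / d**2 * u @ np.linalg.solve(F_D[2:, 2:], u)
```

Only the antisymmetric block of $F_D$ enters the bound on the distance, because the Jacobian of $d$ has no component along the symmetric eigenvectors. This is why a $2 \times 2$ inversion suffices.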

The matrix of eigenvectors $V$ of $M$ and the matrix of eigenvalues $\Lambda$ of $M$ are given by:

$$V = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 0 & 1 & 0 \\ 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & -1 \end{pmatrix} \tag{61}$$

and

$$\Lambda = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{62}$$

The matrix $F_D = V^T F V$ is block diagonal, as predicted by the preceding supposition, and is equal to

$$F_D = \begin{pmatrix} F_{11} + F_{12} & F_{13} + F_{14} & 0 & 0 \\ F_{13} + F_{14} & F_{33} + F_{34} & 0 & 0 \\ 0 & 0 & F_{11} - F_{12} & F_{13} - F_{14} \\ 0 & 0 & F_{13} - F_{14} & F_{33} - F_{34} \end{pmatrix}. \tag{63}$$

Since $F_D$ is defined as $V^T F V$, it follows that:

$$F^{-1} = V F_D^{-1} V^T, \tag{64}$$

where $F_D$ is given by Eq. (63). Equation (64) allows one to easily invert the $4 \times 4$ Fisher information matrix $F$ associated with the position coordinates $\beta$ since $F_D$ is block diagonal. The inverse of $F_D$ is block diagonal as well, with submatrices equal to the inverse of the $2 \times 2$ submatrices of $F_D$. Next, the result of Eq. (64) is substituted into Eq. (49), resulting in Eq. (51).

Next, the right-hand member of Eq. (51) will be calculated explicitly for distances that are either small or large compared to the width $\rho$ of the Gaussian peak, and for dark-field as well as for bright-field imaging.

Dark-Field Imaging. The CRLB on the variance $\sigma^2_{\beta_x}$ or $\sigma^2_{\beta_y}$ of the position coordinates $\beta_x$ or $\beta_y$, respectively, of an isolated component and on the variance $\sigma^2_d$ of the distance $d$ between two components are given for dark-field imaging. The results are obtained from the explicit calculations of the expressions given by the right-hand members of Eqs. (46) and (51), which are given in Appendix A. For an isolated component, the CRLB on the variance $\sigma^2_{\beta_x}$ or $\sigma^2_{\beta_y}$ is approximated by:

$$\sigma^2_{\beta_x} = \sigma^2_{\beta_y} \approx \frac{\rho^2}{N_p}, \tag{65}$$

where $\rho$ is the width of the Gaussian peak, which is defined by Eq. (29), and $N_p$ is the total number of electrons in this peak. For two components, the CRLB on the variance $\sigma^2_d$ of the distance $d$ between these components is approximated by:

$$\sigma^2_d \approx \frac{4\rho^4}{N_p d^2} \quad \text{if } d \le \sqrt{2}\,\rho \tag{66}$$

and

$$\sigma^2_d \approx \frac{2\rho^2}{N_p} \quad \text{if } d \ge \sqrt{2}\,\rho. \tag{67}$$

If $d$ is equal to $\sqrt{2}\,\rho$, both approximations are equal to one another. Notice that Eqs. (65), (66), and (67) are equal to their one-dimensional analogues, which are given by Eqs. (42), (43), and (44). Moreover, from the comparison of Eqs. (65) and (67), it follows that, if the distance between the two components is larger than $\sqrt{2}\,\rho$, $\sigma^2_d$ is twice as large as $\sigma^2_{\beta_x}$ or $\sigma^2_{\beta_y}$. This expresses the fact that for distances larger than $\sqrt{2}\,\rho$, an estimator attaining the CRLB will provide uncorrelated estimates of the coordinates of neighboring components. The approximations of $\sigma^2_{\beta_x}$, $\sigma^2_{\beta_y}$, and $\sigma^2_d$ are valid if the pixel sizes $\Delta x$ and $\Delta y$ are small compared to the width $\rho$ of the Gaussian peak and if the components lie for the most part within the FOV. The approximation is not valid if, for instance, only one half of an object is detected.

Bright-Field Imaging. The CRLB on the variance $\sigma^2_{\beta_x}$ or $\sigma^2_{\beta_y}$ of the position coordinates $\beta_x$ or $\beta_y$, respectively, of an isolated component and on the variance $\sigma^2_d$ of the distance $d$ between two components are given for bright-field imaging. The results are obtained from the explicit calculations of Eqs. (46) and (51), which are given in Appendix B. For an isolated component, the CRLB on the variance $\sigma^2_{\beta_x}$ or $\sigma^2_{\beta_y}$ is approximated by:

$$\sigma^2_{\beta_x} = \sigma^2_{\beta_y} \approx \frac{8\pi\rho^4\,\mathrm{FOV}}{N\Omega^2}, \tag{68}$$

where $\rho$ is the width of the Gaussian peak, which is defined by Eq. (29), FOV is the field of view, $N$ is the total number of detected electrons, and $\Omega$ represents the strength of the interaction of the incident electrons with one component.
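The isolated-component rules of Eqs. (65) and (68) can be compared with a direct numerical evaluation of the CRLB. The Python sketch below is illustrative only. It assumes the standard Fisher information for independent Poisson counts, places a single component at the origin, and uses hypothetical values for $\rho$, $N_p$, $N$, $\Omega$, and the pixel grid.

```python
import numpy as np

def poisson_fisher(lam, grads):
    """Fisher matrix for independent Poisson counts with expectations lam."""
    g = np.stack([gr.ravel() for gr in grads])      # one row per parameter
    return (g / lam.ravel()) @ g.T

def peak_on_grid(rho, x, y):
    """Pixel grid, Gaussian peak of Eq. (29) at the origin, and pixel area."""
    area = (x[1] - x[0]) * (y[1] - y[0])
    xx, yy = np.meshgrid(x, y, indexing="ij")
    peak = np.exp(-(xx**2 + yy**2) / (2.0 * rho**2)) / (2.0 * np.pi * rho**2)
    return xx, yy, peak, area

def crlb_dark_field(rho, n_p, x, y):
    """Numerical CRLB on the x-position variance, dark-field model of Eq. (32)."""
    xx, yy, peak, area = peak_on_grid(rho, x, y)
    lam = n_p * peak * area
    grads = [lam * xx / rho**2, lam * yy / rho**2]  # derivatives w.r.t. beta_x, beta_y
    return np.linalg.inv(poisson_fisher(lam, grads))[0, 0]

def crlb_bright_field(rho, n_total, omega, x, y):
    """Numerical CRLB on the x-position variance, bright-field model of Eq. (35)."""
    xx, yy, peak, area = peak_on_grid(rho, x, y)
    fov = x.size * y.size * area
    scale = n_total * area / (fov - omega)
    lam = scale * (1.0 - omega * peak)
    grads = [-scale * omega * peak * xx / rho**2,
             -scale * omega * peak * yy / rho**2]
    return np.linalg.inv(poisson_fisher(lam, grads))[0, 0]

# Hypothetical example values (lengths in angstrom); none are taken from the text.
x = np.linspace(-4.95, 4.95, 100)
y = np.linspace(-4.95, 4.95, 100)
rho, n_p, n_total, omega = 0.5, 1000.0, 1.0e6, 0.02
fov = x.size * y.size * (x[1] - x[0]) * (y[1] - y[0])

crlb_df = crlb_dark_field(rho, n_p, x, y)
rule_df = rho**2 / n_p                                         # Eq. (65)
crlb_bf = crlb_bright_field(rho, n_total, omega, x, y)
rule_bf = 8.0 * np.pi * rho**4 * fov / (n_total * omega**2)    # Eq. (68)
```

The bright-field rule relies on the assumption that $\Omega\, g_{DF}$ is much smaller than 1, so the agreement degrades as $\Omega$ grows relative to $\rho^2$.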
Notice that $N/\mathrm{FOV}$ denotes the total number of detected electrons per unit area. For two components, the CRLB on the variance $\sigma^2_d$ of the distance $d$ between these components is approximated by:

$$\sigma^2_d \approx \frac{64\pi\rho^6\,\mathrm{FOV}}{3N\Omega^2 d^2} \quad \text{if } d \le \sqrt{4/3}\,\rho \tag{69}$$

and

$$\sigma^2_d \approx \frac{16\pi\rho^4\,\mathrm{FOV}}{N\Omega^2} \quad \text{if } d \ge \sqrt{4/3}\,\rho. \tag{70}$$

If $d$ is equal to $\sqrt{4/3}\,\rho$, both approximations are equal to one another. From the comparison of Eqs. (68) and (70), it follows that, if the distance between the two components is larger than $\sqrt{4/3}\,\rho$, $\sigma^2_d$ is twice as large as $\sigma^2_{\beta_x}$ or $\sigma^2_{\beta_y}$. This expresses the fact that for distances larger than $\sqrt{4/3}\,\rho$, an estimator attaining the CRLB will provide uncorrelated estimates of the coordinates of neighboring components. The approximations of $\sigma^2_{\beta_x}$, $\sigma^2_{\beta_y}$, and $\sigma^2_d$ are valid if the pixel sizes $\Delta x$ and $\Delta y$ are small compared to the width $\rho$ of the Gaussian peak and if the components lie for the most part within the FOV. The approximation is not valid if, for instance, only one half of an object is detected.

The rules of thumb for dark-field and bright-field imaging, which are described by Eqs. (65)–(70), are scalar measures that may be used to obtain insight into statistical experimental design. The precision with which the position coordinates and the distance can be measured is a function of the number of electrons and the width of the peaks. The attainable precision may be quantified and improved by increasing the number of detected electrons per unit area or by narrowing the peaks. In practice, it follows from Eq. (30) that the peaks may be narrowed by narrowing the point spread function, that is, by improving the two-point resolution of the electron microscope. However, it is important to notice that below a certain width of the point spread function, the precision is limited by the intrinsic width of the components, for instance, by the width of the electrostatic potential of the atoms (den Dekker, Sijbers, and van Dyck, 1999). Then, further narrowing of the point spread function is useless. This result is meaningful in practice. For example, in STEM experiments, further narrowing of the probe, which represents the point spread function, is not so beneficial in terms of precision since the width of the probe is currently almost equal to the width of an atom (Krivanek, Dellby, and Nellist, 2002). Moreover, if, as in STEM, a narrower point spread function is accompanied by a decrease of the number of detected electrons, both effects have to be weighed against each other under the existing physical constraints. Also, from the rules of thumb, it follows that the precision may be orders of magnitude better than the two-point resolution of the imaging instrument if the number of detected electrons per unit area is large. Furthermore, if the distance becomes smaller than $\sqrt{2}\,\rho$ or $\sqrt{4/3}\,\rho$ for dark-field and bright-field imaging, respectively, the lower bound on the standard deviation $\sigma_d$ of the distance $d$ increases inversely proportionally to the distance. In Section III.D.1, which consists of discussions and examples, it will be shown that the lower bounds on the standard deviation $\sigma_{\beta_x}$ or $\sigma_{\beta_y}$ of the position coordinates $\beta_x$ or $\beta_y$, respectively, of an isolated component and on the standard deviation $\sigma_d$ of the distance $d$ are well approximated by the square roots of the right-hand members of Eqs. (65)–(70).

3. Three-Dimensional Observations

For a three-dimensional object, the derivation of rules of thumb for the highest attainable precision, that is, the CRLB, with which the position coordinates of an isolated component or the distance between two components can be estimated is similar to its two-dimensional analogue.

First, an isolated component is considered, whose position coordinates are represented by the elements of the parameter vector $\beta = (\beta_x\ \beta_y\ \beta_z)^T$. The symmetric Fisher information matrix $F$ associated with the position coordinates $\beta$ is given by:

$$F = \begin{pmatrix} F_{11} & F_{12} & F_{13} \\ F_{12} & F_{22} & F_{23} \\ F_{13} & F_{23} & F_{33} \end{pmatrix}. \tag{71}$$

From Eq. (14), it follows that the CRLB on the variance $\sigma^2_{\beta_x}$, $\sigma^2_{\beta_y}$, or $\sigma^2_{\beta_z}$ of the position coordinates $\beta_x$, $\beta_y$, or $\beta_z$, respectively, is given by the corresponding diagonal element of $F^{-1}$:

$$\sigma^2_{\beta_x} = \left[F^{-1}\right]_{11}, \qquad \sigma^2_{\beta_y} = \left[F^{-1}\right]_{22}, \qquad \sigma^2_{\beta_z} = \left[F^{-1}\right]_{33}. \tag{72}$$

The right-hand members of these equations are calculated explicitly in Appendix C, resulting in:

$$\sigma^2_{\beta_x} = \sigma^2_{\beta_z} \approx \frac{2\rho^2}{N_p} \tag{73}$$

and

$$\sigma^2_{\beta_y} \approx \frac{\rho^2}{N_p}, \tag{74}$$

where $\rho$ is the width of the projected Gaussian peak, which is defined by Eq. (29), and $N_p$ is the total number of detected electrons in the component. The conditions for the validity of the approximations are that the pixel sizes

$\Delta x$ and $\Delta y$ are small compared to $\rho$, that the difference $\Delta\theta$ between successive tilt angles is small compared to the full angular tilt range, and that the component is located for the most part within the region of observation. Furthermore, the tilt angles $\theta_j$ are assumed to be equidistantly located on the interval $(-\pi/2, \pi/2)$. From the comparison of Eqs. (73) and (74) with Eqs. (42) and (65), it follows that the lower bound on the variance with which the y-coordinate can be estimated is equal to its one- and two-dimensional analogues, whereas the lower bound for the x- and z-coordinates is twice as large. Recall that the y-coordinate is the coordinate along the rotation axis and that the x- and z-coordinates are the coordinates perpendicular to the rotation axis.

Second, two components are considered, whose position coordinates are represented by the elements of the parameter vector $\beta = (\beta_{x1}\ \beta_{x2}\ \beta_{y1}\ \beta_{y2}\ \beta_{z1}\ \beta_{z2})^T$. It will be assumed that the three-dimensional components are located for the most part within the region of observation and that the pixel sizes $\Delta x$ and $\Delta y$ are small compared to the width $\rho$ of the projected Gaussian peak. Furthermore, the Fisher information matrix $F$ associated with the position coordinates $\beta$ is a symmetric matrix. Under the assumptions given above and the symmetry property of the Fisher information matrix, it may be shown that

$$F = \begin{pmatrix}
F_{11} & F_{12} & F_{13} & F_{14} & F_{15} & F_{16} \\
F_{12} & F_{11} & F_{14} & F_{13} & F_{16} & F_{15} \\
F_{13} & F_{14} & F_{33} & F_{34} & F_{35} & F_{36} \\
F_{14} & F_{13} & F_{34} & F_{33} & F_{36} & F_{35} \\
F_{15} & F_{16} & F_{35} & F_{36} & F_{55} & F_{56} \\
F_{16} & F_{15} & F_{36} & F_{35} & F_{56} & F_{55}
\end{pmatrix}. \tag{75}$$

The reason for this is that the components are assumed to be identical and hence interchangeable. The purpose is to find an expression for the CRLB on the variance $\sigma_d^2$ of the distance between two components.
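As a plausibility check on the single-component results of Eqs. (73) and (74), the Fisher information of a tilt series can be accumulated numerically. The sketch below is not the derivation of Appendix C. It assumes that each projection contributes an information of $(N_p/J)/\rho^2$ per projected coordinate, as the two-dimensional rule of thumb suggests, and that the y-axis is the rotation axis. The function names and the values of `n_p`, `rho`, and `n_proj` are hypothetical.

```python
import math


def tilt_series_fisher(n_p, rho, n_proj):
    """Fisher information matrix (order x, y, z) for one component's position."""
    # Equidistant tilt angles on (-pi/2, pi/2); y is the rotation axis.
    thetas = [-math.pi / 2 + (j + 0.5) * math.pi / n_proj for j in range(n_proj)]
    # Assumed information per projected coordinate in a single projection.
    info_2d = (n_p / n_proj) / rho**2
    fisher = [[0.0] * 3 for _ in range(3)]
    for theta in thetas:
        u = (math.cos(theta), 0.0, math.sin(theta))  # projected in-plane axis
        v = (0.0, 1.0, 0.0)                          # rotation axis
        for a in range(3):
            for b in range(3):
                fisher[a][b] += info_2d * (u[a] * u[b] + v[a] * v[b])
    return fisher


def position_bounds(fisher):
    """Diagonal of the inverse; y decouples because u has no y-component."""
    det_xz = fisher[0][0] * fisher[2][2] - fisher[0][2] ** 2
    var_x = fisher[2][2] / det_xz
    var_y = 1.0 / fisher[1][1]
    var_z = fisher[0][0] / det_xz
    return var_x, var_y, var_z


n_p, rho, n_proj = 1000.0, 10.0, 20  # hypothetical values
var_x, var_y, var_z = position_bounds(tilt_series_fisher(n_p, rho, n_proj))
print(var_x, 2 * rho**2 / n_p)  # compare with Eq. (73)
print(var_y, rho**2 / n_p)      # compare with Eq. (74)
```

Under these assumptions the factor of two for the x- and z-coordinates arises because each coordinate perpendicular to the rotation axis is, on average over the tilt range, only half visible in the projections.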
For a three-dimensional object, the distance is defined as:

$$d = \sqrt{(\beta_{x1} - \beta_{x2})^2 + (\beta_{y1} - \beta_{y2})^2 + (\beta_{z1} - \beta_{z2})^2}. \tag{76}$$

Since $d$ is a function of the elements of the parameter vector $\beta$, an expression for $\sigma_d^2$ follows directly from the right-hand member of inequality (15):

$$\sigma_d^2 = \frac{\partial d}{\partial \beta^T}\, F^{-1}\, \frac{\partial d}{\partial \beta}, \tag{77}$$

where the Jacobian $\partial d / \partial \beta^T$ is equal to

$$\frac{\partial d}{\partial \beta^T} = \frac{1}{d} \begin{pmatrix} \beta_{x1} - \beta_{x2} & \beta_{x2} - \beta_{x1} & \beta_{y1} - \beta_{y2} & \beta_{y2} - \beta_{y1} & \beta_{z1} - \beta_{z2} & \beta_{z2} - \beta_{z1} \end{pmatrix}. \tag{78}$$

Following the same lines of thought as in the derivation of Eq. (51), it can be shown that $\sigma_d^2$ may be approximated by:

$$\sigma_d^2 \approx \frac{2}{d^2} \begin{pmatrix} \beta_{x1} - \beta_{x2} & \beta_{y1} - \beta_{y2} & \beta_{z1} - \beta_{z2} \end{pmatrix}
\begin{pmatrix}
F_{11} - F_{12} & F_{13} - F_{14} & F_{15} - F_{16} \\
F_{13} - F_{14} & F_{33} - F_{34} & F_{35} - F_{36} \\
F_{15} - F_{16} & F_{35} - F_{36} & F_{55} - F_{56}
\end{pmatrix}^{-1}
\begin{pmatrix} \beta_{x1} - \beta_{x2} \\ \beta_{y1} - \beta_{y2} \\ \beta_{z1} - \beta_{z2} \end{pmatrix}. \tag{79}$$

The expression given by the right-hand member of Eq. (79) has been calculated explicitly in Appendix C for the special cases where the distance between the two components is small or large compared to the width $\rho$ of the projected Gaussian peak. This results in the following rules of thumb:

$$\sigma_d^2 \approx \frac{4\rho^4}{N_p d^2}\, V(\phi) \quad \text{if } d \le \sqrt{2V(\phi)/W(\phi)}\,\rho \tag{80}$$

and

$$\sigma_d^2 \approx \frac{2\rho^2}{N_p}\, W(\phi) \quad \text{if } d \ge \sqrt{2V(\phi)/W(\phi)}\,\rho, \tag{81}$$

where

$$V(\phi) = 4\, \frac{3\cos^4\phi - 3\cos^2\phi - 2}{\cos^4\phi - 6\cos^2\phi - 3}, \tag{82}$$

$$W(\phi) = 1 + \sin^2\phi, \tag{83}$$

and $\phi$ is the angle between the rotation axis and the axis that connects the two components. This angle has been visualized in Figure 3. It should be mentioned that this is not the tilt angle. For different tilt angles $\theta_j$ in a tilt series, $\phi$ is constant. The conditions for the validity of the approximations are that the components are located for the most part within the region of observation, that the pixel sizes $\Delta x$ and $\Delta y$ are small compared to $\rho$, and that the difference $\Delta\theta$ between successive tilt angles is small compared to the full angular tilt range. Furthermore, the tilt angles $\theta_j$ are assumed to be equidistantly located on the interval $(-\pi/2, \pi/2)$. If $d$ is equal to $\sqrt{2V(\phi)/W(\phi)}\,\rho$, for a given angle $\phi$, both approximations are equal to one another. From Eqs. (80) and (81), it follows that the precision with which

the distance can be estimated is a function of the total number of electrons, the width of the peaks, the distance between the components, and the angle $\phi$. If $\phi$ is equal to $\pi/2$, the approximated $\sigma_d^2$ is about 2 times as large as if $\phi$ is equal to 0. In terms of the standard deviation this corresponds to a factor $\sqrt{2}$. Moreover, if $\phi$ is equal to 0, the approximations given by Eqs. (80)-(81) are equal to their one- and two-dimensional analogues given by Eqs. (43)-(44) and (66)-(67), respectively. This is intuitively clear since the components are then on the rotation axis and therefore the distance between the components in a two-dimensional projection is at each tilt angle equal to the real distance. In Section III.D it will be shown that the lower bounds on the standard deviations $\sigma_{\beta_x}$, $\sigma_{\beta_y}$, or $\sigma_{\beta_z}$ of the position coordinates $\beta_x$, $\beta_y$, or $\beta_z$, respectively, of an isolated component and on the standard deviation $\sigma_d$ of the distance $d$ between two components are well approximated by the square roots of the right-hand members of Eqs. (73)-(74) and (80)-(81), respectively.

D. Discussions and Examples

In this section, the exactly calculated lower bounds on the standard deviation of the position coordinates of an isolated component and on the standard deviation of the distance will be compared with their approximations. This will be done for two- and three-dimensional objects. For one-dimensional objects, a discussion may be found in Bettens, van Dyck, den Dekker, Sijbers, and van den Bos (1999).

1. Two-Dimensional Observations

The approximations of the lower bound on the standard deviation, which are derived in Section III.C.2 for two-dimensional objects, will be investigated by means of examples, for dark-field as well as for bright-field imaging.

a. Dark-Field Imaging.
The approximations of the lower bound on the standard deviations $\sigma_{\beta_x}$ and $\sigma_{\beta_y}$ of the position coordinates $\beta_x$ and $\beta_y$ of an isolated component and on the standard deviation $\sigma_d$ of the distance $d$ between two components, which are described by the square roots of the right-hand members of Eqs. (65)-(67), are discussed for dark-field imaging experiments. These approximations will be compared with their exactly calculated lower bounds, which are found by numerical computation of the CRLB with respect to the position coordinates. The expressions for the CRLB are given in Section II.C.1. The CRLB is computed by substitution of the parametric statistical model of dark-field imaging observations, which is derived in Section III.B, into the obtained expressions. Unless otherwise

stated, the total number of electrons in a Gaussian peak, the width of this peak, the pixel sizes, and the field of view are given by the numbers of Table 1. Moreover, it is assumed that the center of mass of the components coincides with the center of the field of view. Figure 4 shows the exactly calculated lower bound on the standard deviation of the position coordinates together with its approximation as a function of the width of the Gaussian peak. Furthermore, Figure 5 shows the exactly calculated lower bound on the standard deviation of the distance and its approximations as a function of the distance between two components. From these figures, it is observed that the square roots of the right-hand members of Eqs. (65)-(67) are accurate approximations of $\sigma_{\beta_x}$, $\sigma_{\beta_y}$, and $\sigma_d$.

TABLE 1
Total Number of Electrons in a Gaussian Peak ($N_p$), the Width ($\rho$) of this Peak, the Pixel Sizes ($\Delta x$ and $\Delta y$), and the Field of View (FOV)

| $N_p$ | $\rho$ | $\Delta x$ | $\Delta y$ | FOV |
| 15, | | | | |

Figure 4. The exactly calculated lower bound on the standard deviation of the position coordinates and its approximation, given by the square root of the right-hand member of Eq. (65), as a function of the width of the Gaussian peak.

One of the assumptions that is made in the derivation of Eqs. (65)-(67) is that the pixel sizes $\Delta x$ and $\Delta y$ are small compared to the width of the

Gaussian peak. Therefore, Figure 6 shows the exactly calculated lower bound on the standard deviation $\sigma_d$ of the distance as a function of the pixel size $\Delta x$, which has been assumed to be equal to $\Delta y$. The distance between the two components is equal to 10.

Figure 5. The exactly calculated lower bound on the standard deviation of the distance and its approximations, given by the square roots of the right-hand members of Eqs. (66) and (67), as a function of the distance between two components.

Figure 6. The exactly calculated lower bound on the standard deviation of the distance as a function of the pixel size $\Delta x$, with $\Delta y = \Delta x$. The distance is equal to 10.

From Figure 6, it is seen that below a certain pixel size, $\sigma_d$ decreases only slightly with decreasing pixel size, with

all other quantities kept constant. Hence, the precision that is gained by decreasing the pixel size is only marginal. This was also observed for one-dimensional observations by Bettens et al. (1999). It is related to the fact that the pixel signal-to-noise ratio (SNR) decreases with decreasing pixel size.

Finally, it is examined whether there exists an estimator attaining the CRLB on the variance of the position coordinates and on the variance of the distance, and whether this estimator may be considered unbiased. If so, this would justify the choice of the CRLB as precision based optimality criterion. Generally, one may use different estimators to estimate the position coordinates or the distance, such as the least squares estimator or the maximum likelihood estimator, which has been introduced in Section II.D. Different estimators have different properties. One of the asymptotic properties of the maximum likelihood estimator is that it is normally distributed about the true parameters with a covariance matrix approaching the CRLB (van den Bos, 1982). This property would justify the use of the CRLB as optimality criterion, but it is an asymptotic one, which means that it applies to an infinite number of observations. However, the number of observations used in the examples given above is finite and even relatively small. Whether asymptotic properties still apply to such experiments can often only be assessed by estimating from artificial, simulated observations (van den Bos, 1999). Therefore, 600 different dark-field experiments made at an isolated component are simulated; the observations are modelled using the parametric statistical model described in Section III.B. Next, the position coordinates $\beta_x$ and $\beta_y$ of the component are estimated from each simulation experiment using the maximum likelihood estimator.
The mean and variance of these estimates are computed and compared to the true value of the position coordinate and the lower bound on the variance, respectively. The lower bound on the variance is computed by substituting the true values of the parameters into the expression for the CRLB. The results are presented in Table 2. From the comparison of these results, it follows that it may not be concluded that the maximum likelihood estimator is biased or that it does not attain the CRLB. The maximum likelihood estimates of $\beta_x$ are presented in the histogram of Figure 7. The solid curve represents a normal distribution with mean and variance given in Table 2. This curve makes it plausible that the estimates are normally distributed. This property is also tested quantitatively by means of the so-called Lilliefors test (Conover, 1980), which does not reject the hypothesis that the estimates are normally distributed. From the results obtained from the simulation experiments, it is concluded that the maximum likelihood estimates cannot be distinguished from unbiased, efficient estimates. These results justify the choice of the CRLB as optimality criterion.
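The kind of simulation experiment described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' procedure. It assumes a single Gaussian peak with known width and electron dose, a field of view large enough to contain all electrons, and pixels that are small compared to the peak width. Under those assumptions the maximizer of the Poisson log-likelihood reduces to the count-weighted centroid, which is used here in place of a numerical optimizer. Only the x-coordinate is simulated, since the x-centroid does not depend on the y-coordinates. All names and parameter values are hypothetical.

```python
import math
import random
import statistics


def poisson(rng, mean):
    """Draw a Poisson variate by counting unit-rate arrivals in [0, mean]."""
    count, t = 0, rng.expovariate(1.0)
    while t <= mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def simulate_estimate(rng, n_p, rho, pixel, beta_x):
    """One simulated experiment; returns the centroid estimate of beta_x."""
    n = poisson(rng, n_p)  # detected electrons in this experiment
    total = 0.0
    for _ in range(n):
        x = rng.gauss(beta_x, rho)
        # Each electron is recorded at the centre of the pixel it falls in.
        total += (math.floor(x / pixel) + 0.5) * pixel
    return total / n


rng = random.Random(0)
n_p, rho, pixel, beta_x = 500, 5.0, 1.0, 0.0  # hypothetical values
estimates = [simulate_estimate(rng, n_p, rho, pixel, beta_x) for _ in range(600)]

mean_est = statistics.mean(estimates)
var_est = statistics.variance(estimates)
crlb = rho**2 / n_p  # rule of thumb for an isolated component
print("estimated mean:", mean_est, "true value:", beta_x)
print("estimated variance:", var_est, "rule-of-thumb bound:", crlb)
```

With 600 repetitions, the sample variance itself has a relative standard deviation of roughly 6 percent, so the estimated variance and the bound should only be expected to agree within that margin.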


TABLE 2
Comparison of True Position Coordinates and Lower Bounds on the Variance with Estimated Means and Variances of 600 Maximum Likelihood Estimates of the Position Coordinates, Respectively

| True position coordinate | Estimated mean | Standard deviation of mean |
| $\beta_x$: 0 | $\beta_x$: | |
| $\beta_y$: 0 | $\beta_y$: | |

| Lower bound on variance | Estimated variance | Standard deviation of variance |
| $\sigma^2_{\beta_x}$: | $\sigma^2_{\beta_x}$: | |
| $\sigma^2_{\beta_y}$: | $\sigma^2_{\beta_y}$: | |

The numbers of the last column represent the estimated standard deviation of the variable of the previous column.

Figure 7. Histogram of 200 maximum likelihood estimates of the x-coordinate of the position of a component. The normal distribution superimposed on this histogram makes it plausible that the estimates are normally distributed.

b. Bright-Field Imaging. The approximations of the lower bound on the standard deviations $\sigma_{\beta_x}$ and $\sigma_{\beta_y}$ of the position coordinates $\beta_x$ and $\beta_y$ of an isolated component and on the standard deviation $\sigma_d$ of the distance $d$ between two components, which are described by the square roots of the

right-hand members of Eqs. (68)-(70), are discussed for bright-field imaging experiments. These approximations will be compared with their exactly calculated lower bounds, which are found by numerical computation of the CRLB with respect to the position coordinates. The expressions for the CRLB are given in Section II.C.1. The CRLB is computed by substitution of the parametric statistical model of bright-field imaging observations, which is derived in Section III.B, into the obtained expressions. Unless otherwise stated, the total number of electrons, the width of the Gaussian peak, the constant representing the strength of the interaction, the pixel sizes, and the field of view are given by the numbers of Table 3. Moreover, it is assumed that the center of mass of the components coincides with the center of the field of view.

TABLE 3
The Total Number of Electrons ($N$), the Width ($\rho$) of the Gaussian Peak, the Constant ($\Omega$) Representing the Strength of the Interaction, the Pixel Sizes ($\Delta x$ and $\Delta y$), and the Field of View (FOV)

| $N$ | $\rho$ | $\Omega$ | $\Delta x$ | $\Delta y$ | FOV |
| 18,000, | | | | | |

Figure 8. The exactly calculated lower bound on the standard deviation of the position coordinates and its approximation, given by the square root of the right-hand member of Eq. (68), as a function of the constant $\Omega$ representing the interaction strength.

Figure 8 shows the exactly calculated lower bound on the standard deviation of the position coordinates together with its approximation as a function of the constant $\Omega$, which represents the strength of the interaction. Furthermore, Figure 9 shows the exactly calculated lower bound on the

standard deviation of the distance and its approximations as a function of the distance between two components. From these figures, it is observed that the square roots of the right-hand members of Eqs. (68)-(70) are accurate approximations of $\sigma_{\beta_x}$, $\sigma_{\beta_y}$, and $\sigma_d$.

Figure 9. The exactly calculated lower bound on the standard deviation of the distance and its approximations, given by the square roots of the right-hand members of Eqs. (69) and (70), as a function of the distance between two components.

As for dark-field imaging, it is examined by means of simulation experiments whether the maximum likelihood estimator attains the CRLB on the variance of the distance between two components and whether it is unbiased. The observations made at these components are modelled using the parametric statistical model for bright-field imaging described in Section III.B. From 600 different simulation experiments, the distance is estimated using the maximum likelihood estimator. The mean and variance of these estimates are computed and compared to the true value of the distance and the lower bound on the variance, respectively. The lower bound on the variance is computed by substituting the true values of the parameters into the expression for the CRLB. The results are presented in Table 4. From the comparison of these results, it follows that it may not be concluded that the maximum likelihood estimator is biased or that it does not attain the CRLB.

2. Three-Dimensional Observations

The approximations of the lower bounds on the standard deviations $\sigma_{\beta_x}$, $\sigma_{\beta_y}$, and $\sigma_{\beta_z}$ of the position coordinates $\beta_x$, $\beta_y$, and $\beta_z$ of an isolated component and on the standard deviation $\sigma_d$ of the distance $d$ between two components, which are described by the square roots of the right-hand members of

Eqs. (73), (74), (80), and (81) in Section III.C.3, are investigated by means of examples. These approximations will be compared with their exactly calculated lower bounds, which are found by numerical computation of the CRLB with respect to the position coordinates. The expressions for the CRLB are given in Section II.C.1. The CRLB is computed by substitution of the parametric statistical model of the three-dimensional observations, which is given in Section III.B, into the obtained expressions. Unless otherwise stated, the total number of projected images, the number of electrons in each projected Gaussian peak, the width of this peak, the pixel sizes, the field of view of each projected image, and, in case of two components, the angle between the rotation axis and the axis connecting these components are given by the numbers of Table 5. Moreover, it is assumed that the center of mass of the components coincides with the center of the field of view.

TABLE 4
Comparison of True Distance and Lower Bound on the Variance with Estimated Mean and Variance of 600 Maximum Likelihood Estimates of the Distance between Two Components, Respectively

| True distance | Estimated mean | Standard deviation of mean |
| $d$: 60 | $d$: | |

| Lower bound on variance | Estimated variance | Standard deviation of variance |
| $\sigma_d^2$: 1.27 | $\sigma_d^2$: | |

The numbers of the last column represent the estimated standard deviation of the variable of the previous column.

TABLE 5
The Total Number of Projected Images ($J$), the Number of Electrons in each Projected Gaussian Peak ($N_p/J$), the Width ($\rho$) of this Peak, the Pixel Sizes ($\Delta x$ and $\Delta y$), the Field of View (FOV) of each Projected Image, and the Angle ($\phi$) between the Rotation Axis and the Axis Connecting Two Components

| $J$ | $N_p$ | $\rho$ | $\Delta x$ | $\Delta y$ | FOV | $\phi$ |
| 20 | 15, | | | | | $\pi/2$ |

Figure 10 shows the exactly calculated lower bound on the standard deviation of the position coordinates and its approximations as a function of the width of the projected Gaussian peak, which is described by Eq.
(29). Furthermore, Figure 11 shows the exactly calculated lower bound on the standard deviation of the distance and its approximations as a function of the distance between two components. The axis connecting both

components is assumed to be perpendicular to the rotation axis.

Figure 10. The exactly calculated lower bound on the standard deviation of the position coordinates and its approximations, given by the square roots of the right-hand members of Eqs. (73) and (74), as a function of the width of the projected Gaussian peak.

Figure 11. The exactly calculated lower bound on the standard deviation of the distance and its approximations, given by the square roots of the right-hand members of Eqs. (80) and (81), as a function of the distance between two components.

Moreover, in Figures 12 and 13, the exactly calculated lower bound on the standard deviation of the distance and its approximations are shown as a function of the angle $\phi$, for the distance between the components being small and large compared to the width of the projected Gaussian peak, respectively. From Figures 10 to 13, it is observed that the square roots of the right-hand

members of Eqs. (73), (74), (80), and (81) are accurate approximations of $\sigma_{\beta_x}$, $\sigma_{\beta_y}$, $\sigma_{\beta_z}$, and $\sigma_d$.

Figure 12. The exactly calculated lower bound on the standard deviation of the distance and its approximation, given by the square root of the right-hand member of Eq. (80), as a function of the angle $\phi$ between the rotation axis and the axis connecting the two components of the object. The distance is equal to 2.

Figure 13. The exactly calculated lower bound on the standard deviation of the distance and its approximation, given by the square root of the right-hand member of Eq. (81), as a function of the angle $\phi$ between the rotation axis and the axis connecting the two components of the object. The width of the projected Gaussian peaks is equal to 10 and the distance is equal to 50.

Next, some remarks are due. It should be mentioned that in the derivation of the approximations of the CRLB, the difference $\Delta\theta$ between successive tilt

angles has been assumed to be small compared to the full angular tilt range $(-\pi/2, \pi/2)$, or in other words, the total number of projections has been assumed to be large, which is rather unrealistic. However, in the comparisons presented in Figures 10 to 13, the exactly calculated lower bounds on the standard deviation follow from the assumption that there are only 20 available projections. This shows that the approximations are useful, even for a limited number of projections. Additionally, Figure 14 shows the exactly calculated lower bound on the standard deviation $\sigma_d$ of the distance as a function of the total number of projections, with all other parameters kept constant. It is seen that $\sigma_d$ converges quickly to a constant with increasing number of projections. This means that the precision does not improve beyond a certain number of projections. The reason for this is that the number of electrons per projection decreases with increasing number of projections, since the total number of electrons has been kept constant. Therefore, the pixel SNR decreases with increasing number of projections.

Furthermore, in the derivation of the approximations, a full angular tilt range, that is, the interval $(-\pi/2, \pi/2)$, has been assumed, which is also unrealistic. Therefore, Figure 15 shows the exactly calculated $\sigma_d$, following from a limited angular tilt range, that is, the interval $(-\pi/3, \pi/3)$, and the approximations as a function of the distance between the two components. Although the approximations start to deviate from the exactly calculated $\sigma_d$, they are still useful as a rule of thumb since they describe the behaviour of $\sigma_d$ well.

Figure 14. The exactly calculated lower bound on the standard deviation of the distance as a function of the number of projections $J$, with the number of electrons in each projected Gaussian peak $N_p/J$.
The width of this peak is equal to 10 and the distance between the two components is equal to 40.
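The approximations that these figures compare with the exact bounds are simple enough to evaluate directly. The sketch below implements the rules of thumb of Eqs. (80)-(83) as they are read here: the signs in $V(\phi)$ are an assumption, chosen so that $V(0) = 1$ and the two-dimensional rule is recovered for components on the rotation axis. The function names and the value of `n_p` are hypothetical.

```python
import math


def v_factor(phi):
    """V(phi) of Eq. (82); signs assumed such that V(0) = 1."""
    c2 = math.cos(phi) ** 2
    return 4.0 * (3.0 * c2**2 - 3.0 * c2 - 2.0) / (c2**2 - 6.0 * c2 - 3.0)


def w_factor(phi):
    """W(phi) of Eq. (83)."""
    return 1.0 + math.sin(phi) ** 2


def crossover_distance(rho, phi):
    """Distance at which Eqs. (80) and (81) give the same value."""
    return math.sqrt(2.0 * v_factor(phi) / w_factor(phi)) * rho


def sigma_d_rule_of_thumb(d, rho, n_p, phi):
    """Approximate lower bound on the standard deviation of the distance."""
    if d <= crossover_distance(rho, phi):
        variance = 4.0 * rho**4 * v_factor(phi) / (n_p * d**2)  # Eq. (80)
    else:
        variance = 2.0 * rho**2 * w_factor(phi) / n_p           # Eq. (81)
    return math.sqrt(variance)


# Width 10 and distance 40 as in Figure 14; n_p is a hypothetical dose.
rho, d, n_p, phi = 10.0, 40.0, 1000.0, math.pi / 2
print(crossover_distance(rho, phi))
print(sigma_d_rule_of_thumb(d, rho, n_p, phi))
```

For these values the distance lies above the crossover, so the large-distance branch applies and the bound no longer depends on $d$.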

Figure 15. The exactly calculated lower bound on the standard deviation of the distance, assuming a limited angular tilt range, that is, the interval $(-\pi/3, \pi/3)$, and its approximations, given by the square roots of the right-hand members of Eqs. (80) and (81), as a function of the distance between the two components.

Finally, it is examined by means of simulation experiments whether the maximum likelihood estimator attains the CRLB on the variance of the position coordinates of an isolated component and whether it is unbiased. The significance of this has been made clear earlier in Section III.D.1. The three-dimensional observations made at the component are modelled using the parametric statistical model described in Section III.B. The width of the projected Gaussian peaks is equal to 10. From 600 different simulation experiments, the position coordinates are estimated using the maximum likelihood estimator. The mean and variance of these estimates are computed and compared to the true value of the position coordinate and the lower bound on the variance, respectively. The lower bound on the variance is computed by substituting the true values of the parameters into the expression for the CRLB. The results are presented in Table 6. From the comparison of these results, it follows that it may not be concluded that the maximum likelihood estimator is biased or that it does not attain the CRLB.

Additionally, a remark on maximum likelihood estimation has to be made. Maximum likelihood estimates are given by the values that maximize the log-likelihood function, as shown in Section II.D. However, in order to avoid ending up at a local maximum instead of at the global maximum of the log-likelihood function, it is important to have good starting values for the position coordinates of the components, as already mentioned in Section I. For that purpose, a three-dimensional reconstruction could be

useful. It may be obtained by combining the projected images using the so-called weighted back-projection method (Frank, 1992).

TABLE 6
Comparison of True Position Coordinates and Lower Bounds on the Variance with Estimated Means and Variances of 600 Maximum Likelihood Estimates of the Position Coordinates, Respectively

| True position coordinate | Estimated mean | Standard deviation of mean |
| $\beta_x$: 0 | $\beta_x$: | |
| $\beta_y$: 0 | $\beta_y$: | |
| $\beta_z$: 0 | $\beta_z$: | |

| Lower bound on variance | Estimated variance | Standard deviation of variance |
| $\sigma^2_{\beta_x}$: | $\sigma^2_{\beta_x}$: | |
| $\sigma^2_{\beta_y}$: | $\sigma^2_{\beta_y}$: | |
| $\sigma^2_{\beta_z}$: | $\sigma^2_{\beta_z}$: | |

The numbers of the last column represent the estimated standard deviation of the variable of the previous column.

E. Conclusions

The attainable precision with which position and distance parameters of one or two components can be estimated is computed for simulations of high-resolution CTEM, STEM, and electron tomography experiments, all described by simplified models. Usually, the performance of such atomic resolution TEM experiments is discussed in terms of two-point resolution, expressing the possibility of perceiving separately the components of a two-point image. Although such resolution based criteria are suitable to set up qualitative atomic resolution TEM experiments, a precision based optimality criterion is needed in the framework of quantitative atomic resolution TEM. Then, an obvious alternative to two-point resolution is the attainable precision with which position or distance parameters can be measured. In the simulation experiments, the observations were assumed to be electron counting results made at Gaussian peaks with unknown position. Under this assumption, the CRLB, which is usually calculated numerically, is given by a simple rule of thumb in closed analytical form.
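The relation between the numerically calculated CRLB and the closed-form rule of thumb can be illustrated for a single Gaussian peak. The sketch below is an assumption-laden illustration rather than the chapter's model. It takes the expected count in a pixel to be the Gaussian integrated over that pixel, assumes Poisson statistics, and places the peak at the center of the field of view. Because the two-dimensional Gaussian separates in x and y, and the peak lies almost entirely inside the field of view, the Fisher information on $\beta_x$ reduces to a sum over pixel columns. The function names and parameter values are hypothetical.

```python
import math


def cdf(t):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def pdf(t):
    """Standard normal probability density function."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def crlb_position(n_p, rho, pixel, half_fov):
    """Numerical CRLB on the variance of beta_x for a peak centred at 0."""
    n_pix = int(round(2 * half_fov / pixel))
    edges = [-half_fov + k * pixel for k in range(n_pix + 1)]
    fisher = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        p = cdf(hi / rho) - cdf(lo / rho)           # expected fraction in column
        dp = (pdf(lo / rho) - pdf(hi / rho)) / rho  # derivative w.r.t. beta_x
        if p > 0.0:
            fisher += n_p * dp * dp / p             # Poisson Fisher information
    return 1.0 / fisher


n_p, rho = 1000.0, 5.0                # hypothetical dose and peak width
rule = rho**2 / n_p                   # closed-form rule of thumb
fine = crlb_position(n_p, rho, 0.5, 6 * rho)    # pixels small compared to rho
coarse = crlb_position(n_p, rho, 5.0, 6 * rho)  # pixels as wide as the peak
print(rule, fine, coarse)
```

In this sketch the numerical bound approaches the rule of thumb when the pixels are small compared to the peak width and grows when they are not, which is the validity condition stated for the approximations.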
Although the expectation models of images obtained in practice are usually of a higher complexity, the rules of thumb are suitable to give insight into statistical experimental design for quantitative atomic resolution TEM. The

rules of thumb show how the attainable precision depends on the width of the point spread function, the width of the components, the number of detected electrons, and the distance between the components. For electron tomography experiments in particular, it is a function of the orientation of the components with respect to the rotation axis as well. Generally, the precision improves by increasing the number of detected counts or by narrowing the point spread function. However, below a certain width of the point spread function, the precision is limited by the intrinsic width of the components. Then, further narrowing of the point spread function is useless. Moreover, if a narrower point spread function results in a decrease of the number of detected electrons, both effects have to be weighed against each other under the existing physical constraints. In the following sections, the optimal statistical experimental designs of CTEM and STEM experiments, assuming more realistic expectation models than Gaussian peaks, will be derived by computing the CRLB numerically. It will be shown that these numerical results may be interpreted by means of the rules of thumb obtained in this section.

IV. Optimal Statistical Experimental Design of Conventional Transmission Electron Microscopy

A. Introduction

Optimal statistical experimental designs of CTEM experiments will be described. As mentioned in Section I, the future of such experiments is quantitative structure determination. Unknown structure parameters, atom column positions in particular, are quantitatively estimated from the observations. Quantitative structure determination should be done as precisely as possible. A precision of the atom column positions of the order of 0.01 to 0.1 Å is needed (Kisielowski, Principe, Freitag and Hubert, 2001; Muller, 1998, 1999). Precise measurements will allow materials scientists to draw reliable conclusions from the experiment.
Such measurements may be used for comparison with or as an input for theoretical first-principles calculations in order to get a deeper understanding of the properties-structure relation. Hence, the experimental design of CTEM experiments should be evaluated and optimized in terms of precision. As shown in Section II, the obvious optimality criterion is the attainable precision, that is, the CRLB, with which the atom column positions can be estimated. The attainable precision should replace widely used performance criteria of an electron microscope, which express the

possibility to perceive separately two atom columns in an image. Although these criteria are suitable to set up qualitative CTEM experiments, the attainable precision is needed as a criterion in the framework of quantitative CTEM experiments. In Section III, the attainable precision has been derived in closed analytical form for atomic resolution transmission electron microscopy experiments using simplified models. In this section, the attainable precision will be derived for more complicated, physics based CTEM models and the obtained expression will be used to evaluate and optimize the experimental design.

Figure 16. Scheme of a CTEM experiment.

To begin with, it will be described how CTEM observations are collected. A scheme is shown in Figure 16. The object under study is illuminated by a parallel incident electron beam. As a result of the electron-object interaction, the so-called exit wave, which is a complex electron wave function at the exit plane of the object, is formed. A one-to-one correspondence between the exit wave and the projected object structure is established if the object is oriented along a main zone axis and if the distance between adjacent atom columns is not too small. Next, a magnified image of the exit wave is formed by a set of lenses of which the objective lens is the most important one. The formation of this image may be described in two steps. First, the so-called image wave, which is a complex electron wave function at the image plane, is formed. Since the objective lens is not perfect, the image wave is influenced by lens aberrations such as spherical

aberration, defocus, and chromatic aberration. Second, the image intensity distribution, given by the modulus square of the image wave, is recorded. As a recording device, a CCD camera may be chosen. Therefore, CTEM observations may be considered to be electron counting results collected at the pixels of a CCD camera.

Widely used performance criteria of CTEM experiments are the point resolution and the information limit of the electron microscope. The point resolution $\rho_s$ represents the smallest detail that may be interpreted directly from the image provided that the object is thin and that the defocus is adjusted to the so-called Scherzer defocus (Scherzer, 1949). The point resolution depends only on the spherical aberration constant $C_s$ and the electron wavelength $\lambda$, according to the formula $\rho_s = 0.66\,(C_s \lambda^3)^{1/4}$ (Spence, 1988). The information limit $\rho_i$ represents the smallest detail that is present in the image and that may be resolved by image processing techniques such as off-axis holography (Lichte, 1991) and the focal-series reconstruction method (Coene, Thust, Op de Beeck, and van Dyck, 1996; Kirkland, 1984; Saxton, 1978; Schiske, 1973; Thust, Coene, Op de Beeck, and van Dyck, 1996; van Dyck and Coene, 1987; van Dyck, Op de Beeck and Coene, 1993). Both techniques retrieve the exit wave, which ideally is free from any lens aberration. The information limit is inversely proportional to the highest spatial frequency that is still transferred with enough intensity from the exit plane of the object to the image plane (de Jong and van Dyck, 1993; O'Keefe, 1992). Usually, the information limit is smaller than the point resolution in intermediate voltage electron microscopy. The information limit is mainly determined by spatial incoherence and temporal incoherence. Spatial incoherence is due to beam convergence, which is caused by the fact that the illuminating beam is not parallel but may be considered as a cone of incoherent plane waves.
Temporal incoherence is due to chromatic aberration, which results from a spread in defocus values, arising from fluctuations in accelerating voltage, lens current, and thermal energy of the electron, where the thermal energy fluctuation is often the dominating term. Chromatic aberration will mostly be the dominant factor governing the information limit (de Jong and van Dyck, 1993). The information limit due to chromatic aberration is defined as $\rho_i = (\pi \lambda \Delta / 2)^{1/2}$, with $\Delta$ the defocus spread, expressed in terms of the standard deviation (Spence, 1988).

Over the years, different methods have been developed to improve the point resolution or the information limit. Existing methods to improve the point resolution are, for example, high-voltage electron microscopy and correction of the spherical aberration. High-voltage electron microscopy is based on the principle that an increase of the accelerating voltage is accompanied by a decrease of the electron wavelength and a corresponding improvement of the point resolution (Phillipp, Höschen, Osaki, Möbus,

and Rühle, 1994). Spherical aberration is a lens defect that, like other aberrations, causes a point object to be imaged as a disk of finite size. By using a combination of magnetic quadrupole and octopole lenses, spherical aberration may be cancelled out (Rose, 1990; Scherzer, 1949). This improves the point resolution. One of the advantages of the spherical aberration corrector is that structure-imaging artifacts due to contrast delocalization may to a great extent be avoided (Haider, Uhlemann, Schwan, Rose, Kabius, and Urban, 1998). Existing methods to improve the information limit are based on correction of chromatic aberration by use of either a chromatic aberration corrector (Reimer, 1984; Weißbäcker and Rose, 2001, 2002) or a monochromator (Mook and Kruit, 1999). The chromatic aberration corrector is still at the conceptual stage. The monochromator is already used in practice and eliminates all electrons having energies outside a prespecified energy range.

The methods presented nowadays result in a resolution of about 1 Å, which is sufficient to visualize the individual atom columns of materials with columnar structures, viewed along a main zone axis. In fact, the methods developed to improve the point resolution or the information limit are advantageous for qualitative high-resolution CTEM. However, the future of CTEM experiments is quantitative, instead of qualitative, structure determination. The structure parameters, the atom column positions in particular, are quantitatively estimated from the electron microscopical observations, instead of visually determined. Hence, the obvious optimality criterion to be used to evaluate the experimental design of CTEM experiments is the attainable precision, that is, the CRLB with which these structure parameters can be estimated, and not so much the point resolution or the information limit.
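As a rough numerical illustration of the two classical criteria discussed above, the following Python sketch evaluates the point resolution and the chromatic information limit. It assumes the relativistic energy-wavelength relation given as Eq. (85) later in this section, a defocus spread determined by the intrinsic energy spread only (Eq. (103)), and the example values $E_0 = 300$ keV, $C_s = 0.5$ mm, $C_c = 1.3$ mm, and $\Delta E = 0.75$ eV. The function and variable names are illustrative and not part of the original text.

```python
import math

HC_KEV_ANGSTROM = 12.398   # h*c in keV*Angstrom
REST_ENERGY_KEV = 511.0    # electron rest energy m0*c^2 in keV
ANGSTROM_PER_MM = 1.0e7


def wavelength(e0_kev):
    """Relativistic electron wavelength (Angstrom) for an energy in keV."""
    return HC_KEV_ANGSTROM / math.sqrt(e0_kev * (2.0 * REST_ENERGY_KEV + e0_kev))


def point_resolution(cs, lam):
    """Point resolution 0.66 * (Cs * lambda^3)^(1/4); all lengths in Angstrom."""
    return 0.66 * (cs * lam**3) ** 0.25


def information_limit(defocus_spread, lam):
    """Chromatic information limit (pi * lambda * Delta / 2)^(1/2) in Angstrom."""
    return math.sqrt(math.pi * lam * defocus_spread / 2.0)


# Example values (assumed for illustration).
e0_kev = 300.0
lam = wavelength(e0_kev)
cs = 0.5 * ANGSTROM_PER_MM
cc = 1.3 * ANGSTROM_PER_MM
delta_e_kev = 0.75e-3
defocus_spread = cc * delta_e_kev / e0_kev   # Delta = Cc * dE / E0

rho_s = point_resolution(cs, lam)
rho_i = information_limit(defocus_spread, lam)
print(f"lambda = {lam:.5f} A, rho_s = {rho_s:.2f} A, rho_i = {rho_i:.2f} A")
```

With these inputs the point resolution comes out at roughly 1.6 Å and the information limit at roughly 1.0 Å, in line with the statement that the information limit is usually smaller than the point resolution.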
In this section, optimal statistical experimental designs of CTEM experiments will be computed in terms of the experimental settings producing the highest attainable precision. These designs will be obtained using the principles of statistical experimental design as explained in Section II. The section is organized as follows. In Section IV.B, a parametric statistical model of the observations will be derived. This model describes the expectations of the observations as well as the fluctuations of the observations about these expectations. Next, in Section IV.C, it will be shown how the CRLB on the variance of the atom column position estimates may be deduced from this model. Afterward, an adequate optimality criterion, which is a function of the elements of the CRLB, will be given. This criterion is then used to evaluate and optimize the experimental design. Special attention is paid to the dependence of the optimality criterion on the use of a spherical aberration corrector, a chromatic aberration corrector, and a monochromator. In Section IV.D, conclusions are drawn.

Part of the results of this section concerning the use of a monochromator has been published earlier in den Dekker, van Aert, van Dyck, van den Bos, and Geuens (2000) and den Dekker, van Aert, van Dyck, van den Bos, and Geuens (2001).

B. Parametric Statistical Model of Observations

In order to derive the optimal statistical experimental design, a parametric statistical model of the CTEM observations is needed. This model, which contains microscope settings such as defocus, spherical aberration constant, chromatic aberration constant, and defocus spread, as well as structure parameters such as the atom column positions and the object thickness, will be derived in this section. In this derivation, two basic approximations will be made. The first approximation is the use of the simplified channelling theory to describe the dynamical scattering of the electrons on their way through the object (Geuens and van Dyck, 2002; van Dyck and Op de Beeck, 1996). Secondly, partial spatial and temporal coherence will be incorporated by representing the microscope's transfer function as a product of the corresponding coherent transfer function and two envelope functions (Fejes, 1977; Frank, 1973). The image calculation is then treated as a simple Fourier optics scheme. This approach is nowadays called the quasi-coherent approximation (Coene and van Dyck, 1988). Admittedly, the approximations made are of a limited validity. However, they are very useful for a compact analytical model-based derivation of the optimal statistical experimental design of quantitative CTEM experiments as well as for explaining the basic principles governing the obtained results. The principal results obtained are independent of the approximations made. Moreover, it should be noticed that the image magnification will be ignored, without loss of generality.

1. The Exit Wave

The first important step in the derivation of the parametric statistical model of the observations is to obtain an expression for the exit wave $\psi(\mathbf{r}, z)$. This is a complex wave function in the plane at the exit face of the object, resulting from the interaction of the electron beam with the object. Use will be made of the simplified channelling theory. At this stage, structure parameters will enter the model. High-resolution CTEM images often show a one-to-one correspondence with the projected object structure if the incident electron beam propagates along a main zone axis. This happens for instance in ordered alloys with columnar structures provided that the point resolution of the microscope is sufficient and the distance between adjacent columns is not too small

(van Tendeloo and Amelinckx, 1978; van Tendeloo and Amelinckx, 1982). From this, it has been suggested that for materials oriented along a main zone axis and with sufficient separation between the columns, the exit wave mainly depends on the projected structure, that is, on the type of atom columns. The physical reason behind this is that the atoms are superimposed along an atom column in this orientation. Then, it can be shown that the electrons are trapped in the positive electrostatic potential of the atoms. Because of this, each atom column acts as a guide or a channel within which the electron scatters dynamically without leaving the column (van Dyck, 2002). This channelling effect is schematically represented in Figure 17.

Figure 17. Schematic representation of electron channelling.

In the simplified channelling theory, applicable if the incident electron beam propagates along a main zone axis, an expression for the exit wave is given by (van Dyck and Op de Beeck, 1996):

$$\psi(\mathbf{r}, z) = 1 + \sum_{n=1}^{n_c} c_n \phi_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n) \left[ \exp\left( -i\pi \frac{E_{1s,n}}{E_0} \frac{1}{\lambda} z \right) - 1 \right], \tag{84}$$

where $\mathbf{r} = (x\ y)^T$ is a two-dimensional vector in the plane at the exit face of the object, perpendicular to the incident beam direction, $z$ is the object thickness, $E_0$ is the incident electron energy, and $\lambda$ is the electron wavelength. The incident electron energy and the electron wavelength are related (Kirkland, 1998):

$$\lambda = \frac{hc}{\sqrt{E_0 (2 m_0 c^2 + E_0)}} \tag{85}$$

with $h$ Planck's constant, $m_0$ the electron rest mass, and $c$ the velocity of light, so that $hc = 12.398$ keV Å and $m_0 c^2 = 511$ keV. It should be mentioned that

the accelerating voltage is equal to $E_0/e$, where $e = 1.6 \times 10^{-19}$ C is the electron charge. The summation in Eq. (84) is over $n_c$ atom columns. The function $\phi_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n)$ is the lowest energy bound state of the $n$th atom column located at position $\boldsymbol{\beta}_n = (\beta_{x_n}\ \beta_{y_n})^T$ and $E_{1s,n}$ is its energy. The lowest energy bound state is a real-valued, centrally peaked, radially symmetric function, which is a two-dimensional analogue of the 1s-state of an atom. Following van Dyck and Op de Beeck (1996), it has been assumed that the dynamical motion of the electron in a column may be expressed primarily in terms of this tightly bound 1s-state. The other states are not neglected, but for thin objects they will not build up and are incorporated in the term 1 in Eq. (84), which describes the unscattered incident electron wave. The authors are well aware of the fact that for heavy atom columns, where higher order states start to play a more prominent role (Kambe, Lehmpfuhl, and Fujimoto, 1974), Eq. (84) becomes a less accurate description of the exit wave (van Dyck and Op de Beeck, 1996). The excitation coefficients $c_n$ may be found from (van Dyck and Op de Beeck, 1996):

$$c_n = \int \phi^*_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n)\, \psi(\mathbf{r}, 0)\, d\mathbf{r}, \tag{86}$$

where the symbol * denotes the complex conjugate. For plane wave incidence, i.e., $\psi(\mathbf{r}, 0) = 1$, one thus has:

$$c_n = \int \phi^*_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n)\, d\mathbf{r}. \tag{87}$$

Following Geuens, Chen, den Dekker, and van Dyck (1999) and Geuens and van Dyck (2002), the 1s-state function may be approximated by a single, quadratically normalized, parameterized Gaussian function

$$\phi_{1s,n}(\mathbf{r}) = \frac{1}{\sqrt{2\pi}\, a_n} \exp\left( -\frac{r^2}{4 a_n^2} \right), \tag{88}$$

where $r$ is the Euclidean norm of the two-dimensional vector $\mathbf{r}$, that is, $r = |\mathbf{r}|$, and $a_n$ represents the column dependent width. This width is directly related to the energy of the 1s-state. Then, it follows from Eqs. (87) and (88) that

$$c_n = 2\sqrt{2\pi}\, a_n. \tag{89}$$

The two-dimensional Fourier transform $F_{1s,n}(\mathbf{g})$ of Eq. (88), which will be needed in the remainder of this section, is given by:

$$F_{1s,n}(\mathbf{g}) = 2\sqrt{2\pi}\, a_n \exp\left( -4\pi^2 a_n^2 g^2 \right) \tag{90}$$

with $g$ being the Euclidean norm of the two-dimensional spatial frequency vector $\mathbf{g}$ in reciprocal space, that is, $g = |\mathbf{g}|$. Throughout this article, the

two-dimensional Fourier transform $H(\mathbf{g})$ of an arbitrary function $h(\mathbf{r})$ is defined as

$$H(\mathbf{g}) = \mathcal{F}_{\mathbf{r} \to \mathbf{g}}\, h(\mathbf{r}) = \int h(\mathbf{r}) \exp(i 2\pi \mathbf{g} \cdot \mathbf{r})\, d\mathbf{r}, \tag{91}$$

where the symbol $\cdot$ denotes the scalar product. Consequently, the inverse Fourier transform is defined as:

$$h(\mathbf{r}) = \mathcal{F}^{-1}_{\mathbf{g} \to \mathbf{r}}\, H(\mathbf{g}) = \int H(\mathbf{g}) \exp(-i 2\pi \mathbf{g} \cdot \mathbf{r})\, d\mathbf{g}. \tag{92}$$

2. The Image Wave

In the second step of the derivation of the parametric statistical model of the observations, an expression for the image wave $\psi_i(\mathbf{r}, z)$ is obtained. This is a complex electron wave function at the image plane. At this stage, most microscope settings will enter the model. The image wave is written as the convolution product of the exit wave with the point spread function $t(\mathbf{r})$ of the electron microscope (van Dyck, 2002):

$$\psi_i(\mathbf{r}, z) = \psi(\mathbf{r}, z) * t(\mathbf{r}). \tag{93}$$

The two-dimensional Fourier transform of $t(\mathbf{r})$ represents the microscope's transfer function $T(\mathbf{g})$. Following van Dyck (2002), $T(\mathbf{g})$ is radially symmetric and described as:

$$T(\mathbf{g}) = T(g) = A(g)\, D_s(g)\, D_t(g) \exp(-i\chi(g)), \tag{94}$$

where $A(g)$ is a circular aperture function, given by:

$$A(g) = \begin{cases} 1 & \text{if } g \le g_{ap} \\ 0 & \text{if } g > g_{ap} \end{cases} \tag{95}$$

with $g_{ap}$ the objective aperture radius. Notice that the objective aperture semi-angle $\alpha_o$ is equal to $g_{ap} \lambda$. In what follows, it will be assumed that there is no objective aperture so that $A(g)$ is constant and equal to 1. The phase shift $\chi(g)$, resulting from the objective lens aberrations, is radially symmetric and given by:

$$\chi(g) = \pi \varepsilon \lambda g^2 + \frac{1}{2} \pi C_s \lambda^3 g^4 \tag{96}$$

with $\varepsilon$ being the defocus. Notice that higher order aberration effects such as 2-fold astigmatism, 3-fold astigmatism, and axial coma, have been neglected. They could be included in the phase shift as well (Thust, Overwijk, Coene,

and Lentzen, 1996). In the quasi-coherent approximation, the effects of partial spatial and temporal coherence are incorporated by the damping envelope functions $D_s(g)$ and $D_t(g)$, respectively. For a Gaussian incoherent effective electron source, the function $D_s(g)$ is described as (Frank, 1973; Spence, 1988):

$$D_s(g) = \exp\left( -\frac{\alpha_c^2}{\ln 2} \left( \pi C_s \lambda^2 g^3 + \pi \varepsilon g \right)^2 \right), \tag{97}$$

where $\alpha_c$ is the semi-angle of beam convergence. For a Gaussian spread of defocus, the function $D_t(g)$ is described as (Fejes, 1977):

$$D_t(g) = \exp\left( -\frac{\pi^2 \lambda^2 \Delta^2 g^4}{2} \right), \tag{98}$$

where $\Delta$ is the defocus spread due to chromatic aberration, which is given by (O'Keefe, 1992; Spence, 1988):

$$\Delta = C_c \sqrt{ 4 \left( \frac{\Delta I}{I_0} \right)^2 + \left( \frac{\Delta V}{V_0} \right)^2 + \left( \frac{\Delta E}{E_0} \right)^2 }. \tag{99}$$

Notice that the defocus spread $\Delta$, which is here defined as the standard deviation, corresponds to a half width at $1/e$ height equal to $\sqrt{2} \Delta$. In Eq. (99), $C_c$ is the chromatic aberration coefficient, $\Delta V$ and $\Delta I$ are the standard deviations of the statistically independent fluctuations of the accelerating voltage $V_0$ and objective lens current $I_0$, respectively, while $\Delta E$ is the intrinsic energy spread, that is, the standard deviation of the statistically independent fluctuations of the incident electron energy $E_0$ of the electrons in the electron source, defined as:

$$\Delta E = \left( \int_{-\infty}^{\infty} (E - E_0)^2\, p(E)\, dE \right)^{1/2}, \tag{100}$$

where $p(E)$ is the energy probability density function. It is usually assumed that $p(E)$ is well approximated by a Gaussian function:

$$p(E) = \frac{1}{\Delta E \sqrt{2\pi}} \exp\left( -\frac{(E - E_0)^2}{2 (\Delta E)^2} \right) \tag{101}$$

with expectation value $E_0$ and standard deviation $\Delta E$. Straightforward calculations show that the relationship between the standard deviation $\Delta E$ and the full width at half maximum height of the energy distribution described by Eq. (101) is given by:

$$\mathrm{FWHM} = \sqrt{8 \ln 2}\, \Delta E \approx 2.35\, \Delta E. \tag{102}$$

In the following, it is assumed that $\Delta V / V_0$ and $\Delta I / I_0$ are small in comparison to $\Delta E / E_0$, so that they may be neglected and Eq. (99) reduces to:

$$\Delta = C_c \frac{\Delta E}{E_0}. \tag{103}$$

Notice that the quasi-coherent approximation used is only of a limited validity and is certainly not the state of the art to treat partial coherence. According to the work of Frank (1973), this approximation is only valid for a small effective source and a central unscattered beam much stronger than any other (Spence, 1988). A more correct analytical treatment may be achieved via autocorrelations in Fourier space, incorporating the microscope properties in the form of a transmission-cross-coefficient (Born and Wolf, 1999; Frank, 1973; Ishizuka, 1980). However, such a treatment would complicate the derivation of the optimal statistical experimental design and the explanation of the basic principles governing the obtained results severely and unnecessarily. Moreover, it should be mentioned that the analysis via transmission-cross-coefficients is also not perfect, since it does not take the influence of beam convergence and defocus spread on the scattering of the electrons with the object into account (van Dyck, 2002).

3. The Image Intensity Distribution

Next, an expression for the image intensity distribution $I(\mathbf{r})$ will be derived. This is given by the modulus square of the image wave. Hence, it follows from Eqs. (84) and (93) that

$$I(\mathbf{r}) = |\psi_i(\mathbf{r}, z)|^2 = \left| 1 + \sum_{n=1}^{n_c} c_n \left[ \exp\left( -i\pi \frac{E_{1s,n}}{E_0} \frac{1}{\lambda} z \right) - 1 \right] \phi_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n) * t(\mathbf{r}) \right|^2, \tag{104}$$

where it is taken into account that $1 * t(\mathbf{r})$ is equal to 1. Furthermore,

$$\phi_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n) * t(\mathbf{r}) \tag{105}$$

represents the 1s-state function convoluted with the microscope's point spread function, which is equal to

$$2\pi \int_0^{\infty} F_{1s,n}(g)\, T(g)\, J_0(2\pi g |\mathbf{r} - \boldsymbol{\beta}_n|)\, g\, dg \tag{106}$$

since $\phi_{1s,n}(\mathbf{r})$ and $t(\mathbf{r})$ are both radially symmetric functions. In Eq. (106), $J_0(\cdot)$
is the zeroth-order Bessel function of the first kind.
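The radial integral of Eq. (106) can be evaluated numerically without much effort. The sketch below, which is only one possible implementation, builds the quasi-coherent transfer function from the phase shift and the two damping envelopes (with no objective aperture) and approximates the integral by a plain Riemann sum. The column width of 0.3 Å, the frequency cut-off `g_max`, the grid size, and the zero-defocus example are assumptions chosen for illustration; all lengths are in Å.

```python
import numpy as np
from scipy.special import j0


def transfer_function(g, lam, defocus, cs, alpha_c, delta):
    """Quasi-coherent transfer function T(g) without objective aperture."""
    chi = np.pi * defocus * lam * g**2 + 0.5 * np.pi * cs * lam**3 * g**4
    d_s = np.exp(-(alpha_c**2 / np.log(2.0))
                 * (np.pi * cs * lam**2 * g**3 + np.pi * defocus * g) ** 2)
    d_t = np.exp(-0.5 * np.pi**2 * lam**2 * delta**2 * g**4)
    return d_s * d_t * np.exp(-1j * chi)


def f_1s(g, a_n):
    """Fourier transform of the Gaussian 1s-state of width a_n."""
    return 2.0 * np.sqrt(2.0 * np.pi) * a_n * np.exp(-4.0 * np.pi**2 * a_n**2 * g**2)


def convolved_1s_state(r, a_n, lam, defocus, cs, alpha_c, delta,
                       g_max=10.0, n_g=20000):
    """Riemann-sum approximation of the radial integral at distances r."""
    g = np.linspace(0.0, g_max, n_g)
    dg = g[1] - g[0]
    weight = f_1s(g, a_n) * transfer_function(g, lam, defocus, cs, alpha_c, delta) * g
    r = np.atleast_1d(np.asarray(r, dtype=float))
    kernel = j0(2.0 * np.pi * np.outer(r, g))   # shape (len(r), n_g)
    return 2.0 * np.pi * (kernel * weight).sum(axis=1) * dg


a_n = 0.3                                # hypothetical column width
r = np.array([0.0, 0.3, 0.6])
ideal = np.exp(-r**2 / (4.0 * a_n**2)) / (np.sqrt(2.0 * np.pi) * a_n)

# Sanity check: a perfect lens (T = 1) must return the 1s-state itself.
perfect = convolved_1s_state(r, a_n, lam=0.0197, defocus=0.0, cs=0.0,
                             alpha_c=0.0, delta=0.0)

# Aberrated example: Cs = 0.5 mm, Delta = 32.5 A, alpha_c = 1e-4 rad, zero defocus.
blurred = convolved_1s_state(r, a_n, lam=0.0197, defocus=0.0, cs=0.5e7,
                             alpha_c=1.0e-4, delta=32.5)
```

The aberration-free call serves as a consistency check between the Gaussian 1s-state and its Fourier transform. In the aberrated case the result is complex, and its modulus at the column position cannot exceed the ideal peak value because the transfer function has modulus at most 1.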

Furthermore, notice that it can be seen from Eq. (104) that for identical atom columns, the contrast varies periodically with thickness, where the periodicity is given by (van Dyck and Chen, 1999a):

$$D_{1s} = \frac{2 E_0 \lambda}{E_{1s,n}}, \tag{107}$$

which is called the extinction distance. This periodic oscillation is due to dynamical effects, which have been included in the model via the channelling approximation. Generally, the extinction distance will be different for different types of atom columns.

4. The Image Recording

Next, the expectation model, describing the expected number of electrons recorded by the detector, will be derived. As a recording device, a CCD camera is chosen, consisting of $K \times L$ equidistant pixels of area $\Delta x \times \Delta y$, where $\Delta x$ and $\Delta y$ are the sampling distances in the x- and y-direction, respectively. Pixel $(k, l)$ corresponds to position $(x_k\ y_l)^T \equiv (x_1 + (k-1)\Delta x\ \ y_1 + (l-1)\Delta y)^T$ of the recorded image, with $k = 1, \ldots, K$ and $l = 1, \ldots, L$, and $(x_1\ y_1)^T$ represents the position of the pixel in the bottom left corner of the field of view (FOV). The FOV is centered about $(0\ 0)^T$. It is chosen sufficiently large so as to guarantee that the tails of the microscope's point spread function $t(\mathbf{r})$ are collected. Furthermore, it is assumed that the quantum efficiency of the CCD camera is sufficiently high to detect single electrons. The probability $p_{kl}$ that an electron hits a pixel $(k, l)$ is then approximately given by

$$p_{kl} = \frac{I(\mathbf{r}_{kl})}{I_{norm}} \Delta x \Delta y \tag{108}$$

with $I(\mathbf{r})$ given by Eq. (104), $\mathbf{r}_{kl} = (x_k\ y_l)^T$, and $I_{norm}$ a normalization factor given by:

$$I_{norm} = \int I(\mathbf{r})\, d\mathbf{r}, \tag{109}$$

where the integral extends over the whole FOV. This means that for a given total number of detected electrons $N$, the number of electrons expected to be found at pixel $(k, l)$ is equal to:

$$\lambda_{kl} = N p_{kl}. \tag{110}$$

This result defines the expectations of the observations $w_{kl}$ recorded by the detector and is hence called the expectation model. The total number of detected electrons $N$ is equal to the total number of incident electrons, that

is, the number of electrons that interact with the object, since it has been assumed that there is no objective aperture. In the presence of an objective aperture, part of the electrons would be lost. The total number of incident electrons depends on the reduced brightness ($B_r$) of the electron source, the incident electron energy ($E_0$), the recording time ($t$), the field of view (FOV), the semi-angle of beam convergence ($\alpha_c$), and the electron charge ($e = 1.6 \times 10^{-19}$ C), according to the formula (Spence, 1988):

$$N = \frac{B_r E_0\, t\, \mathrm{FOV}\, \pi \alpha_c^2}{e^2}. \tag{111}$$

The reduced brightness of the electron source is defined as the brightness of the electron source per accelerating voltage, whereas the brightness of the electron source describes the current density per unit solid angle of this source (Williams and Carter, 1996). In the absence of electron-electron interactions, the reduced brightness is a conserved quantity. This means that it is the same at every point on the optical axis (van Veen, Hagen, Barth, and Kruit, 2001). In what follows, the importance of this quantity on the performance of CTEM experiments will be studied.

5. The Incorporation of a Monochromator

In this section, special attention is paid to the incorporation of a monochromator into the expectation model (den Dekker, van Aert, van Dyck, van den Bos, and Geuens, 2001). Suppose that a monochromator is incorporated in the imaging system below the electron source, removing all electrons, except those whose energy lies within a prespecified energy range $[E_0 - \delta E/2,\ E_0 + \delta E/2]$. The monochromator reduces the standard deviation of the energy spread from $\Delta E$, which is defined by Eq. (100), to $\Delta E_m$, which is described by:

$$\Delta E_m = \left( \int_{E_0 - \delta E/2}^{E_0 + \delta E/2} (E - E_0)^2\, p'(E)\, dE \right)^{1/2} \tag{112}$$

with $p'(E)$ being the energy distribution of the electrons transmitted by the monochromator, which is given by:

$$p'(E) = \begin{cases} \dfrac{p(E)}{\int_{E_0 - \delta E/2}^{E_0 + \delta E/2} p(E)\, dE} & \text{if } E_0 - \dfrac{\delta E}{2} \le E \le E_0 + \dfrac{\delta E}{2} \\ 0 & \text{otherwise} \end{cases} \tag{113}$$

with $p(E)$ defined as in Eq. (101). Straightforward calculations, using Eqs. (101), (112), and (113), then show that the standard deviation defining the

energy spread of the electrons transmitted by the monochromator may be described as:

$$\Delta E_m = \Delta E \sqrt{ 1 - \frac{\delta E\, \exp\left( -\dfrac{(\delta E)^2}{8 (\Delta E)^2} \right)}{\sqrt{2\pi}\, \Delta E\ \mathrm{Erf}\left( \dfrac{\delta E}{2\sqrt{2}\, \Delta E} \right)} } \tag{114}$$

with Erf(.) being the error function. As an unfavorable side effect of the incorporation of a monochromator, the total number of incident electrons that interact with the object reduces if the recording time $t$ is kept constant. Only a fraction of the total number of electrons given by Eq. (111) will be recorded. It may be shown that the total number of detected electrons by use of a monochromator is given by:

$$N = \frac{B_r E_0\, t\, \mathrm{FOV}\, \pi \alpha_c^2}{e^2} \int_{E_0 - \delta E/2}^{E_0 + \delta E/2} p(E)\, dE = \frac{B_r E_0\, t\, \mathrm{FOV}\, \pi \alpha_c^2}{e^2}\ \mathrm{Erf}\left( \frac{\delta E}{2\sqrt{2}\, \Delta E} \right). \tag{115}$$

Hence, the expectation model by incorporating a monochromator is still given by Eq. (110), but with a reduced total number of electrons $N$ as in Eq. (115) instead of as in Eq. (111) and a reduced energy spread of the electrons as in Eq. (114) instead of as in Eq. (100).

For CTEM, the observations are electron counting results, which are supposed to be independent and Poisson distributed. Therefore, the joint probability density function of the observations $P(\omega; \beta)$, representing the parametric statistical model of the observations, is given by Eq. (10), where the total number of observations is equal to $K \times L$ and the expectation model is given by Eq. (110). The parameter vector $\beta = (\beta_{x_1} \ldots \beta_{x_{n_c}}\ \beta_{y_1} \ldots \beta_{y_{n_c}})^T$ consists of the x- and y-coordinates of the atom column positions to be estimated. In the following section, the experimental design resulting in the highest attainable precision with which the elements of the vector $\beta$ can be estimated will be derived from the joint probability density function of the observations.

C. Statistical Experimental Design

In this section, the optimal statistical experimental design of high-resolution CTEM experiments will be derived in the sense of the microscope settings resulting in the highest attainable precision with which the position coordinates of the atom columns can be estimated. Therefore, the CRLB with respect to the position coordinates will be computed from the

parametric statistical model of the observations discussed in the previous section. In Section II, this CRLB was discussed. Then, a scalar measure of this CRLB, that is, a function of the elements of the CRLB, will be chosen as optimality criterion, which will then be evaluated and optimized as a function of the microscope settings. An overview of the microscope settings will be given in Section IV.C.1. Some of them are tunable, while others are fixed properties of the electron microscope. Next, in Section IV.C.2, the results of the numerical evaluation of the dependence of the chosen optimality criterion on the microscope settings will be discussed. This will be done for both isolated and neighboring atom columns. The section is concluded by simulation experiments to find out if the maximum likelihood estimator attains the CRLB and, moreover, if it is unbiased. If so, this justifies the choice of the CRLB as optimality criterion. Finally, in Section IV.C.3, an interpretation of the numerical optimization results will be given. The object thickness, the energy of the atom columns, and the microscope settings are supposed to be known. However, the following analysis may relatively easily be extended to include the case in which these or even more parameters are unknown and hence have to be estimated simultaneously.

1. Microscope Settings

An overview of the microscope settings, which enter the parametric statistical model of the CTEM observations, is given in this section. For simplicity, some of these settings will be kept constant in the evaluation and optimization of the experimental design.
The settings describing the illuminating electron beam are the electron wavelength $\lambda$, the semi-angle of beam convergence $\alpha_c$, the standard deviation $\Delta E$ of the intrinsic energy spread of the electrons in the electron source, the reduced brightness $B_r$ of the electron source, and the width $\delta E$ of the energy selection slit (in the presence of a monochromator). The electron wavelength and the reduced brightness of the electron source are fixed properties of a given electron microscope. The effect of these settings on the precision with which atom column positions can be estimated will be studied. The semi-angle of beam convergence may be varied experimentally, but it will be held fixed and sufficiently small in the present analysis in order to guarantee that the quasi-coherent approximation made in the derivation of the expectation model is reasonable. Moreover, typical values will be chosen for the standard deviation of the intrinsic energy spread of the electrons, in agreement with electron sources used today. The width of the energy selection slit will be variable, thus resulting in a variable energy spread $\Delta E_m$ of the electrons.
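The trade-off introduced by a variable slit width can be made concrete with a short Python sketch. It assumes the Gaussian energy distribution of Eq. (101) truncated symmetrically at $E_0 \pm \delta E/2$, for which the standard deviation and the transmitted fraction have closed forms in terms of the error function. The closed form is cross-checked against a direct midpoint-rule integration of Eq. (112). Function names and the example slit widths are illustrative assumptions.

```python
import math


def transmitted_fraction(delta_e, slit_width):
    """Fraction of electrons passing a slit of full width slit_width."""
    return math.erf(slit_width / (2.0 * math.sqrt(2.0) * delta_e))


def monochromator_spread(delta_e, slit_width):
    """Standard deviation of a Gaussian truncated at +/- slit_width / 2."""
    gauss = math.exp(-slit_width**2 / (8.0 * delta_e**2))
    ratio = (slit_width * gauss
             / (math.sqrt(2.0 * math.pi) * delta_e
                * transmitted_fraction(delta_e, slit_width)))
    return delta_e * math.sqrt(max(0.0, 1.0 - ratio))


def numerical_spread(delta_e, slit_width, n=20000):
    """Midpoint-rule evaluation of the truncated second moment."""
    h = slit_width / n
    num = 0.0
    den = 0.0
    for i in range(n):
        x = -slit_width / 2.0 + (i + 0.5) * h   # energy offset E - E0
        p = math.exp(-x**2 / (2.0 * delta_e**2))
        num += x**2 * p
        den += p
    return math.sqrt(num / den)


delta_e = 0.75                      # intrinsic energy spread in eV
for slit in (0.1, 0.5, 1.0, 3.0):   # hypothetical slit widths in eV
    print(slit,
          round(monochromator_spread(delta_e, slit), 4),
          round(transmitted_fraction(delta_e, slit), 4))
```

For a very narrow slit the spread tends to $\delta E / \sqrt{12}$, the value for a uniform distribution, while the transmitted fraction tends to zero. A narrower slit therefore reduces the energy spread at the cost of the number of detected electrons.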

The microscope settings specifying the objective lens are the defocus $\varepsilon$, the spherical aberration constant $C_s$, and the chromatic aberration constant $C_c$. The defocus will be variable. For most electron microscopes, the spherical and chromatic aberration constants are fixed properties of the microscope. However, by incorporating a spherical or chromatic aberration corrector, these settings are (or will become) tunable. Therefore, it is interesting to study the effect of these settings on the precision.

The microscope settings describing the image recording are the pixel sizes $\Delta x$ and $\Delta y$, the number of pixels $K$ and $L$ in the x- and y-direction, respectively, and the recording time $t$. The pixel sizes $\Delta x$ and $\Delta y$ will be kept constant. In agreement with the results presented in Section III, it may be shown that the precision will generally improve with smaller pixel sizes, with all other settings kept constant. However, below a certain pixel size, no more improvement is gained. This has to do with the fact that the pixel signal-to-noise ratio (SNR) decreases with a decreasing pixel size. Therefore, the pixel sizes are chosen in the region where no more improvement may be gained. This is similar to what is described in Bettens, van Dyck, den Dekker, Sijbers, and van den Bos (1999), den Dekker, Sijbers and van Dyck (1999), and van Aert, den Dekker, van Dyck, and van den Bos (2002a). The number of pixels $K$ and $L$, defining the FOV for given pixel sizes $\Delta x$ and $\Delta y$, will be chosen fixed, but large enough so as to guarantee that the tails of the microscope's transfer function are collected in the FOV.

2. Numerical Results

In this section, the results of the numerical evaluation of the dependence of the attainable precision, that is, the CRLB, on the microscope settings will be studied. This section is divided into four parts.
First, general comments, which should be kept in mind during the reading of this section, will be given, including an overview of the original, non-optimized microscope settings and of the structure parameters. Second, optimal experimental designs for isolated atom columns will be computed. The corresponding highest attainable precisions will be compared to the attainable precisions at the original microscope settings. Third, the influence of neighboring atom columns on these optimal designs will be discussed. Finally, simulation experiments will be carried out to find out if the maximum likelihood estimator attains the CRLB and if it is unbiased. If so, this justifies the choice of the CRLB as optimality criterion. a. General Comments. In this section, general comments will be given, which should be kept in mind during further reading. They are related to the

comparison of the original and optimal microscope settings and to the structure parameters of the objects under study.

i. Original and Optimal Microscope Settings. In what follows, the values for the original, non-optimized microscope settings are given in Table 7, unless otherwise mentioned. These values are typical for today's electron microscopes. In what follows, they will be compared to the optimal values which result in the highest attainable precision. In principle, the optimal values should be found by optimizing the attainable precision for all microscope settings simultaneously. This corresponds to an iterative, numerical optimization procedure in the space of microscope settings. In this space, every point represents a set of values for the microscope settings of which the dimension is equal to the number of microscope settings. However, it has been found that, apart from the optimal defocus, the optimal value of each of these microscope settings is independent of the other settings. Consequently, the optimization of most microscope settings may be performed one at a time, instead of simultaneously. This kind of optimization is also justified from a practical point of view. Suppose, for example, that an experimenter has an electron microscope with spherical aberration corrector but without chromatic aberration corrector. This microscope will allow him or her to tune the spherical aberration constant, whereas the chromatic aberration constant is fixed. In this case, one is only interested in knowing the optimal spherical aberration constant for a given chromatic aberration constant, instead of knowing the combined optimal spherical and chromatic aberration constant.

TABLE 7
Original Microscope Settings

Microscope setting                        Value
$\alpha_c$ (rad)                          $10^{-4}$
$\Delta E$ (eV)                           0.75
$B_r$ (A m$^{-2}$ sr$^{-1}$ V$^{-1}$)
$C_s$ (mm)                                0.5
$C_c$ (mm)                                1.3
$\Delta x$ (Å)                            0.2
$\Delta y$ (Å)                            0.2
$K$                                       100
$L$                                       100
$t$ (s)                                   1
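To get a feeling for the electron dose implied by such settings, Eq. (111) can be evaluated directly. The sketch below rewrites it as $N = B_r V t\, \mathrm{FOV}\, \pi \alpha_c^2 / e$ with $V = E_0/e$ the accelerating voltage in volts, which is the same expression in SI units. The field of view follows from 100 × 100 pixels of 0.2 Å and the recording time is 1 s. The reduced brightness and the convergence semi-angle used here are placeholder values assumed purely for illustration.

```python
import math

ELECTRON_CHARGE = 1.602e-19   # C
ANGSTROM = 1.0e-10            # m


def incident_electrons(b_r, voltage, t, fov, alpha_c):
    """Total number of incident electrons.

    b_r      reduced brightness (A m^-2 sr^-1 V^-1)
    voltage  accelerating voltage E0 / e (V)
    t        recording time (s)
    fov      field of view area (m^2)
    alpha_c  semi-angle of beam convergence (rad)
    """
    return b_r * voltage * t * fov * math.pi * alpha_c**2 / ELECTRON_CHARGE


# Field of view from K = L = 100 pixels of 0.2 Angstrom each.
fov = (100 * 0.2 * ANGSTROM) * (100 * 0.2 * ANGSTROM)

b_r = 2.0e7        # placeholder reduced brightness, assumed for illustration
alpha_c = 1.0e-4   # placeholder convergence semi-angle, assumed for illustration

n_electrons = incident_electrons(b_r, 300.0e3, 1.0, fov, alpha_c)
print(f"N = {n_electrons:.3g} electrons")
```

Because $N$ is linear in the reduced brightness, the recording time, and the field of view, and quadratic in the convergence semi-angle, any of these placeholders can be rescaled without recomputing the whole expression.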

In the following, the attainable precision will be computed as a function of the following microscope settings:

- Defocus
- Spherical aberration constant
- Chromatic aberration constant
- Energy spread of a monochromator
- Reduced brightness of the electron source

The evaluation of the precision as a function of the defocus will be done for a range of spherical aberration constants, for a given incident electron energy and corresponding electron wavelength. In this way, it will be possible to express the optimal defocus in terms of the spherical aberration constant and electron wavelength. The evaluation as a function of the other microscope settings will be performed separately. Moreover, microscopes operating at an incident electron energy of both 300 keV and 50 keV will be considered. Unless otherwise stated, the values of the microscope settings different from those to be optimized are given in Table 7 and the defocus is adjusted to its optimal value, which will be shown to be given, to a good approximation, by Eqs. (118)-(119).

The results of the evaluation of the attainable precision as a function of the individual microscope settings will be presented in figures. In these figures, the point corresponding to the original microscope settings will be marked with a symbol. Use of the same symbol in different figures indicates that the corresponding microscope settings are identical. This makes comparison between different figures easier. The following three symbols with corresponding microscope settings will be used:

- $E_0 = 300$ keV, optimal defocus, other settings are given in Table 7.
- $E_0 = 50$ keV, $C_c = 0$ mm, optimal defocus, other settings are given in Table 7.
- $E_0 = 50$ keV, $C_s = 0$ mm, optimal defocus, other settings are given in Table 7.

ii. Structure Parameters.
The evaluation and optimization of the attainable precision as a function of the microscope settings will be done for both silicon [100] and gold [100] atom columns, for which the width of the 1s-state and its energy are given in Tables 8 and 9 for a microscope operating at 300 keV and 50 keV, respectively. The other structure parameters of the object under study, such as the atom column positions and the object thickness, will be given in the following parts.

TABLE 8
Width of the 1s-State and Its Energy (Debye-Waller Factor = 0.6 Å^2 and E_0 = 300 keV) of a Silicon [100] and a Gold [100] Atom Column

                          Column type
Structure parameter       Si [100]      Au [100]
a_n (Å)                   …             …
E_1s,n (eV)               …             …

TABLE 9
Width of the 1s-State and Its Energy (Debye-Waller Factor = 0.6 Å^2 and E_0 = 50 keV) of a Silicon [100] and a Gold [100] Atom Column

                          Column type
Structure parameter       Si [100]      Au [100]
a_n (Å)                   …             …
E_1s,n (eV)               …             …

TABLE 10
Structure Parameters of an Isolated Atom Column

Structure parameter       Value
β_x (Å)                   0
β_y (Å)                   0
z (Å)                     E_0 λ / E_1s,n

b. Isolated Atom Columns

i. Structure Parameters. For isolated atom columns, the structure parameters other than the width of the 1s-state and its energy, that is, the atom column positions and the object thickness, are given in Table 10, unless otherwise stated. The object thickness is equal to half the extinction distance, which is given by Eq. (107). At this thickness, and at thicknesses equal to odd multiples of half the extinction distance, the electrons are strongly localized at the atom column positions (Lentzen, Jahnen, Jia, Thust, Tillmann, and Urban, 2002).

ii. Optimality Criterion. The optimal statistical experimental design will be described by the microscope settings resulting in the highest attainable precision with which the position coordinates β = (β_x β_y)^T can be measured. This attainable precision (in terms of the variance) is represented

by the diagonal elements σ²_βx and σ²_βy of the CRLB. An expression for these elements will be derived in the following paragraphs.

For an isolated atom column, the CRLB is equal to the inverse of the 2 × 2 Fisher information matrix F associated with the position coordinates. The (r, s)th element of F is defined by Eq. (12):

$$F_{rs} = \sum_{k=1}^{K} \sum_{l=1}^{L} \frac{1}{\lambda_{kl}} \frac{\partial \lambda_{kl}}{\partial \beta_r} \frac{\partial \lambda_{kl}}{\partial \beta_s} \tag{116}$$

with λ_kl the expected number of electrons at the pixel (k, l). An expression for the elements F_rs is found by substituting the expectation model given by Eq. (110), as derived in Section IV.B, and its derivatives with respect to the position coordinates into Eq. (116). Explicit numbers for these elements are obtained by substituting the values of a given set of microscope settings and structure parameters of the object into the obtained expression for F_rs. For the radially symmetric expectation model used, the diagonal elements of the Fisher information matrix are equal to one another. Moreover, since the Fisher information matrix is symmetric, the diagonal elements of its inverse, that is, of the CRLB, are also equal to one another:

$$\sigma_{\beta_x}^2 = \sigma_{\beta_y}^2 = \left[ F^{-1} \right]_{11} \tag{117}$$

with [F^{-1}]_11 the (1, 1)th element of the CRLB, that is, of F^{-1}. In what follows, the precision will be represented by the lower bound on the standard deviations σ_βx and σ_βy, that is, the square root of the right-hand member of Eq. (117). It will be used as the optimality criterion for the evaluation and optimization of the experimental design. This optimality criterion will therefore be calculated for various types of atom columns as a function of the defocus, the spherical aberration constant, the chromatic aberration constant, and the energy spread of a monochromator. In this evaluation and optimization procedure, the relevant physical constraints are taken into consideration. The constraint is either the radiation sensitivity of the object under study or the specimen drift.
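As an illustration of how Eqs. (116) and (117) turn an expectation model into a precision, the following Python sketch evaluates the Fisher information matrix and the CRLB numerically. It is a minimal sketch under stated assumptions, not the computation behind the figures of this section: the expectation model of Eq. (110) is replaced by a hypothetical radially symmetric Gaussian peak of width `rho`, the pixel counts are assumed to be independent and Poisson distributed, and the function names `expected_counts` and `crlb_position` as well as the values of `rho` and `n_electrons` are chosen for illustration only. The pixel sizes and pixel numbers default to the values of Table 7.

```python
import numpy as np


def expected_counts(beta, n_electrons, rho, K=100, L=100, dx=0.2, dy=0.2):
    """Expected electron counts lambda_kl for a HYPOTHETICAL Gaussian peak.

    beta is the column position (beta_x, beta_y) in Å and rho the peak width in Å.
    This Gaussian is only a stand-in for the expectation model of Eq. (110).
    """
    # Pixel centres, placed symmetrically around the origin
    x = (np.arange(K) - (K - 1) / 2.0) * dx
    y = (np.arange(L) - (L - 1) / 2.0) * dy
    X, Y = np.meshgrid(x, y, indexing="ij")
    r2 = (X - beta[0]) ** 2 + (Y - beta[1]) ** 2
    lam = n_electrons * dx * dy * np.exp(-r2 / (2.0 * rho**2)) / (2.0 * np.pi * rho**2)
    return lam, X, Y


def crlb_position(beta, n_electrons, rho, **grid):
    """Lower bound on the standard deviations of beta_x and beta_y."""
    lam, X, Y = expected_counts(beta, n_electrons, rho, **grid)
    # Analytical derivatives of the Gaussian stand-in with respect to beta_x, beta_y
    dlam = (lam * (X - beta[0]) / rho**2, lam * (Y - beta[1]) / rho**2)
    # 1 / lambda_kl, skipping pixels where the expectation underflows to zero
    inv_lam = np.divide(1.0, lam, out=np.zeros_like(lam), where=lam > 0)
    # Fisher information matrix, Eq. (116)
    F = np.array([[np.sum(inv_lam * dlam[r] * dlam[s]) for s in range(2)]
                  for r in range(2)])
    # CRLB = inverse of F; square roots of its diagonal elements, Eq. (117)
    return np.sqrt(np.diag(np.linalg.inv(F)))


sigma = crlb_position(beta=(0.0, 0.0), n_electrons=1.0e4, rho=0.5)
print(sigma)
```

For this Gaussian stand-in, the bound reduces to approximately `rho / sqrt(n_electrons)`, about 0.005 Å for the values used here, and both diagonal elements coincide, as expected for a radially symmetric model.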
Depending on whether the radiation sensitivity of the object or the specimen drift is the limiting constraint, either the incident electron dose per square Å or the recording time, respectively, has to be kept within the constraints.

iii. Optimal Defocus Value. First, the dependence of the precision on the defocus is studied, as well as the dependence of the optimal defocus on the spherical aberration constant and the electron wavelength. The precision is represented by the square root of the right-hand member of Eq. (117). In Figure 18, it is plotted for a silicon [100] atom column as a function of the defocus ε and the spherical aberration constant C_s for a given electron wavelength λ. Notice that the evaluation is done for positive as well as for negative C_s-values. Negative C_s-values may be obtained by use of a spherical

aberration corrector (Kabius, Haider, Uhlemann, Schwan, Urban, and Rose, 2002; Lentzen, Jahnen, Jia, Thust, Tillmann, and Urban, 2002). The solid white curve shown in Figure 18 is described by the relation

$$\varepsilon = \sqrt{-\tfrac{4}{3} C_s \lambda} \quad \text{if } C_s < 0, \tag{118}$$

$$\varepsilon = -\sqrt{\tfrac{4}{3} C_s \lambda} \quad \text{if } C_s \geq 0, \tag{119}$$

where Eq. (119) is the well-known Scherzer defocus (Scherzer, 1949), which is generally believed to be optimal in terms of point resolution and contrast (Spence, 1988).

Figure 18. The lower bound on the standard deviation of the position coordinates of an isolated silicon atom column as a function of the spherical aberration constant and the defocus. The solid white curve is described by Eqs. (118) and (119) and the dotted white curve describes the numerically found optimal defocus values as a function of the considered spherical aberration constants.

The dotted white curve shown in Figure 18 describes the numerically found optimal defocus values as a function of the considered spherical aberration constants. From the comparison of the solid and dotted

white curves in Figure 18, it follows that the Scherzer defocus (for positive C_s) and Eq. (118) (for negative C_s) are close to the optimal defocus values in terms of precision, except for values of C_s that are significantly higher than the original setting of 0.5 mm. Moreover, for a given spherical aberration constant, operating at the corresponding optimal defocus instead of at the defocus described by Eq. (118) or (119) is hardly beneficial. Therefore, the optimal defocus value, in terms of the spherical aberration constant and the electron wavelength, is approximately given by Eqs. (118) and (119). This result is in agreement with the results presented in den Dekker, Sijbers, and van Dyck (1999), where the attainable precision with which the position of a single atom can be estimated is evaluated as a function of microscope settings for high-resolution CTEM.

Furthermore, this finding does not depend on whether the radiation sensitivity of the object under study or the specimen drift determines the relevant physical constraint. In Figure 18, the recording time as well as the number of incident electrons per square Å are fixed. The optimal defocus value does not change if, for example, longer recording times or more incident electrons per square Å were allowed. The reason for this is that the lower bound on the standard deviation is inversely proportional to the square root of the total number of detected electrons N, which, in its turn, is directly proportional to the recording time. This follows from Eqs. (110), (111), (115), (116), and (117). Therefore, for other values of the recording time or the number of incident electrons per square Å, only the actual values of the standard deviation in Figure 18 would be different, whereas the optimal defocus value would be the same. From now on, the defocus will be adjusted to the value given by Eq. (118) for negative C_s-values and to the Scherzer defocus, given by Eq. (119), for positive C_s-values, since these are useful approximations of the optimal defocus value.

iv. Optimal Spherical Aberration Constant. Subsequently, the dependence of the precision on the spherical aberration constant is studied. Usually, the spherical aberration constant is a fixed property of the electron microscope. However, by incorporating a spherical aberration corrector, it becomes tunable and may range from the value of the original uncorrected microscope down to zero and even to negative values (Kabius, Haider, Uhlemann, Schwan, Urban, and Rose, 2002; Lentzen, Jahnen, Jia, Thust, Tillmann, and Urban, 2002). Thus far, the advantages of a spherical aberration corrector have usually been discussed in the literature in terms of qualitative structure determination, that is, in terms of the possibility to perceive two atom columns separately in an image. The optimality criterion used was the point resolution ρ_s of the electron microscope, which is equal to 0.66 (C_s λ^3)^{1/4}. By use of a spherical aberration corrector, the point resolution improves and, consequently, structure-imaging artifacts due to

contrast delocalization are reduced (Haider, Uhlemann, Schwan, Rose, Kabius, and Urban, 1998). In the present analysis, however, the possible benefit of a spherical aberration corrector is discussed in terms of the attainable statistical precision with which the position coordinates of an atom column can be determined. This is the criterion of importance in the framework of quantitative structure determination, which will gain importance in the future. This criterion takes the object and the total number of detected electrons into account.

First, the precision is evaluated and optimized as a function of the spherical aberration constant for a microscope operating at an incident electron energy of 300 keV, corresponding to an accelerating voltage of 300 kV and an electron wavelength of 0.02 Å. In Figure 19, the lower bound on the standard deviation is plotted for a silicon [100] as well as for a gold [100] atom column as a function of the spherical aberration constant C_s. The optimal spherical aberration constant in terms of precision is the one that corresponds to the minimum of the curve shown in Figure 19. From Figure 19, it follows that the optimal spherical aberration constant is equal to 0 mm in this example. For light atom columns such as silicon [100], reducing the spherical aberration constant from the original setting of 0.5 mm to the optimal setting of 0 mm improves the precision, in terms of the standard deviation, by a factor of 1.3. For heavy atom columns such as gold [100], the same reduction from 0.5 mm to 0 mm improves the precision by a factor of 1.9. Therefore, correction of spherical aberration is more useful in terms of precision for heavy than for light atom columns.

Figure 19. The lower bound on the standard deviation of the position coordinates of an isolated silicon and gold atom column as a function of the spherical aberration constant. The incident electron energy is equal to 300 keV.

Notice, however,

that for silicon, it follows from Figure 18 that a gain in precision comparable to the mentioned factor of 1.3 may be obtained without a spherical aberration corrector, by using a defocus value slightly different from the Scherzer defocus. The same conclusion may be drawn for gold.

Next, the previous evaluation has been repeated, but this time for a thinner object. The object thickness is assumed to be equal to 50 Å. The results are shown in Figure 20. From this figure, it is concluded that for thin objects, the optimal spherical aberration constant is different from 0 mm. The reason for this is that, for the thin object considered, a spherical aberration constant equal to 0 mm and a defocus adjusted to the Scherzer defocus lead to images with very low contrast, which result in extremely high standard deviations of the position coordinates. This is also found in den Dekker, Sijbers, and van Dyck (1999), where the attainable precision with which the position of a single atom can be estimated is evaluated as a function of microscope settings for high-resolution CTEM. In that paper, intuitive interpretations of the results may be found. For a gold [100] atom column, the optimal spherical aberration constant is close to but different from 0 mm, whereas for a silicon [100] atom column, it is negative and equal to −0.35 mm.

Figure 20. The lower bound on the standard deviation of the position coordinates of an isolated silicon and gold atom column as a function of the spherical aberration constant. The incident electron energy is equal to 300 keV. The object thickness is equal to 50 Å.

Therefore, from the comparison of Figures 19 and 20, it is concluded that the optimal spherical aberration constant clearly depends on the object under study. This finding is in contrast to what is found in Lentzen, Jahnen, Jia, Thust, Tillmann, and Urban (2002), where expressions are derived for the optimal spherical aberration constant in terms of phase

contrast and delocalization. The obtained expressions do not depend on structure parameters of the object under study.

Subsequently, the precision is evaluated and optimized as a function of the spherical aberration constant for a microscope operating at an incident electron energy of 50 keV, instead of 300 keV, corresponding to an accelerating voltage of 50 kV and an electron wavelength of 0.05 Å. Usually, decreasing the incident electron energy, or equivalently, increasing the electron wavelength, is not beneficial in terms of precision if the relevant physical constraint of the experiment is determined by the specimen drift. Among the reasons for the deterioration of the precision with decreasing incident electron energy are the accompanying decrease of the number of detected electrons, which follows directly from Eq. (111), and the deterioration of the point resolution ρ_s = 0.66 (C_s λ^3)^{1/4}. However, for some materials one should use incident electron energies lower than 300 keV in order to avoid displacement damage, that is, displacement of atoms from their initial positions. The amount of displacement damage decreases with decreasing incident electron energy (Williams and Carter, 1996). Examples of materials which are sensitive to displacement damage are metals and amorphous materials. Although silicon and gold are possibly insensitive to displacement damage, the evaluation of the attainable precision is again performed for these columns so as to make comparison with the 300 keV results possible.

Figure 21. The lower bound on the standard deviation of the position coordinates of an isolated silicon and gold atom column as a function of the spherical aberration constant. The incident electron energy is equal to 50 keV. The chromatic aberration constant is C_c = 0 mm.

The results for 50 keV are shown in Figure 21. In this evaluation, the chromatic aberration constant is equal to 0 mm. From

Figure 21, it follows that the optimal spherical aberration constant is equal to 0 mm, just as for a microscope operating at an incident electron energy of 300 keV and an object thickness equal to half the extinction distance. Moreover, it is concluded that, both for light and for heavy atom columns, correction of the spherical aberration is useful in terms of precision, although the gain is higher for heavy than for light atom columns. For example, for a light atom column such as silicon [100], reducing the spherical aberration constant from 0.5 mm to 0 mm improves the precision by a factor of 2.5, whereas for a heavy atom column such as gold [100], this is a factor of …. The latter is a substantial reduction of the standard deviation. From the comparison of the numerical values of the lower bound on the standard deviation of the position coordinates corresponding to 50 keV and 300 keV, it follows that, as predicted above, the precision is higher for 300 keV than for 50 keV if the recording time is fixed. Therefore, reducing the incident electron energy is only beneficial in terms of precision if the object under study is sensitive to displacement damage.

Some remarks on the optimal spherical aberration constant are in order. The results for the optimal spherical aberration constant do not depend on whether the radiation sensitivity of the object under study or the specimen drift determines the relevant physical constraint of the CTEM experiment. In Figures 19 to 21, the recording time as well as the number of incident electrons per square Å are fixed. Furthermore, the possible benefit of a spherical aberration corrector, which allows one to reduce the spherical aberration constant, is underestimated in the present analysis for the following reason.
The semi-angle of beam convergence has been kept constant and sufficiently small in order to guarantee that the quasi-coherent approximation, made in the derivation of the expectation model, is reasonable (Spence, 1988). The chosen angle therefore does not correspond to the optimal value in terms of attainable precision. In the quasi-coherent approximation, the effects of partial spatial and temporal coherence are incorporated by coherent damping envelope functions. For large semi-angles of beam convergence, the quasi-coherent approximation is no longer valid. A better approximation would be to include partial spatial and temporal coherence in the expectation model in the form of transmission cross-coefficients (Born and Wolf, 1999; Frank, 1973; Ishizuka, 1980). This model would allow one to evaluate and optimize the attainable precision as a function of the semi-angle of beam convergence. Although such an analysis is not made in this work, it is intuitively clear that the optimal semi-angle of beam convergence would increase with decreasing spherical aberration constant and that the relative gain in precision would increase accordingly. This intuitive reasoning is based on the facts mentioned in Kabius, Haider,

Uhlemann, Schwan, Urban, and Rose (2002) and on the expectation model given by Eq. (110), although the latter is of limited validity. From Eqs. (111) and (115), it follows that the total number of detected electrons increases with increasing semi-angle of beam convergence, which has a favorable effect on the attainable precision with which position coordinates can be estimated. As a side effect, however, it follows from Eq. (97) that with increasing semi-angle of beam convergence, high spatial frequencies are more severely attenuated due to partial spatial coherence, which has an unfavorable effect on the attainable precision. The optimal semi-angle of beam convergence is the one for which both effects are balanced so as to produce the highest attainable precision. The attenuation of high spatial frequencies becomes relatively less important for lower values of the spherical aberration constant, as follows from Eq. (97). Therefore, the optimal semi-angle of beam convergence will shift to higher values with decreasing spherical aberration constant. Due to the accompanying increase of the total number of detected electrons, the relative gain in precision will increase accordingly. Nevertheless, a decisive answer to the questions of which semi-angle of beam convergence is optimal and what precision may be gained can only be provided by further research.

v. Optimal Chromatic Aberration Constant. Next, the dependence of the precision on the chromatic aberration constant is studied. Usually, the chromatic aberration constant is a fixed property of the electron microscope. However, by incorporating a chromatic aberration corrector, which is at a conceptual stage (Weißbäcker and Rose, 2001, 2002), it will become tunable and may even become negative. The advantages of a chromatic aberration corrector for use in CTEM experiments are usually discussed in the literature in terms of the information limit of the electron microscope.
The information limit ρ_i is equal to (πλΔ/2)^{1/2}, with Δ the defocus spread, which is proportional to the chromatic aberration constant (Spence, 1988). By use of a chromatic aberration corrector, the information limit improves. In combination with image processing techniques such as off-axis holography or the focal-series reconstruction method, visual interpretability of the reconstructed exit wave is enhanced, which is a benefit for qualitative structure determination. In the present analysis, the performance of a chromatic aberration corrector is studied for quantitative structure determination aiming at the highest precision with which the position coordinates of an atom column can be estimated.

First, the precision is evaluated and optimized as a function of the chromatic aberration constant for a microscope operating at an incident electron energy of 300 keV. In Figure 22, it is plotted for a silicon [100] as well as for a gold [100] atom column as a function of the chromatic

aberration constant.

Figure 22. The lower bound on the standard deviation of the position coordinates of an isolated silicon and gold atom column as a function of the chromatic aberration constant. The incident electron energy is equal to 300 keV.

From this figure, it follows that the optimal chromatic aberration constant is equal to 0 mm. Reducing the chromatic aberration constant from the original setting of 1.3 mm to 0 mm improves the precision, in terms of the standard deviation, by a factor of 1.1 and 1.4 for silicon [100] and gold [100], respectively. Hence, both for light and for heavy atom columns, correction of the chromatic aberration yields only a modest gain in precision under the given conditions.

Second, the precision is evaluated and optimized as a function of the chromatic aberration constant for a microscope operating at an incident electron energy of 50 keV, instead of 300 keV. Figure 23 shows the results of the evaluation for a silicon [100] as well as for a gold [100] atom column. The spherical aberration constant is equal to 0 mm. From this figure, it follows that the optimal chromatic aberration constant is again equal to 0 mm. Compared to the results obtained for a microscope operating at an incident electron energy of 300 keV, correction of the chromatic aberration is more useful in terms of precision, both for light and for heavy atom columns. Reducing the chromatic aberration constant from 1.3 mm to 0 mm improves the precision by a factor of 3.5 for a light atom column such as silicon [100] and by a factor of 22.0 for a heavy atom column such as gold [100]. These are substantial reductions of the standard deviation. However, as mentioned earlier, decreasing the incident electron energy is only recommended for materials which are sensitive to displacement damage.
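The resolution measures quoted in the discussion of the aberration correctors are simple closed-form expressions and can be evaluated directly. The following Python sketch does so for the defocus of Eqs. (118) and (119), the point resolution 0.66 (C_s λ^3)^{1/4}, and the information limit (πλΔ/2)^{1/2}. The function names, the use of ångström as the common length unit, and the sign convention (negative defocus for positive C_s) are assumptions of this sketch. The defocus spread Δ is passed in as a free input, since its explicit dependence on the chromatic aberration constant is not reproduced here.

```python
import math

MM_TO_ANGSTROM = 1.0e7  # 1 mm = 10^7 Å


def scherzer_defocus(cs, wavelength):
    """Defocus of Eqs. (118)-(119) in Å; cs and wavelength are given in Å.

    Assumed sign convention: positive defocus for cs < 0, negative for cs >= 0.
    """
    magnitude = math.sqrt(4.0 / 3.0 * abs(cs) * wavelength)
    return magnitude if cs < 0 else -magnitude


def point_resolution(cs, wavelength):
    """Point resolution 0.66 (C_s lambda^3)^(1/4) in Å; |cs| is used for cs < 0."""
    return 0.66 * (abs(cs) * wavelength**3) ** 0.25


def information_limit(wavelength, defocus_spread):
    """Information limit (pi lambda Delta / 2)^(1/2) in Å."""
    return math.sqrt(math.pi * wavelength * defocus_spread / 2.0)


cs = 0.5 * MM_TO_ANGSTROM  # original setting of Table 7
for energy_kev, wavelength in ((300, 0.02), (50, 0.05)):
    print(energy_kev, scherzer_defocus(cs, wavelength), point_resolution(cs, wavelength))
```

With C_s = 0.5 mm, this gives a point resolution of about 1.66 Å for a wavelength of 0.02 Å and 3.3 Å for a wavelength of 0.05 Å, which illustrates the deterioration of the point resolution at lower incident electron energy. Since the information limit scales with the square root of Δ, halving it requires a fourfold reduction of the defocus spread.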

Figure 23. The lower bound on the standard deviation of the position coordinates of an isolated silicon and gold atom column as a function of the chromatic aberration constant. The incident electron energy is equal to 50 keV. A spherical aberration corrector is used with C_s = 0 mm.

The results for the optimal chromatic aberration constant do not depend on whether the radiation sensitivity of the object under study or the specimen drift determines the relevant physical constraint of the CTEM experiment. In Figures 22 and 23, the recording time as well as the number of incident electrons per square Å are fixed.

vi. Optimal Energy Spread of a Monochromator. Furthermore, the precision of the position coordinate estimates is evaluated and optimized as a function of the energy spread of a monochromator. This evaluation and optimization will be done for a fixed recording time as well as for a fixed number of incident electrons per square Å. In the former case, the physical constraint is determined by the specimen drift, whereas in the latter, it is determined by the radiation sensitivity of the object. The reason for considering both constraints is that the number of incident electrons per second decreases by use of a monochromator.

The use of a monochromator in CTEM experiments is assumed to be advantageous for qualitative structure determination. The reason for this supposition is that the information limit ρ_i, which is equal to (πλΔ/2)^{1/2}, improves by use of a monochromator because of the decrease of the defocus spread Δ. This means that, in combination with off-axis holography or the focal-series reconstruction method, visual interpretability of the reconstructed exit wave is enhanced. In these discussions, neither the object under study nor the total number of detected electrons is taken into account. However, a reduction of

the incident number of electrons per second leads to a decrease in SNR if the recording time is kept within the constraints. This effect has to be taken into account when the performance of a monochromator for quantitative structure determination is evaluated. This might be done by using a modified definition of the information limit that includes the SNR (de Jong and van Dyck, 1993; van Dyck and de Jong, 1992). In the present analysis, however, this is done by choosing the attainable precision, instead of the information limit, as optimality criterion. This criterion takes both the object under study and the total number of detected electrons into account.

First, it is assumed that the specimen drift determines the relevant physical constraint. Hence, the recording time is kept constant in the evaluation of the precision as a function of the standard deviation ΔE_m of the energy spread of the monochromator, given by Eq. (114). Consequently, the total number of detected electrons decreases with decreasing energy spread. This follows directly from Eq. (115). Figures 24 and 25 show the results of the evaluation for a silicon [100] and for a gold [100] atom column for a microscope operating at an incident electron energy of 300 keV and 50 keV, respectively. At 50 keV, the spherical aberration constant is set to 0 mm. The optimal value of the energy spread in terms of precision is the one that corresponds to the minimum of the curve.

Figure 24. The lower bound on the standard deviation of the position coordinates of an isolated silicon and gold atom column as a function of the standard deviation of the energy spread of a monochromator. The incident electron energy is equal to 300 keV. In this evaluation, the recording time is kept constant.

Figure 25. The lower bound on the standard deviation of the position coordinates of an isolated silicon and gold atom column as a function of the standard deviation of the energy spread of a monochromator. The incident electron energy is equal to 50 keV. A spherical aberration corrector is used with C_s = 0 mm. In this evaluation, the recording time is kept constant.

From Figure 24, where the incident electron energy is equal to 300 keV, it follows that, for light atom columns such as silicon [100], no precision is gained by decreasing the energy spread by means of a monochromator. On the other hand, for heavy atom columns such as gold [100], a monochromator may slightly improve the precision. Reducing the intrinsic energy spread of 0.75 eV to the optimal energy spread of 0.45 eV by use of a monochromator improves the precision by a factor of 1.1 in this example. From Figure 25, it follows that, for a microscope operating at an incident electron energy of 50 keV, the precision improves by use of a monochromator for both light and heavy atom columns. For a silicon [100] column and a gold [100] column, reducing the intrinsic energy spread of 0.75 eV to the optimal energy spread of 0.13 eV and 0.02 eV, respectively, improves the precision by a factor of 1.7 and 5.5, respectively.

Second, it is assumed that the radiation sensitivity of the object determines the relevant physical constraint. Hence, the number of incident electrons per square Å is kept constant in the evaluation of the precision as a function of the standard deviation of the energy spread. In practice, it follows from Eq. (115) that this may be realized by compensating the loss of incident electrons due to the use of the monochromator with an increased recording time. Figures 26 and 27 show the results of the evaluation for a silicon [100] and for a gold [100] atom column for a microscope operating at an incident electron energy of 300 keV and 50 keV, respectively. The recording time corresponding to an intrinsic energy spread of 0.75 eV is equal to 1 s. At 50 keV, the spherical aberration constant is equal to 0 mm.
Figure 26. The lower bound on the standard deviation of the position coordinates of an isolated silicon and gold atom column as a function of the standard deviation of the energy spread of a monochromator. The incident electron energy is equal to 300 keV. In this evaluation, the number of incident electrons per square Å is kept constant.

Figure 27. The lower bound on the standard deviation of the position coordinates of an isolated silicon and gold atom column as a function of the standard deviation of the energy spread of a monochromator. The incident electron energy is equal to 50 keV. A spherical aberration corrector is used with C_s = 0 mm. In this evaluation, the number of incident electrons per square Å is kept constant.

From these figures, it follows that under the given conditions, the precision improves by use of a monochromator. The precision that is gained is larger for heavy atom columns such as gold [100] and for smaller incident electron energies. This may be illustrated by the following numerical values. The precision that is gained by reducing the intrinsic energy spread of 0.75 eV to

… eV using a monochromator for a microscope operating at an incident electron energy of 300 keV is a factor of 1.1 and 1.4 for a silicon [100] and gold [100] column, respectively. For a microscope operating at an incident electron energy of 50 keV, these factors are substantial and equal to 3.5 and 21.1, respectively.

vii. Optimal Reduced Brightness of the Electron Source. Next, the effect of the reduced brightness B_r of the electron source on the precision with which the position coordinates of an atom column can be measured is studied. Using Eqs. (110), (116), and (117), it follows that the precision, represented by the lower bound on the standard deviation of the position coordinates, is inversely proportional to the square root of the total number of detected electrons N. Furthermore, it follows from Eqs. (111) and (115) that, in the absence as well as in the presence of a monochromator, N is directly proportional to the reduced brightness of the electron source B_r. Therefore, new developments in producing electron sources with higher reduced brightness (de Jonge, Lamy, Schoots, and Oosterkamp, 2002; van Veen, Hagen, Barth, and Kruit, 2001) are advantageous in terms of precision. For example, if the reduced brightness is increased by a factor of 10, the lower bound on the standard deviation of the position coordinates decreases by a factor of √10. Hence, on the one hand, if the experiment is limited by specimen drift, the reduced brightness is preferably as high as possible, that is, as high as physical limitations to the production of electron sources with higher reduced brightness allow. The dominant limitation is determined by the statistical Coulomb interactions (Kruit and Jansen, 1997; van Veen, Hagen, Barth, and Kruit, 2001). On the other hand, if the experiment is limited by the radiation sensitivity of the object, the reduced brightness has to be kept subcritical, or an increase of the reduced brightness B_r has to be compensated by a decrease of the recording time t, so as to keep the number of incident electrons per square Å within the constraints.

Finally, a remark about the recording time needs to be made. If the experiment is limited by specimen drift, the recording time is kept within the constraints in this study. The amount of specimen drift is determined by mechanical instabilities of the specimen holder. Hence, new developments providing more stable specimen holders would allow microscopists to increase the recording time. This has a favorable effect on the precision since, as mentioned above, the lower bound on the standard deviation of the position coordinates is inversely proportional to the square root of the total number of detected electrons N, which in its turn is directly proportional to the recording time.
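The scaling argument of this part can be made concrete with a few lines of Python. The sketch below is only an illustration under the proportionalities stated in the text: the lower bound on the standard deviation scales as 1/√N, and N is taken to be proportional to the product of the reduced brightness and the recording time, which is an assumption made here for the combined dependence. The function names `sigma_scale` and `dose_limited_time_factor` are hypothetical.

```python
import math


def sigma_scale(brightness_factor=1.0, time_factor=1.0):
    """Factor by which the lower bound on the standard deviation changes
    when the reduced brightness and the recording time are scaled."""
    # Assumption: N is proportional to B_r * t
    n_factor = brightness_factor * time_factor
    # The lower bound on the standard deviation is proportional to 1 / sqrt(N)
    return 1.0 / math.sqrt(n_factor)


def dose_limited_time_factor(brightness_factor):
    """Recording-time factor that keeps the number of incident electrons
    per square Å unchanged when the reduced brightness is scaled."""
    return 1.0 / brightness_factor


# Drift-limited experiment: recording time fixed, brightness 10 times higher
print(sigma_scale(brightness_factor=10.0))

# Dose-limited experiment: the recording time is shortened to compensate
print(sigma_scale(10.0, dose_limited_time_factor(10.0)))
```

In the drift-limited case, the bound shrinks by a factor of √10. In the dose-limited case, the shorter recording time cancels the gain, so that the higher brightness only shortens the experiment.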

viii. Summary. Tables 11 and 12 give a summary of the attainable precisions with which the position coordinates of an isolated atom column can be estimated for a microscope operating at an incident electron energy of 300 keV and 50 keV, respectively. The attainable precision is represented for the values of the original microscope settings as described in Table 7 and for the optimal values of one or two of these settings with all other values kept fixed. The defocus is adjusted to the value given by Eq. (118) for negative $C_s$-values and to the Scherzer defocus, given by Eq. (119), for positive $C_s$-values. These values are close to optimal. This is done for both a silicon [100] and gold [100] atom column for which the structure parameters are given in Tables 8, 9, and 10. The recording time is held constant.

TABLE 11
The Attainable Precision for an Isolated Silicon [100] and Gold [100] Atom Column for Different Values of Microscope Settings and for an Incident Electron Energy of 300 keV

Microscope settings | Si [100] | Au [100]
original settings | […] Å | […] Å
optimal spherical aberration constant ($C_s = 0$ mm) | […] Å | […] Å
optimal chromatic aberration constant ($C_c = 0$ mm) | […] Å | […] Å
optimal energy spread of the monochromator (see text) | […] Å | […] Å
10× higher reduced brightness | […] Å | […] Å
optimal spherical and chromatic aberration constant | […] Å | […] Å
optimal spherical aberration constant and optimal energy spread of the monochromator | […] Å | […] Å

TABLE 12
The Attainable Precision for an Isolated Silicon [100] and Gold [100] Atom Column for Different Values of Microscope Settings and for an Incident Electron Energy of 50 keV

Microscope settings | Si [100] | Au [100]
original settings | […] Å | […] Å
optimal spherical aberration constant ($C_s = 0$ mm) | […] Å | […] Å
optimal chromatic aberration constant ($C_c = 0$ mm) | […] Å | […] Å
optimal energy spread of the monochromator (see text) | […] Å | […] Å
10× higher reduced brightness | […] Å | […] Å
optimal spherical and chromatic aberration constant | […] Å | […] Å
optimal spherical aberration constant and optimal energy spread of the monochromator | […] Å | […] Å

From these tables, the following conclusions are drawn:

• The attainable precision is better at 300 keV than at 50 keV. Hence, reducing the incident electron energy is only recommended if the experiment is limited by displacement damage instead of specimen drift.
• Mathematically speaking, at 300 keV, the attainable precision improves with a spherical or chromatic aberration corrector. However, since the accompanying gain in precision is only marginal, one may wonder whether such correctors are needed in order to obtain a prespecified precision of the atom column positions.
• At 50 keV, the attainable precision improves with a spherical or chromatic aberration corrector. A chromatic aberration corrector is preferable.
• The attainable precision improves more with a chromatic aberration corrector than with a monochromator.
• The attainable precision improves substantially if the reduced brightness is 10 times higher.
• The attainable precision improves substantially with both a spherical and chromatic aberration corrector, especially for heavy atom columns and low incident electron energies.

Furthermore, as mentioned earlier, the attainable precision may be improved if the mechanical stability of the specimen holder is improved, since this would allow longer recording times and hence more detected electrons.

c. Neighboring Atom Columns. The optimal microscope settings described in the previous part of Section IV.C.2 are derived for single isolated atom columns. One should keep in mind that the attainable precision with which the position of a single isolated column can be estimated is a valid criterion for the optimization of the experimental design only as long as neighboring columns are clearly separated in the image. Under this condition, the attainable precision with which the position of an atom column is estimated is independent of the presence of neighboring columns. This condition was not always met in the previous part.
For example, images of silicon [100] atom columns of a crystal, taken with a microscope which operates at an incident electron energy of 50 keV and which is not corrected for spherical and chromatic aberration, show strong overlap. Then, the attainable precision with which the position of an atom column can be estimated is affected unfavorably by the presence of neighboring columns. To find out whether the optimal microscope settings change in the presence of neighboring atom columns, the attainable precision with which atom column position coordinates of silicon [100] and gold [100] crystals can be estimated will be computed.

i. Structure Parameters. The two-dimensional projected structure of the objects under study, that is, silicon [100] and gold [100] crystals, is modelled as a lattice consisting of 7 × 7 projected atom columns at the positions

$$\beta_n = (\beta_{x_n} \;\; \beta_{y_n})^T = (n_x d \;\; n_y d)^T \qquad (120)$$

with indices $n = (n_x, n_y)$, $n_x = -3, \ldots, 3$, $n_y = -3, \ldots, 3$, and $d$ the distance between an atom column and its nearest neighbor. The values of the distance $d$ for both a silicon [100] and a gold [100] crystal (International Centre for Diffraction Data, 2001) and for the object thickness are given in Table 13. It should be mentioned that the chosen object thickness is equal to 50 Å instead of half the extinction distance, as was chosen for isolated atom columns in the previous section.

TABLE 13
Structure Parameters of Neighboring Atom Columns

Structure parameter | Si [100] | Au [100]
d (Å) | […] | […]
z (Å) | 50 | 50

ii. Optimality Criterion. The optimal statistical experimental design will be described by the microscope settings resulting in the highest attainable precision with which the position coordinate $\beta_{x_n}$ of the central atom column of the lattice consisting of 7 × 7 atom columns can be estimated. This column corresponds to the index $n = (0, 0)$. The attainable precision (in terms of the variance) is represented by the diagonal element $\sigma^2_{\beta_{x_n}}$ of the CRLB. An expression for this element may be derived as follows. First, the Fisher information matrix associated with the total set of 98 position coordinates $\beta_{x_n}$ and $\beta_{y_n}$ is computed. This is a 98 × 98 matrix. The expression for the elements $F_{rs}$ of the Fisher information matrix is given by Eq. (116). Explicit numbers for these elements are obtained by substituting values of a given set of microscope settings and structure parameters of the object into the obtained expression for $F_{rs}$. Next, the CRLB is computed. It is given by the inverse of the Fisher information matrix. Finally, the diagonal element $\sigma^2_{\beta_{x_n}}$ of the CRLB, corresponding to the position coordinate $\beta_{x_n}$ of the central atom column of the lattice, represents the attainable precision.
In what follows, the precision will be represented by the lower bound on the standard deviation $\sigma_{\beta_{x_n}}$, that is, the square root of $\sigma^2_{\beta_{x_n}}$.
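The three steps just described (Fisher information matrix, inversion, diagonal element of the central column) can be sketched numerically. The example below is only an illustration under loud assumptions: instead of the channelling-based expectation model of Section IV.B, it uses a hypothetical toy model in which each column subtracts a Gaussian dip from a constant background, and it uses the standard Fisher information expression for independent Poisson-distributed counts rather than quoting Eq. (116). All parameter names and values are made up for the illustration.

```python
import numpy as np


def central_column_bounds(d, width, background, depth, pixel, half=3):
    """Lower bounds on the std. dev. of (beta_x, beta_y) of the central column.

    Toy expectation model (an assumption, not the model of Section IV.B):
    a constant background minus one Gaussian dip per projected atom column.
    """
    # Lattice of (2*half+1)^2 columns at beta_n = (n_x d, n_y d), cf. Eq. (120).
    idx = np.arange(-half, half + 1)
    nx, ny = np.meshgrid(idx, idx, indexing="ij")
    bx, by = (nx * d).ravel(), (ny * d).ravel()

    # Symmetric pixel grid covering the lattice plus a margin of one spacing.
    extent = (half + 1) * d
    n_pix = int(round(2 * extent / pixel)) + 1
    axis = np.linspace(-extent, extent, n_pix)
    x, y = [a.ravel() for a in np.meshgrid(axis, axis, indexing="ij")]

    dx = x[:, None] - bx[None, :]
    dy = y[:, None] - by[None, :]
    dips = depth * np.exp(-(dx**2 + dy**2) / (2 * width**2))
    lam = background - dips.sum(axis=1)  # expected counts per pixel
    if np.any(lam <= 0):
        raise ValueError("expected counts must be positive")

    # Derivatives of the expectations with respect to all 2*(2*half+1)^2
    # position coordinates (first all beta_x, then all beta_y).
    jac = np.hstack([-dips * dx / width**2, -dips * dy / width**2])

    # Fisher information for independent Poisson counts:
    # F_rs = sum_k (1 / lam_k) (d lam_k / d theta_r) (d lam_k / d theta_s)
    fisher = jac.T @ (jac / lam[:, None])
    crlb = np.linalg.inv(fisher)  # CRLB = inverse Fisher information matrix

    n_col = bx.size
    c = n_col // 2  # index of the central column, n = (0, 0)
    sigma_x = float(np.sqrt(crlb[c, c]))
    sigma_y = float(np.sqrt(crlb[n_col + c, n_col + c]))
    return sigma_x, sigma_y
```

With the default `half=3` the Fisher information matrix is 98 × 98, as in the text. Doubling `background` and `depth` together doubles the expected counts everywhere, so the returned bounds shrink by $\sqrt{2}$, which mirrors the $1/\sqrt{N}$ behaviour discussed earlier in this section.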

This lower bound on the standard deviation will be used as optimality criterion for the evaluation and optimization of the experimental design. Alternatively, one could choose the lower bound on the standard deviation $\sigma_{\beta_{y_n}}$ of the position coordinate $\beta_{y_n}$ of the central atom column, since $\sigma_{\beta_{x_n}}$ and $\sigma_{\beta_{y_n}}$ are equal to one another. The reason for this is that, for the chosen structure of the objects under study, rotation of the expectation model over an angle of 90 degrees carries the expectation model into itself. Moreover, the central atom column is preferred over the other 48 atom columns since this column is most affected by the presence of neighboring columns. As mentioned in Section II.C.2, the chosen criterion may be regarded as a partial or truncated optimality criterion.

iii. Optimal Microscope Settings. First, in Figures 28 and 29, the precision is evaluated as a function of the spherical aberration constant for a microscope operating at an incident electron energy of 300 keV for a silicon [100] and gold [100] crystal, respectively. The solid curve corresponds to a microscope without correction for chromatic aberration, that is, a microscope without chromatic aberration corrector and monochromator. The dashed curve corresponds to a microscope with chromatic aberration corrector, that is, a microscope for which the chromatic aberration constant is equal to 0 mm. The dotted curve corresponds to a microscope with monochromator, for which the standard deviation of the energy spread is chosen equal to […] eV, corresponding to a typical full width at half maximum height of 200 meV, as follows from Eq. (102) (Batson, 1999). In this evaluation, it is assumed that specimen drift is the relevant physical constraint. Hence, the recording time is kept constant. It should be noticed that the precision is not represented for a spherical aberration constant equal to 0 mm in Figures 28 and 29. The reason for this is that for the thin crystals considered and for the defocus adjusted to the Scherzer defocus, the contrast in the image is very low, which results in extremely high standard deviations of the position coordinates.

Figure 28. The lower bound on the standard deviation of the position coordinates of the central atom column of the 50 Å thick silicon [100] crystal under study as a function of the spherical aberration constant for a microscope operating at 300 keV equipped with or without chromatic aberration corrector or monochromator.

Figure 29. The lower bound on the standard deviation of the position coordinates of the central atom column of the 50 Å thick gold [100] crystal under study as a function of the spherical aberration constant for a microscope operating at 300 keV equipped with or without chromatic aberration corrector or monochromator.

From Figures 28 and 29, the following conclusions are drawn for neighboring atom columns and an incident electron energy of 300 keV:
• The optimal spherical aberration constant is close to, but different from, 0 mm. This is due to the small object thickness.
• The attainable precision improves by use of a chromatic aberration corrector. Particularly for light atom columns such as silicon [100], the gain in precision is only marginal.
• The attainable precision deteriorates by use of a monochromator with an energy spread of […] eV for both types of crystals.
• Strictly speaking, the highest attainable precision, corresponding to the optimal experimental design, is obtained for a microscope with chromatic aberration constant equal to 0 mm and with spherical aberration constant close to, but different from, 0 mm (for the thin objects considered). However, the precision that is gained is only marginal. Hence, the question may be raised whether this gain is required to obtain a desired precision.

It should be noticed that the possible benefit of a spherical aberration corrector is underestimated in the present analysis, for the same reason as has been mentioned in the discussion of the evaluation of the spherical aberration constant for isolated atom columns.

Second, in Figures 30 and 31, the precision is evaluated as a function of the spherical aberration constant for a microscope operating at an incident electron energy of 50 keV, instead of 300 keV, for a silicon [100] and gold [100] crystal, respectively. Again, the solid curve corresponds to a microscope without correction for chromatic aberration. The dashed curve corresponds to a microscope with chromatic aberration corrector, that is, a microscope for which the chromatic aberration constant is equal to 0 mm. The dotted curve corresponds to a microscope with monochromator, for which the standard deviation of the energy spread is equal to […] eV. The recording time is kept constant in the evaluation. Also here, the precision is not represented for a spherical aberration constant equal to 0 mm since, for the thin crystals considered and for the defocus adjusted to the Scherzer defocus, the corresponding standard deviations of the position coordinates are very high.

Figure 30. The lower bound on the standard deviation of the position coordinates of the central atom column of the 50 Å thick silicon [100] crystal under study as a function of the spherical aberration constant for a microscope operating at 50 keV equipped with or without chromatic aberration corrector or monochromator.

Figure 31. The lower bound on the standard deviation of the position coordinates of the central atom column of the 50 Å thick gold [100] crystal under study as a function of the spherical aberration constant for a microscope operating at 50 keV equipped with or without chromatic aberration corrector or monochromator.

From Figures 30 and 31, the following conclusions are drawn for neighboring atom columns and an incident electron energy of 50 keV:
• The optimal spherical aberration constant is different from 0 mm. This is due to the small object thickness.
• The attainable precision improves by use of either a chromatic aberration corrector or a monochromator, although a chromatic aberration corrector is preferred.
• The attainable precision improves more with a chromatic than with a spherical aberration corrector. The gain is more significant for heavy atom columns such as gold [100] than for light atom columns such as silicon [100].
• The highest attainable precision, corresponding to the optimal experimental design, is obtained for a microscope with chromatic aberration constant equal to 0 mm and with spherical aberration constant close to, but different from, 0 mm (for the thin objects considered). The gain in precision is substantial.

From the comparison of the conclusions obtained from Figures 28 to 31 for neighboring atom columns with those obtained for isolated atom columns as summarized in the previous section, it follows that the main conclusions regarding the optimal microscope settings remain. Moreover, as for isolated atom columns, increasing the reduced brightness of the electron source and improving the mechanical stability of the specimen holder are advantageous in terms of precision if the experiment is limited by specimen drift. This is evident since these conclusions, which are given earlier, are based only on the total number of detected electrons and not on the structure of the object under study.

d. Attainability of the Cramér-Rao Lower Bound. Finally, the discussion of the optimization of a CTEM experiment should be complemented with an investigation of whether there exists an estimator attaining the CRLB on the variance of the position coordinates and whether this estimator is unbiased. If so, this would justify the choice of the CRLB as optimality criterion used in this section. Generally, one may use different estimators in order to measure the position coordinates of the atom columns from CTEM experiments, such as the least squares estimator or the maximum likelihood estimator, which has been introduced in Section II.D. Different estimators have different properties. One of the asymptotic properties of the maximum likelihood estimator is that it is normally distributed about the true parameters with covariance matrix approaching the CRLB (van den Bos, 1982). This property would justify the use of the CRLB as optimality criterion, but it is an asymptotic one. This means that it applies to an infinite number of observations. However, the number of observations used in CTEM experiments is finite and may even be relatively small. Whether asymptotic properties still apply to such experiments can often only be assessed by estimating from artificial, simulated observations (van den Bos, 1999). Therefore, 200 different CTEM experiments made on an isolated silicon [100] atom column are simulated; the observations are modelled using the parametric statistical model described in Section IV.B. The spherical aberration constant is set equal to 1 mm. Next, the position coordinates $\beta_x$ and $\beta_y$ of the atom column are estimated from each simulation experiment using the maximum likelihood estimator. The mean and variance of these estimates are computed and compared to the true value of the position coordinate and the lower bound on the variance, respectively.
The lower bound on the variance is computed by substituting the true values of the parameters into the expression given by the right-hand member of Eq. (117). The results are presented in Table 14. From the comparison of these results, it follows that it may not be concluded that the maximum likelihood estimator is biased or that it does not attain the CRLB. Furthermore, the maximum likelihood estimates of $\beta_x$ are presented in the histogram of Figure 32. The solid curve represents a normal distribution with mean and variance given in Table 14. This curve makes it plausible that the estimates are normally distributed. This property is also tested quantitatively by means of the so-called Lilliefors test (Conover, 1980), which does not reject the hypothesis that the estimates are normally distributed. From the results obtained from the simulation experiments, it is concluded that the maximum likelihood estimates cannot be distinguished from unbiased, efficient estimates. These results justify the choice of the CRLB as optimality criterion.
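The logic of such a simulation study can be sketched in a few lines. The example below is a hedged illustration, not the authors' simulation: it replaces the CTEM model of Section IV.B by a hypothetical one-dimensional toy model (constant background minus a Gaussian dip, Poisson-distributed counts), finds the maximum likelihood estimate by a fine grid search with parabolic refinement, and compares the sample mean and variance of 200 estimates with the true position and the CRLB. All numerical values are assumptions chosen for the illustration, and the Lilliefors normality test mentioned in the text is not reproduced here.

```python
import numpy as np


def simulate_ml_position_estimates(n_experiments=200, seed=0):
    """Simulate Poisson observations of a toy 1-D model and estimate the
    dip position from each simulated experiment by maximum likelihood."""
    rng = np.random.default_rng(seed)
    x = np.linspace(-4.0, 4.0, 81)  # pixel centres (arbitrary units)
    background, depth, width = 1000.0, 300.0, 0.5  # hypothetical values
    true_pos = 0.0

    def expectation(pos):
        # Expected counts for each candidate position: shape (len(pos), len(x)).
        pos = np.atleast_1d(pos)[:, None]
        return background - depth * np.exp(-(x[None, :] - pos) ** 2 / (2 * width**2))

    lam_true = expectation(true_pos)[0]
    counts = rng.poisson(lam_true, size=(n_experiments, x.size))

    # Poisson log-likelihood (up to a constant) on a fine grid of positions.
    grid = np.linspace(-0.3, 0.3, 601)
    lam_grid = expectation(grid)
    loglik = counts @ np.log(lam_grid).T - lam_grid.sum(axis=1)[None, :]

    # Grid maximum, refined by the vertex of a parabola through three points.
    i = np.clip(np.argmax(loglik, axis=1), 1, grid.size - 2)
    rows = np.arange(n_experiments)
    lm, l0, lp = loglik[rows, i - 1], loglik[rows, i], loglik[rows, i + 1]
    step = grid[1] - grid[0]
    estimates = grid[i] + 0.5 * step * (lm - lp) / (lm - 2 * l0 + lp)

    # CRLB on the variance for independent Poisson counts, at the true position.
    dlam = -(background - lam_true) * (x - true_pos) / width**2
    crlb = 1.0 / np.sum(dlam**2 / lam_true)
    return estimates, float(crlb)
```

A typical check is to compare `estimates.mean()` with the true position (0 here) using the standard error `sqrt(crlb / 200)`, and `estimates.var(ddof=1)` with `crlb`. If neither comparison shows a significant deviation, the estimates cannot be distinguished from unbiased, efficient ones, which is the type of conclusion drawn in the text.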

TABLE 14
Comparison of True Position Coordinates and Lower Bounds on the Variance with Estimated Means and Variances of 200 Maximum Likelihood Estimates of the Position Coordinates, Respectively

Coordinate | True position coordinate (Å) | Estimated mean (Å) | Standard deviation of mean (Å)
$\beta_x$ | 0 | […] | […]
$\beta_y$ | 0 | […] | […]

Coordinate | Lower bound on variance (Å²) | Estimated variance (Å²) | Standard deviation of variance (Å²)
$\beta_x$ | […] | […] | […]
$\beta_y$ | […] | […] | […]

The numbers of the last column represent the estimated standard deviation of the variable of the previous column.

Figure 32. Histogram of 200 maximum likelihood estimates of the x-coordinate of the position of an atom column. The normal distribution superimposed on this histogram, with mean and variance given in Table 14, makes it plausible that the estimates are normally distributed.

3. Interpretation of the Results

To provide more insight, an intuitive interpretation will be given to some numerical results obtained in Section IV.C.2. This will be done by means of a result obtained in Section III, where a rule of thumb was obtained for the attainable precision with which the position of one component can be measured from a bright-field imaging experiment such as CTEM. The rule of thumb, which is given by Eq. (68), was derived for an expectation model of the observations consisting of a constant background from which a Gaussian peak was subtracted. From it, one observes that the attainable precision is a function of the width of the Gaussian peak and the total number of detected electrons. Generally, the precision will improve by narrowing the Gaussian peak and by increasing the total number of detected electrons. Empirically, it has been found that the obtained rule of thumb is generalizable to more complicated CTEM expectation models than Gaussian peaks. Two different approaches may be followed. One approach is to consider the highest spatial frequency that is transferred from the exit plane to the image plane instead of the inverse of the width of the Gaussian peak. Another approach is to consider the width associated with the peak which remains if the background is subtracted from the CTEM expectation model instead of the width of the Gaussian peak. The generalized rule of thumb is then that the precision will improve by decreasing the width of this peak, or by increasing the highest spatial frequency that is transferred from the exit plane to the image plane, and by increasing the total number of detected electrons.
From the example illustrated in Figure 28, it may be concluded that the lower bound on the standard deviation of the position coordinates of a silicon [100] atom column of a crystal is a factor of 2.4 lower when using a chromatic aberration constant of 0 mm and a spherical aberration constant of 0.2 mm instead of an energy spread of the monochromator of […] eV and a spherical aberration constant of 1.0 mm. This result may intuitively be interpreted by comparing the corresponding expectation models. These models describe the expected number of electrons detected at the pixels of a CCD camera. They have been derived in Section IV.B. Figure 33 shows intersections of the two-dimensional, radially symmetric column model and a plane through its radial axis. It is clearly observed that the peak, which remains if the background is removed, is narrower if the spherical aberration constant is equal to 0.2 mm instead of 1.0 mm. This narrowing is directly related to the improvement of the point resolution $\rho_s = 0.66\,(C_s \lambda^3)^{1/4}$ with decreasing spherical aberration constant. Moreover, the number of detected electrons is much larger in the absence of a monochromator since it is assumed in this example that the recording time is fixed. These considerations give an intuitive interpretation to the conclusion drawn from Figure 28.

Figure 33. Intersection of the two-dimensional, radially symmetric expectation model of the observations made on an isolated silicon [100] atom column and a plane through its radial axis. The solid curve corresponds to a microscope with chromatic and spherical aberration constant equal to 0 mm and 0.2 mm, respectively. The dashed curve corresponds to a microscope with standard deviation of the energy spread of the monochromator and spherical aberration constant equal to […] eV and 1.0 mm, respectively.

Moreover, it follows from Figures 24 to 31 that the precision that is possibly gained by use of a monochromator is higher for heavy atom columns such as gold [100] than for light atom columns such as silicon [100], and for microscopes operating at lower incident electron energies, for example, 50 keV instead of 300 keV. These results may be explained on a more or less intuitive basis as follows. Figures 34 and 35 show the damping envelope function $D_t(g)$ due to partial temporal coherence, described by Eq. (98), associated with an electron source having an intrinsic energy spread equal to 0.75 eV, together with the Fourier transformed 1s-state functions $F_{1s,n}(g)$, described by Eq. (90), for a gold [100] and a silicon [100] atom column for a microscope operating at an incident electron energy equal to 300 keV and 50 keV, respectively. It follows from Eqs. (104)-(106) that $F_{1s,n}(g)$ may be regarded as the object transfer function associated with the atom column. It acts as a low-pass filter and severely attenuates the amplitude of the microscope's transfer function $T(g)$ at high spatial frequencies. The microscope's transfer function is described by Eq. (94) and it includes the damping envelope function $D_t(g)$. The bandwidth of the low-pass filter $F_{1s,n}(g)$ associated with the atom column depends on the weight of this column. Heavy atom columns have more sharply peaked 1s-state functions, and therefore wider object transfer functions, than light atom columns (see also Tables 8 and 9). Consequently, a silicon [100] atom column will have a narrower bandwidth than a gold [100] atom column, as can be seen in Figures 34 and 35. A reduction of the energy spread, which may be obtained by incorporating a monochromator, decreases the information limit since it increases the bandwidth of the damping envelope function $D_t(g)$. However, pushing the inverse of the information limit of the microscope beyond the bandwidth of the object transfer function is useless. From Figures 34 and 35, it follows that there is more object spatial frequency information to be gained at an incident electron energy of 50 keV instead of 300 keV and for a gold [100] atom column than for a silicon [100] atom column. This effect, in combination with the loss of electrons by use of a monochromator if the experiment is limited by specimen drift, or the absence of such a loss if the experiment is limited by radiation damage, makes the results obtained from Figures 24 to 31 understandable.

Figure 34. The damping envelope function $D_t(g)$ due to partial temporal coherence, described by Eq. (98), together with the object transfer function $F_{1s,n}(g)$, described by Eq. (90), for a gold [100] and a silicon [100] atom column for a microscope operating at an incident electron energy of 300 keV.

Figure 35. The damping envelope function $D_t(g)$ due to partial temporal coherence, described by Eq. (98), together with the object transfer function $F_{1s,n}(g)$, described by Eq. (90), for a gold [100] and a silicon [100] atom column for a microscope operating at an incident electron energy of 50 keV.

The same reasoning may be applied to understand that the optimal chromatic aberration constant is equal to 0 mm and that a chromatic aberration corrector improves the precision more for heavy than for light atom columns and for lower incident electron energies, as follows from Figures 22, 23, and 28 to 31. The chromatic aberration corrector increases the bandwidth of the damping envelope function $D_t(g)$ like the monochromator does, but this is not accompanied by a reduction of the total number of detected electrons.

The examples given above illustrate that the rule of thumb derived in Section III, for the attainable precision with which the position of one component can be measured from a bright-field imaging experiment such as CTEM, may be used to give an intuitive interpretation to the numerical results obtained in Section IV.C.2. This provides a check of these numerical results. However, this rule of thumb cannot replace the exact expressions for the attainable precision which have been used in Section IV.C.2.

D. Conclusions

It has been shown that when it comes to the evaluation and optimization of quantitative CTEM experiments aiming at the highest precision, criteria such as point resolution and information limit may give rise to deceptive guidelines, since they do not take the object and the total number of detected electrons into account. As an alternative to these criteria, the obvious optimality criterion is the attainable statistical precision, that is, the CRLB, with which position coordinates of atom columns can be estimated. This criterion depends on the microscope settings, the object, and the total number of detected electrons, rather than on the microscope settings alone. An expression for the attainable statistical precision has been derived from a parametric statistical model of the observations. The expectations of the observations have been described by means of the channelling theory and the quasi-coherent approximation, whereas the fluctuations of the observations have been described by means of Poisson statistics. The obtained expression has been used to evaluate and optimize the design of quantitative CTEM experiments. This analysis has been made for microscopes operating at an intermediate incident electron energy of 300 keV and for those operating at a low incident electron energy of 50 keV. The relevant physical constraints have been taken into consideration. These constraints are the radiation sensitivity of the object or the specimen drift. Therefore, the incident electron dose per square Å or the recording time has been kept within the constraints. From the analysis, the following general guidelines have been derived:
• The optimal defocus value is approximately given by Eq. (118) for negative $C_s$-values and by the Scherzer defocus, given by Eq. (119), for positive $C_s$-values.
• A spherical and chromatic aberration corrector may improve the attainable precision. The precision that is gained depends on the object under study. Correction makes more sense for low than for intermediate incident electron energies and for objects consisting of heavy instead of light atom columns. It should be mentioned that the optimal spherical aberration constant is different from 0 mm for thin objects.
• The attainable precision improves more with a chromatic aberration corrector than with a monochromator.
• The highest attainable precision, corresponding to the optimal experimental design, is obtained for a microscope with both a spherical and chromatic aberration corrector.
• Increasing the reduced brightness of the electron source may improve the attainable precision substantially if the experiment is limited by specimen drift.
• Improving the mechanical stability of specimen holders, which would allow longer recording times, improves the attainable precision, especially if the experiment is limited by specimen drift.
Additionally, the following guidelines have been derived for microscopes operating at intermediate incident electron energies:
• A monochromator usually does not improve the attainable precision if the experiment is limited by specimen drift, except for heavy atom columns, whereas it slightly improves the precision if the experiment is limited by the radiation sensitivity of the object.
• The precision that is possibly gained using a spherical aberration corrector, a chromatic aberration corrector, a monochromator, or any combination might be disappointing in the sense that this gain is only marginal and might not be needed to obtain a required precision.

Furthermore, the following guidelines have been derived for microscopes operating at low incident electron energies:
• A monochromator improves the attainable precision.
• The attainable precision improves more with either a chromatic aberration corrector or a monochromator than with a spherical aberration corrector.
• The precision that is gained using a spherical aberration corrector, a chromatic aberration corrector, a monochromator, or any combination might be substantial.

V. Optimal Statistical Experimental Design of Scanning Transmission Electron Microscopy

A. Introduction

In this section, optimal statistical experimental designs of STEM experiments will be described. They will be computed in a similar way as those of CTEM experiments in Section IV. Hence, the STEM designs will be evaluated and optimized in terms of the attainable precision, that is, the CRLB, with which atom column positions of the object under study can be measured. The choice of this optimality criterion reflects the purpose of future atomic resolution TEM experiments. As mentioned in Section I, this purpose is quantitative structure determination, which means that the structure parameters of the object under study, the atom column positions in particular, are quantitatively estimated from the observations. Ultimately, this should be done as precisely as possible.

First, it will be described how STEM observations are collected. A scheme is shown in Figure 36. An electron probe is formed by demagnifying a small electron source with a set of condenser and objective lenses. The resulting probe scans in a raster over the object. At each probe position, a part of the object under study is illuminated. As a result of the electron-object interaction, the so-called exit wave, which is a complex electron wave function at the exit plane of the object, is formed. This wave propagates to a detector, which is placed in the back focal plane beyond the object.
In this plane, a so-called convergent-beam electron diffraction pattern is formed. The part of this pattern that reaches the detector is integrated and displayed as a function of the probe position. In STEM, one distinguishes different imaging modes that are related to the shape or size of the detector, such as axial bright-field coherent STEM and annular dark-field incoherent STEM.

In the former mode, an axial detector with a small outer collection semi-angle $\alpha_D$ is used, whereas in the latter mode, an annular detector with a large inner collection semi-angle $\alpha_D$ is used. The angle $\alpha_D$ is shown in Figure 36. It corresponds to a detector radius $g_{det}$ equal to $\alpha_D/\lambda$, where $\lambda$ is the electron wavelength. For more details on STEM, the reader is referred to Batson, Dellby, and Krivanek (2002), Cowley (1997), Crewe (1997), Nellist and Pennycook (2000), and Pennycook and Yan (2001).

Figure 36. Scheme of a STEM experiment. Usually, an annular (black colored) or axial (grey colored) detector is chosen. The angle $\alpha_D$ represents the inner collection semi-angle of an annular detector or the outer collection semi-angle of an axial detector.

For many years, it has been standard practice to evaluate the performance of STEM experiments qualitatively, that is, in terms of direct visual interpretability. The performance criteria used are two-point resolution and contrast. For example, when axial bright-field coherent STEM is compared to annular dark-field incoherent STEM, the latter imaging mode is preferred. The basic ideas underlying this preference are the improvement of two-point resolution for incoherent imaging compared to coherent imaging (Pennycook, 1997) and the higher contrast in dark-field images than in bright-field images (Cowley, 1997). In annular dark-field incoherent STEM, visual interpretation of the images is optimal if the Scherzer conditions (Scherzer, 1949) for incoherent imaging are adapted (Pennycook and Jesson, 1991). As demonstrated by Nellist and Pennycook (1998), the resolution may be further improved if the main lobe of the probe is narrowed. However, visual interpretability is then reduced as a result of a considerable rise of the sidelobes of the probe.
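The relation $g_{det} = \alpha_D/\lambda$ quoted above needs the electron wavelength at the incident energy used. As a small illustration, the sketch below evaluates the standard relativistic de Broglie wavelength and from it the detector radius, together with the point resolution $0.66\,(C_s \lambda^3)^{1/4}$ that appears in Section IV.C.3. The function names are hypothetical, the physical constants are the CODATA values, and any collection semi-angle passed in is an assumption of the caller rather than a setting taken from this chapter.

```python
import math

H = 6.62607015e-34      # Planck constant (J s)
M0 = 9.1093837015e-31   # electron rest mass (kg)
E = 1.602176634e-19     # elementary charge (C)
C = 299792458.0         # speed of light (m/s)


def electron_wavelength_angstrom(energy_ev):
    """Relativistic de Broglie wavelength (in Å) for an incident energy in eV."""
    e_joule = energy_ev * E
    momentum = math.sqrt(2.0 * M0 * e_joule * (1.0 + e_joule / (2.0 * M0 * C**2)))
    return H / momentum * 1e10  # metres to Å


def detector_radius(alpha_d_rad, energy_ev):
    """Detector radius g_det = alpha_D / lambda, in reciprocal Å."""
    return alpha_d_rad / electron_wavelength_angstrom(energy_ev)


def point_resolution_angstrom(cs_mm, energy_ev):
    """Point resolution 0.66 (C_s lambda^3)^(1/4) in Å, with C_s given in mm."""
    lam = electron_wavelength_angstrom(energy_ev)
    return 0.66 * (cs_mm * 1e7 * lam**3) ** 0.25  # 1 mm = 1e7 Å
```

At 300 keV the wavelength evaluates to roughly 0.0197 Å and at 50 keV to roughly 0.0536 Å, so the same collection semi-angle corresponds to a detector radius in reciprocal space that is almost three times smaller at the lower energy.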

Two important aspects are absent in these widely used performance criteria. First, the electron-object interaction is not taken into account. Second, the dose efficiency, which is defined as the ratio of the number of detected electrons to the number of incident electrons, is left out of consideration. Improvement of resolution and contrast is often obtained at the expense of dose efficiency, which leads to a decrease in the SNR. For example, the incoherence in annular dark-field incoherent STEM is attained by using an annular dark-field detector with a geometry much larger than the objective aperture, that is, an annular detector with an inner collection semi-angle much larger than the objective aperture semi-angle (Nellist and Pennycook, 2000). The corresponding improvement of two-point resolution, obtained by adopting the Scherzer conditions for incoherent imaging, thus comes at the expense of dose efficiency. Another example is the following. It is well known that in bright-field images, decreasing the outer collection semi-angle of an axial detector leads to higher contrast, but also to a lower SNR, which degrades the quality of the image. To compensate for such a decrease in SNR, longer recording times are necessary, which in turn increase the disturbing influence of specimen drift. The observation that the quality of an image is determined by both the resolution and the SNR has led to several modified criteria (Sato, 1997; Sato and Orloff, 1992). The ultimate goal of STEM is not qualitative structure determination, but quantitative structure determination instead. Ultimately, structure parameters of the object under study, such as the atom column positions, have to be measured as precisely as possible. However, this precision will always be limited by the presence of noise.
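The trade-off between dose efficiency, SNR, and recording time described above can be made concrete with a short sketch. It assumes Poisson-distributed electron counts (the assumption made later in Section V.B.3) and defines the SNR as the ratio of the mean to the standard deviation of the counts; both function names and all numbers are illustrative only.

```python
import math


def poisson_snr(expected_counts):
    """SNR of a Poisson count: mean / standard deviation = sqrt(mean)."""
    return math.sqrt(expected_counts)


def recording_time_factor(dose_efficiency_old, dose_efficiency_new):
    """Factor by which the recording time must grow to keep the number of
    detected electrons, and hence the SNR, unchanged."""
    return dose_efficiency_old / dose_efficiency_new


incident = 1.0e4                               # hypothetical incident electrons per pixel
snr_wide = poisson_snr(0.50 * incident)        # detector collecting 50% of the electrons
snr_narrow = poisson_snr(0.05 * incident)      # detector collecting 5% of the electrons
extra_time = recording_time_factor(0.50, 0.05) # ten times longer recording
```

In this illustration, a tenfold loss in dose efficiency lowers the SNR by a factor of $\sqrt{10}$, and a tenfold longer recording time is needed to recover it, which is where specimen drift becomes a concern.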
Given the parametric statistical model of the observations, an expression may be obtained for the highest attainable precision with which the atom column positions can be measured. This expression, which is called the CRLB, is a function of structure parameters, microscope settings, and dose efficiency. Therefore, it may be used as an alternative performance measure in the evaluation and optimization of the design of a STEM experiment for a given object. The optimal statistical experimental design corresponds to the microscope settings resulting in the highest attainable precision. It will be obtained by using the principles of statistical experimental design explained in Section II. The section is organized as follows. In Section V.B, a parametric statistical model of the observations will be derived. This model describes the expectations of the observations as well as the fluctuations of the observations about these expectations. Next, in Section V.C, an expression for the CRLB on the variance of the atom column position estimates is obtained from this model. Also, an adequate optimality criterion, which is a function of the elements of the CRLB, is given. This criterion is then used to evaluate and optimize the experimental design. Special attention will be paid

to the optimal reduced brightness of the electron source, the optimal defocus value, the optimal spherical aberration constant, the optimal detector radius, and the optimal source width. Furthermore, it will be investigated whether an annular detector is preferable to an axial one. In Section V.D, conclusions are drawn. Part of the results of this section have been published earlier in den Dekker, van Aert, van Dyck, and van den Bos (2000), van Aert and van Dyck (2001), van Aert, den Dekker, van Dyck, and van den Bos (2000a, 2002b), and van Aert, van Dyck, den Dekker, and van den Bos (2000).

B. Parametric Statistical Model of Observations

A parametric statistical model of the observations is needed in order to obtain an expression for the CRLB, which will be used for the optimization of the experimental design. In this section, such a model will be derived. It describes the expectations of the observations as well as the fluctuations of the observations about these expectations. This model contains microscope settings such as defocus, spherical aberration constant, and detector angle, as well as structure parameters such as atom column positions and the object thickness. In the derivation of this model, three basic approximations will be made. First, use will be made of the simplified channelling theory to describe the dynamical, elastic scattering of the electrons on their way through the object (Broeckx, Op de Beeck, and van Dyck, 1995; Geuens and van Dyck, 2002; Pennycook and Jesson, 1992; van Aert, den Dekker, van Dyck, and van den Bos, 2002b; van Dyck and Op de Beeck, 1996). Second, temporal incoherence due to chromatic aberration, which results from a spread in defocus values, will not be taken into account.
This approximation is motivated by indications that STEM imaging is robust to chromatic aberration (Batson, Dellby, and Krivanek, 2002; Krivanek, Dellby, and Nellist, 2002; Nellist and Pennycook, 1998, 2000). Third, thermal diffuse, inelastic scattering will not be taken into account. Thermal diffuse scattered electrons are predominantly collected in the detector at high angles (Treacy, 1982). Therefore, increasing the inner collection semi-angle $\alpha_D$ (see Figure 36) of an annular detector has the effect of increasing thermal diffuse, inelastic scattering relative to elastic scattering (Wang, 2001). The main advantage of this is the strong dependence of the detected signal on the atomic number $Z$, hence the name Z-contrast imaging. The disadvantage, however, is the accompanying decrease of dose efficiency, which leads to a decrease in SNR. In Section V.C, it will be shown that, as a result of this decrease in SNR, the optimal inner collection angle in terms of precision is small compared to the angles where thermal diffuse scattering is

important. This justifies the fact that thermal diffuse scattering will not be taken into account. Although the approximations made are of limited validity, they are useful for a compact analytical model-based optimization of the design of quantitative STEM experiments as well as for explaining the basic principles governing the obtained results. The principal results are independent of the approximations made.

1. The Exit Wave

The first step toward the parametric statistical model of the observations is to obtain an expression for the exit wave $\psi(\mathbf{r}, z)$. It is a complex electron wave function in the plane at the exit face of the object, resulting from the interaction of the electron probe with the object. As for CTEM, use will be made of the simplified channelling theory. At this stage, both structure parameters and microscope settings, describing the object and probe, respectively, will enter the model. According to the simplified channelling theory, applicable if the probe propagates along a major zone axis, an expression may be derived for the exit wave of an object consisting of $n_c$ atom columns (Broeckx, Op de Beeck, and van Dyck, 1995; Geuens and van Dyck, 2002; Pennycook and Jesson, 1992; van Aert, den Dekker, van Dyck, and van den Bos, 2002b; van Dyck and Op de Beeck, 1996). This derivation is equivalent to that of the exit wave for CTEM, given by Eq. (84), except that the parallel incident electron beam used in CTEM is now replaced by the electron probe. The expression for the exit wave for STEM is given by:

$$\psi(\mathbf{r}, z) = p(\mathbf{r} - \mathbf{r}_{kl}) + \sum_{n=1}^{n_c} c_n(\mathbf{r}_{kl} - \boldsymbol{\beta}_n)\, \phi_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n) \left[ \exp\left( -i\pi \frac{E_{1s,n}}{E_0} \frac{1}{\lambda} z \right) - 1 \right], \tag{121}$$

where $\mathbf{r} = (x\ y)^T$ is a two-dimensional vector in the plane at the exit face of the object, perpendicular to the propagation direction of the electron probe, $z$ is the object thickness, $E_0$ is the incident electron energy, and $\lambda$ is the electron wavelength.
The incident electron energy and the electron wavelength are related according to Eq. (85). Furthermore, the function $p(\mathbf{r} - \mathbf{r}_{kl})$ describes the probe located at the position $\mathbf{r}_{kl} = (x_k\ y_l)^T$. The function $\phi_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n)$ is the lowest energy bound state of the $n$th atom column located at position $\boldsymbol{\beta}_n = (\beta_{xn}\ \beta_{yn})^T$ and $E_{1s,n}$ is its energy. The energy of this state is a parameter related to the projected weight of the atom column, which is a function of the atomic numbers of the atoms along a column, the distance between successive atoms, and the Debye-Waller factor (van Dyck and Chen, 1999a). The lowest energy bound state $\phi_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n)$ is

a real-valued, centrally peaked, radially symmetric function, which is a two-dimensional analogue of the 1s-state of an atom. In Eq. (121), it is assumed that the dynamical motion of an electron in a column may be primarily expressed in terms of this tightly bound 1s-state. As in Section IV.B.1, where an expression for the exit wave is described for CTEM, it will be assumed that the 1s-state function may be approximated by a single, quadratically normalized, parameterized Gaussian function given by Eq. (88) (Geuens and van Dyck, 2002). The excitation coefficients $c_n(\mathbf{r}_{kl} - \boldsymbol{\beta}_n)$ of Eq. (121) are found from:

$$c_n(\mathbf{r}_{kl} - \boldsymbol{\beta}_n) = \int \phi^{*}_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n)\, p(\mathbf{r} - \mathbf{r}_{kl})\, d\mathbf{r}, \tag{122}$$

where the symbol $^{*}$ denotes the complex conjugate. Since the 1s-state is a real-valued function and since the probe function is assumed to have radial symmetry so that $p(\mathbf{r}) = p(-\mathbf{r})$, Eq. (122) may be written as a convolution product:

$$c_n(\mathbf{r}_{kl} - \boldsymbol{\beta}_n) = p(\mathbf{r}_{kl} - \boldsymbol{\beta}_n) \ast \phi_{1s,n}(\mathbf{r}_{kl} - \boldsymbol{\beta}_n). \tag{123}$$

The convolution theorem (Papoulis, 1968) allows one to rewrite this equation as:

$$c_n(\mathbf{r}_{kl} - \boldsymbol{\beta}_n) = \mathcal{F}^{-1}_{\mathbf{g} \to \mathbf{r}_{kl} - \boldsymbol{\beta}_n} \left[ P(\mathbf{g})\, \Phi_{1s,n}(\mathbf{g}) \right] = \int P(\mathbf{g})\, \Phi_{1s,n}(\mathbf{g}) \exp\left( i 2\pi \mathbf{g} \cdot (\mathbf{r}_{kl} - \boldsymbol{\beta}_n) \right) d\mathbf{g}, \tag{124}$$

where $P(\mathbf{g})$ is the two-dimensional Fourier transform of the probe function $p(\mathbf{r})$, $\Phi_{1s,n}(\mathbf{g})$ is the Fourier transform of the 1s-state $\phi_{1s,n}(\mathbf{r})$ given by Eq. (90), $\mathbf{g}$ is a two-dimensional spatial frequency vector, and the symbol $\cdot$ denotes the scalar product. The Fourier transform and the inverse Fourier transform are defined by Eqs. (91) and (92), respectively. For radially symmetric 1s-state and probe functions, Eq. (124) may be written as:

$$c_n(\mathbf{r}_{kl} - \boldsymbol{\beta}_n) = c_n(|\mathbf{r}_{kl} - \boldsymbol{\beta}_n|) = 2\pi \int_0^{\infty} P(g)\, \Phi_{1s,n}(g)\, J_0\left( 2\pi g |\mathbf{r}_{kl} - \boldsymbol{\beta}_n| \right) g\, dg. \tag{125}$$

This is an elementary result of the theory of Bessel functions, where $J_0(\cdot)$
is the zeroth-order Bessel function of the first kind and

$$|\mathbf{r}_{kl} - \boldsymbol{\beta}_n| = \sqrt{(x_k - \beta_{xn})^2 + (y_l - \beta_{yn})^2} \tag{126}$$

represents the distance from the probe to the $n$th atom column.
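The radial integral of Eq. (125) can be evaluated numerically, as in the following minimal sketch. Several ingredients are assumptions made only for illustration: the Fourier transform of the 1s-state is replaced by a generic Gaussian `amplitude * exp(-b * g**2)` (Eq. (90) is not reproduced here), the probe is taken as aperture-limited with an optional aberration phase `chi`, and the Bessel function is computed from its integral representation rather than from a library.

```python
import cmath
import math


def bessel_j0(x, n_theta=200):
    """J0(x) = (1/pi) * integral_0^pi cos(x sin(theta)) dtheta (midpoint rule)."""
    h = math.pi / n_theta
    return sum(math.cos(x * math.sin((i + 0.5) * h)) for i in range(n_theta)) / n_theta


def probe_transfer(g, g_ap, chi):
    """Aperture-limited transfer function A(g) * exp(-i * chi(g))."""
    return cmath.exp(-1j * chi(g)) if g <= g_ap else 0.0


def state_ft(g, amplitude, b):
    """Hypothetical Gaussian stand-in for the Fourier transform of the 1s-state."""
    return amplitude * math.exp(-b * g * g)


def excitation_coefficient(distance, g_ap, amplitude, b, chi=lambda g: 0.0, n_g=400):
    """Eq. (125): 2*pi * int P(g) Phi(g) J0(2*pi*g*d) g dg, cut off at g_ap."""
    h = g_ap / n_g
    total = 0j
    for i in range(n_g):
        g = (i + 0.5) * h
        total += (probe_transfer(g, g_ap, chi) * state_ft(g, amplitude, b)
                  * bessel_j0(2.0 * math.pi * g * distance) * g)
    return 2.0 * math.pi * total * h


# Aberration-free probe centred on the column, and 1 Angstrom away from it.
c_on = excitation_coefficient(0.0, g_ap=0.56, amplitude=1.0, b=2.0)
c_off = excitation_coefficient(1.0, g_ap=0.56, amplitude=1.0, b=2.0)
```

For the aberration-free case with the probe on the column, the integral has the closed form $\pi\,\mathrm{amplitude}\,(1 - e^{-b g_{ap}^2})/b$, which gives a convenient check of the quadrature. The excitation drops as the probe moves away from the column, as expected from the $J_0$ factor.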

The illuminating STEM probe $p(\mathbf{r})$ is the inverse Fourier transform of the coherent transfer function of the objective lens $P(\mathbf{g})$:

$$p(\mathbf{r}) = \mathcal{F}^{-1}_{\mathbf{g} \to \mathbf{r}} P(\mathbf{g}). \tag{127}$$

The transfer function $P(\mathbf{g})$ is radially symmetric and given by:

$$P(\mathbf{g}) = P(g) = A(g) \exp\left( -i\chi(g) \right), \tag{128}$$

where $g = |\mathbf{g}|$ is the Euclidean norm of the two-dimensional spatial frequency vector. The circular aperture function $A(g)$ is defined in the same way as in Eq. (95):

$$A(g) = \begin{cases} 1 & \text{if } g \le g_{ap} \\ 0 & \text{if } g > g_{ap} \end{cases} \tag{129}$$

with $g_{ap}$ the objective aperture radius. Notice that the objective aperture semi-angle $\alpha_o$ is equal to $g_{ap}\lambda$. The phase shift $\chi(g)$, resulting from the objective lens aberrations, is radially symmetric and is defined in the same way as in Eq. (96) (van Dyck, 2002):

$$\chi(g) = \pi \varepsilon \lambda g^2 + \frac{1}{2} \pi C_s \lambda^3 g^4 \tag{130}$$

with $\varepsilon$ the defocus, $\lambda$ the electron wavelength, and $C_s$ the spherical aberration constant. Other aberration effects, such as 2-fold astigmatism, 3-fold astigmatism, and axial coma, could also be included in this phase shift (Thust, Overwijk, Coene, and Lentzen, 1996). From the comparison of Eq. (128) with Eq. (94), where the microscope's transfer function for CTEM is described, it follows that, apart from the damping envelope functions describing partial spatial and temporal coherence in CTEM, both equations are equal to one another. In the present work, temporal incoherence will not be taken into account since STEM imaging is suspected to be robust to chromatic aberration (Batson, Dellby, and Krivanek, 2002; Krivanek, Dellby, and Nellist, 2002; Nellist and Pennycook, 1998, 2000). Furthermore, spatial incoherence, resulting from a finite source image, will be incorporated in the model in the next section.

2. The Image Intensity Distribution

From the expression for the exit wave, which has been obtained in the previous section, the image intensity distribution may be computed. The exit wave, as given by Eq.
(121), describes the interaction of the electron probe, which is located at a given position, and the object. The steps needed in proceeding from the exit wave to the image intensity distribution are the

following ones. First, the propagation from this exit wave to the detector, which is placed in the back focal plane beyond the object, is described as the Fourier transform of the exit wave. Next, the intensity pattern in the detector plane is given by the modulus square of the thus obtained wave. This is the so-called convergent-beam electron diffraction pattern. Then, the part of this pattern that reaches the detector is integrated and displayed as a function of the probe position. In this way, an expression for the image intensity distribution may be obtained. At this stage, the microscope parameters describing the detector will enter the model. From the procedure described above, it follows that the total detected intensity in the Fourier detector plane of a STEM is given by (Cowley, 1976):

$$I_{ps}(\mathbf{r}_{kl}) = \int |\Psi(\mathbf{g}, z)|^2 D(\mathbf{g})\, d\mathbf{g}, \tag{131}$$

where $\Psi(\mathbf{g}, z)$ is the two-dimensional Fourier transform of the exit wave $\psi(\mathbf{r}, z)$ and $|\Psi(\mathbf{g}, z)|^2$ describes the convergent-beam electron diffraction pattern. Furthermore, $D(\mathbf{g})$ is the detector function, which is equal to one in the detected field and equal to zero elsewhere. An expression for the two-dimensional Fourier transform of the exit wave may be obtained from combining Eqs. (91) and (121):

$$\Psi(\mathbf{g}, z) = P(\mathbf{g}) \exp(-2\pi i\, \mathbf{g} \cdot \mathbf{r}_{kl}) + \sum_{n=1}^{n_c} c_n(\mathbf{r}_{kl} - \boldsymbol{\beta}_n)\, \Phi_{1s,n}(\mathbf{g}) \exp(-2\pi i\, \mathbf{g} \cdot \boldsymbol{\beta}_n) \left[ \exp\left( -i\pi \frac{E_{1s,n}}{E_0} \frac{1}{\lambda} z \right) - 1 \right]. \tag{132}$$

Notice that it can be seen from Eqs. (131) and (132) that for identical atom columns, the contrast varies periodically with thickness. This periodicity is the same as for CTEM, given by Eq. (107):

$$D_{1s} = \frac{2 E_0 \lambda}{E_{1s,n}}. \tag{133}$$

It is called the extinction distance. This periodic oscillation is due to dynamical effects, which have been included in the model via the channelling approximation. Generally, the extinction distance will be different for different types of atom columns. Thus far, it has been tacitly assumed that the source image may be modelled as a point.
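The thickness dependence in Eqs. (121) and (132) enters through the factor $\exp(-i\pi E_{1s,n} z / (E_0 \lambda)) - 1$, whose squared modulus equals $4\sin^2\left(\pi E_{1s,n} z / (2 E_0 \lambda)\right)$ and therefore repeats with the extinction distance of Eq. (133). The sketch below evaluates both quantities. The 1s-state energy of 100 eV is a purely hypothetical magnitude chosen for round numbers, and the code works with the absolute value of the energy, which is an assumption about the sign convention.

```python
import cmath
import math


def extinction_distance(e0_ev, wavelength, e1s_ev):
    """Eq. (133): D_1s = 2 * E_0 * lambda / E_1s,n (magnitude of E_1s,n assumed)."""
    return 2.0 * e0_ev * wavelength / abs(e1s_ev)


def thickness_factor(z, e0_ev, wavelength, e1s_ev):
    """Thickness-dependent factor exp(-i*pi*(E_1s,n/E_0)*z/lambda) - 1 of Eq. (121)."""
    phase = math.pi * abs(e1s_ev) / e0_ev * z / wavelength
    return cmath.exp(-1j * phase) - 1.0


E0 = 300.0e3        # eV, Table 15
WAVELENGTH = 0.02   # Angstrom, Table 15
E1S = 100.0         # eV, hypothetical illustrative magnitude

d_1s = extinction_distance(E0, WAVELENGTH, E1S)                 # 120 Angstrom here
at_half = abs(thickness_factor(d_1s / 2, E0, WAVELENGTH, E1S)) ** 2
at_full = abs(thickness_factor(d_1s, E0, WAVELENGTH, E1S)) ** 2
```

With these illustrative numbers, the squared modulus reaches its maximum of 4 at half the extinction distance and vanishes at the full extinction distance.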
The subscript ps in Eq. (131) therefore refers to a point source. Elaborating on the ideas given in Mory, Tence, and Colliex (1985), it follows that the finite size of the source image may be taken into account

by a two-dimensional convolution of the intensity distribution $I_{ps}(\mathbf{r}_{kl})$ with the intensity distribution of the source image $S(\mathbf{r})$:

$$I(\mathbf{r}_{kl}) = I_{ps}(\mathbf{r}_{kl}) \ast S(\mathbf{r}_{kl}). \tag{134}$$

The effect of the source image is thus an additional blurring. A realistic form for the intensity distribution of the source image is Gaussian (Mory, Tence, and Colliex, 1985). The function $S(\mathbf{r})$ is thus a two-dimensional normalized Gaussian distribution given by:

$$S(\mathbf{r}) = S(r) = \frac{1}{2\pi\sigma^2} \exp\left( -\frac{r^2}{2\sigma^2} \right), \tag{135}$$

with $\sigma$ the standard deviation, representing the width corresponding to the radius containing 39% of the total intensity of $S(\mathbf{r})$. Up to now, no assumptions have been made about the shape or size of the detector. From now on, however, the detector is assumed to be radially symmetric. Mathematically, this means that $D(\mathbf{g}) = D(g)$. Insight into the expression given by the right-hand member of Eq. (134) is obtained if it is split up into three terms:

$$I(\mathbf{r}_{kl}) = I_0 + I_1(\mathbf{r}_{kl}) + I_2(\mathbf{r}_{kl}). \tag{136}$$

The zeroth order term $I_0$ corresponds to a non-interacting probe, the first order term $I_1(\mathbf{r}_{kl})$ to the interference between the probe and the 1s-state, and the second order term $I_2(\mathbf{r}_{kl})$ to the interference of different 1s-states. The zeroth order term $I_0$ is given by:

$$I_0 = \left[ \int \left| P(\mathbf{g}) \exp(-2\pi i\, \mathbf{g} \cdot \mathbf{r}_{kl}) \right|^2 D(\mathbf{g})\, d\mathbf{g} \right] \ast S(\mathbf{r}_{kl}). \tag{137}$$

It describes a constant background intensity, resulting from the non-interacting electrons collected by the detector. This equation may be rewritten by substitution of Eq. (128) and using the fact that $D(\mathbf{g})$ is radially symmetric. This results in:

$$I_0 = 2\pi \int A^2(g)\, D(g)\, g\, dg. \tag{138}$$

Due to the definition of the aperture function, given by Eq. (129), the following equality may be used:

$$A^2(g) = A(g). \tag{139}$$

Therefore, Eq. (138) becomes:

$$I_0 = 2\pi \int A(g)\, D(g)\, g\, dg. \tag{140}$$
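Equation (140) reduces to the area of overlap between the aperture disc and the detector. The following sketch compares that closed form with a direct midpoint-rule evaluation of the integral. It assumes idealized step-function detectors: an axial detector collecting all $g \le g_{det}$ and an annular detector collecting all $g \ge g_{det}$, with its outer edge taken to lie beyond the aperture. The function names are hypothetical.

```python
import math


def detector(g, g_det, kind):
    """Idealized detector function D(g): 1 inside the detected field, 0 elsewhere."""
    if kind == "axial":
        return 1.0 if g <= g_det else 0.0
    if kind == "annular":
        return 1.0 if g >= g_det else 0.0
    raise ValueError("kind must be 'axial' or 'annular'")


def background_closed_form(g_ap, g_det, kind):
    """I_0 of Eq. (140) as the overlap area of aperture disc and detector."""
    if kind == "axial":
        return math.pi * min(g_ap, g_det) ** 2
    if kind == "annular":
        return math.pi * max(g_ap ** 2 - g_det ** 2, 0.0)
    raise ValueError("kind must be 'axial' or 'annular'")


def background_numerical(g_ap, g_det, kind, n=20000):
    """Midpoint-rule evaluation of 2*pi * int A(g) D(g) g dg over 0 <= g <= g_ap."""
    h = g_ap / n
    total = sum(detector((i + 0.5) * h, g_det, kind) * (i + 0.5) * h for i in range(n))
    return 2.0 * math.pi * total * h


bright_field = background_closed_form(0.56, 0.20, "axial")    # pi * 0.2**2
dark_field = background_closed_form(0.56, 1.12, "annular")    # no overlap: 0.0
```

Under these assumptions, the background term vanishes as soon as the inner radius of the annular detector is at least as large as the aperture radius, which is the dark-field situation.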

The first order term $I_1(\mathbf{r}_{kl})$ corresponds to the interference of the incident probe $p(\mathbf{r} - \mathbf{r}_{kl})$ and the 1s-state $\phi_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n)$:

$$I_1(\mathbf{r}_{kl}) = \left\{ \sum_{n=1}^{n_c} 2\,\mathrm{Re}\left[ c_n(|\mathbf{r}_{kl} - \boldsymbol{\beta}_n|) \left[ \exp\left( -i\pi \frac{E_{1s,n}}{E_0} \frac{1}{\lambda} z \right) - 1 \right] 2\pi \int P^{*}(g)\, \Phi_{1s,n}(g)\, J_0\left( 2\pi g |\mathbf{r}_{kl} - \boldsymbol{\beta}_n| \right) D(g)\, g\, dg \right] \right\} \ast S(\mathbf{r}_{kl}). \tag{141}$$

This is a linear term in the sense that contributions of different atom columns are added. The second order term $I_2(\mathbf{r}_{kl})$ describes the interference of different 1s-states $\phi_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n)$ and $\phi_{1s,m}(\mathbf{r} - \boldsymbol{\beta}_m)$:

$$I_2(\mathbf{r}_{kl}) = \left\{ \sum_{n=1}^{n_c} \sum_{m=1}^{n_c} c_n(|\mathbf{r}_{kl} - \boldsymbol{\beta}_n|)\, c^{*}_m(|\mathbf{r}_{kl} - \boldsymbol{\beta}_m|) \left[ \exp\left( -i\pi \frac{E_{1s,n}}{E_0} \frac{1}{\lambda} z \right) - 1 \right] \left[ \exp\left( +i\pi \frac{E_{1s,m}}{E_0} \frac{1}{\lambda} z \right) - 1 \right] 2\pi \int \Phi_{1s,n}(g)\, \Phi_{1s,m}(g)\, J_0\left( 2\pi g\, d_{n,m} \right) D(g)\, g\, dg \right\} \ast S(\mathbf{r}_{kl}), \tag{142}$$

where

$$d_{n,m} = |\boldsymbol{\beta}_n - \boldsymbol{\beta}_m| = \sqrt{(\beta_{xn} - \beta_{xm})^2 + (\beta_{yn} - \beta_{ym})^2} \tag{143}$$

is the distance between the atom columns at positions $\boldsymbol{\beta}_n$ and $\boldsymbol{\beta}_m$. It is only the last term $I_2(\mathbf{r}_{kl})$ of Eq. (136) that remains for annular dark-field STEM using an annular detector with an inner collection radius $g_{det} = \alpha_D/\lambda$ greater than or equal to the objective aperture radius $g_{ap} = \alpha_o/\lambda$. The terms $I_0$ and $I_1(\mathbf{r}_{kl})$ of Eq. (136), given by Eqs. (140) and (141), respectively, are equal to zero since:

$$P(g)\, D(g) = 0, \tag{144}$$

or, equivalently,

$$A(g)\, D(g) = 0. \tag{145}$$

Therefore, Eq. (142) describes the image intensity distribution for annular dark-field STEM. It can be shown that this result agrees with the result as derived in Pennycook, Rafferty, and Nellist (2000).

3. The Image Recording

Finally, the expectation model, describing the expected number of electrons recorded by the detector, will be derived. In a STEM, the illuminating electron probe scans in a raster over the object. The image is

thus recorded as a function of the probe position $\mathbf{r}_{kl} = (x_k\ y_l)^T$. Without loss of generality, the image magnification is ignored. Therefore, the probe position $\mathbf{r}_{kl} = (x_k\ y_l)^T$ directly corresponds to an image pixel at the same location. The recording device is characterized as consisting of $K \times L$ equidistant pixels of area $\Delta x \times \Delta y$, where $\Delta x$ and $\Delta y$ are the probe sampling distances in the $x$ and $y$ directions, respectively. Pixel $(k, l)$ corresponds to position $(x_k\ y_l)^T = (x_1 + (k-1)\Delta x\ \ y_1 + (l-1)\Delta y)^T$ with $k = 1, \ldots, K$ and $l = 1, \ldots, L$, and $(x_1\ y_1)^T$ represents the position of the pixel in the bottom left corner of the field of view (FOV). The FOV is chosen centered about $(0\ 0)^T$. Assuming a recording time $\tau$ for one pixel and a probe current $I$, the number of electrons per probe position is given by:

$$\frac{I\tau}{e} \tag{146}$$

with $e \approx 1.6 \times 10^{-19}$ C the electron charge. The recording time $\tau$ for one pixel is the ratio of the recording time $t$ for the whole FOV to the total number of pixels $KL$:

$$\tau = \frac{t}{KL}. \tag{147}$$

The total number of incident electrons $N_i$ is equal to:

$$N_i = KL\, \frac{I\tau}{e}. \tag{148}$$

The probe current $I$ is given by (Barth and Kruit, 1996):

$$I = \frac{B_r E_0\, \pi^2 d_{I50}^2\, \alpha_o^2}{4e} \tag{149}$$

with $B_r$ the reduced brightness of the electron source, $E_0$ the incident electron energy, $d_{I50}$ the diameter of the source image containing 50% of the current, and $\alpha_o$ the objective aperture semi-angle. From Eq. (135), it follows that

$$d_{I50} = 2\sqrt{-2 \ln 0.5}\; \sigma. \tag{150}$$

As a consequence of the detector shape and size in STEM, only the electrons within a selected part of the convergent-beam electron diffraction pattern are used to produce the image. Mathematically, this is expressed in Eq. (131). The selected part is determined by the detector function $D(\mathbf{g})$. Suppose that $f_{kl}$ represents the fraction of electrons collected by the detector. Then, the expected number of electrons $\lambda_{kl}$ at the pixel $(k, l)$ equals (Reimer, 1993):

$$\lambda_{kl} = f_{kl}\, \frac{I\tau}{e}. \tag{151}$$

The fraction $f_{kl}$, which is smaller than 1, may be expressed as:

$$f_{kl} = \frac{I(\mathbf{r}_{kl})}{I_{D=1}} \tag{152}$$

with $I(\mathbf{r}_{kl})$ given by Eq. (134) and $I_{D=1}$ the constant intensity obtained if the detector function $D(\mathbf{g})$ is uniform. From straightforward calculations, using Eqs. (136)-(142), it follows that:

$$I_{D=1} = 2\pi \int A(g)\, g\, dg. \tag{153}$$

The total number of detected electrons $N$ to form the image is now equal to:

$$N = \sum_{k=1}^{K} \sum_{l=1}^{L} f_{kl}\, \frac{I\tau}{e}. \tag{154}$$

Then, the dose efficiency DE, which is defined as the ratio of the number of detected electrons to the number of incident electrons, becomes:

$$\mathrm{DE} = \frac{N}{N_i} = \frac{\sum_{k=1}^{K} \sum_{l=1}^{L} f_{kl}}{KL}. \tag{155}$$

This follows directly from Eqs. (148) and (154). For STEM, the observations are electron counting results, which are supposed to be Poisson distributed and statistically independent. Therefore, the joint probability density function of the observations $P(\boldsymbol{\omega}; \boldsymbol{\beta})$, representing the parametric statistical model of the observations, is given by Eq. (10), where the total number of observations is equal to $K \times L$ and the expectation model is given by Eq. (151). The parameter vector $\boldsymbol{\beta} = (\beta_{x1} \ldots \beta_{xn_c}\ \beta_{y1} \ldots \beta_{yn_c})^T$ consists of the $x$- and $y$-coordinates of the atom column positions to be estimated. In the following section, the experimental design resulting in the highest attainable precision with which the elements of the vector $\boldsymbol{\beta}$ can be estimated will be derived from the joint probability density function of the observations.

C. Statistical Experimental Design

In this section, optimal statistical experimental designs of STEM experiments will be derived in the sense of the microscope settings resulting in the highest attainable precision with which the position coordinates of the

atom columns can be estimated. The STEM observations are described by the parametric statistical model derived in Section V.B. This model will be used to obtain an expression for the attainable precision, which is represented by the CRLB associated with the position coordinates. In Section II, it has been explained how an expression for the CRLB may be derived. Next, a scalar measure of this CRLB, that is, a function of the matrix elements of the CRLB, will be chosen as optimality criterion. This criterion will then be evaluated and optimized as a function of the microscope settings. An overview of these microscope settings will be given in Section V.C.1. Some of them are tunable, while others are fixed properties of the electron microscope. Next, in Section V.C.2, the results of the numerical evaluation and optimization of the microscope settings will be presented for both isolated and neighboring atom columns. The section is concluded by simulation experiments to find out if the maximum likelihood estimator attains the CRLB and if it is unbiased. If so, this justifies the choice of the CRLB as optimality criterion. Finally, in Section V.C.3, an interpretation of the numerical optimization results will be given. The object thickness, the energy of the atom columns, and the microscope settings are supposed to be known. However, the following analysis may relatively easily be extended to include the case in which these or even more parameters are unknown and hence have to be estimated simultaneously.

1. Microscope Parameters

An overview of the microscope settings, which enter the parametric statistical model of the STEM observations, is given in this section. For simplicity, some of these settings will be kept constant in the evaluation and optimization of the experimental design.
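Several of the recording-related settings listed in this section enter the model only through the counting relations of Section V.B.3, Eqs. (146) to (155). The following sketch collects those relations. All numerical inputs are hypothetical, the function names are illustrative, the incident energy is passed as an accelerating voltage in volts (that is, $E_0/e$), and the symbol for the pixel recording time follows the convention that the whole-FOV time is divided by the number of pixels.

```python
import math

ELECTRON_CHARGE = 1.602176634e-19  # elementary charge in coulomb


def source_diameter_50(sigma):
    """Eq. (150): diameter of a Gaussian source image containing 50% of the current."""
    return 2.0 * math.sqrt(-2.0 * math.log(0.5)) * sigma


def probe_current(reduced_brightness, voltage, d_i50, alpha_o):
    """Eq. (149) with E_0 / e expressed as a voltage (SI units throughout)."""
    return reduced_brightness * voltage * math.pi ** 2 * d_i50 ** 2 * alpha_o ** 2 / 4.0


def incident_electrons(current, total_time, k, l):
    """Eqs. (146)-(148): total number of incident electrons for a K x L raster."""
    pixel_time = total_time / (k * l)
    return k * l * current * pixel_time / ELECTRON_CHARGE


def expected_counts(fractions, current, pixel_time):
    """Eq. (151): expected number of detected electrons for every pixel."""
    return [[f * current * pixel_time / ELECTRON_CHARGE for f in row] for row in fractions]


def dose_efficiency(fractions):
    """Eq. (155): mean collected fraction over all pixels."""
    flat = [f for row in fractions for f in row]
    return sum(flat) / len(flat)


fractions = [[0.1, 0.2], [0.3, 0.4]]  # hypothetical f_kl on a 2 x 2 raster
current = 1.0e-11                      # hypothetical probe current of 10 pA
counts = expected_counts(fractions, current, pixel_time=1.0e-6)
n_detected = sum(sum(row) for row in counts)
n_incident = incident_electrons(current, total_time=4.0e-6, k=2, l=2)
```

In this toy raster, the ratio of detected to incident electrons equals the mean of the collected fractions, 0.25, as Eq. (155) states.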
The settings describing the electron probe are the defocus $\varepsilon$, the spherical aberration constant $C_s$, the objective aperture radius $g_{ap}$, the electron wavelength $\lambda$, the width of the source image $\sigma$, and the reduced brightness $B_r$ of the electron source. The electron wavelength and the reduced brightness of the electron source are fixed properties of a given electron microscope. In the evaluation of the experimental design, the electron wavelength will be kept constant. Furthermore, the effect of the reduced brightness on the precision with which atom column positions can be estimated will be studied. For most electron microscopes, the spherical aberration constant is a fixed property of the microscope as well. However, by incorporating a spherical aberration corrector, it becomes tunable. Therefore, it is interesting to study the effect of the spherical aberration constant on the precision. The microscope settings specifying the detector configuration are related to the detector function $D(\mathbf{g})$. In principle, the detector may have any shape

or size. However, in this article, the shape of the detector is confined to the more common ones, which are annular and axial detectors. The inner or outer collection radius $g_{det}$ or semi-angle $\alpha_D$, which are related as $g_{det} = \alpha_D/\lambda$, is tunable. The microscope settings describing the image recording are the probe sampling distances or, equivalently, the pixel sizes $\Delta x$ and $\Delta y$, the number of pixels $K$ and $L$ in the $x$- and $y$-direction, respectively, and the recording time $t$. The pixel sizes $\Delta x$ and $\Delta y$ are kept constant. In agreement with the results presented in Section III, it can be shown that the precision will generally improve with smaller pixel sizes for a constant total number of incident electrons $N_i$, as defined by Eq. (148). However, below a certain pixel size, no more improvement is gained. This has to do with the fact that the pixel SNR decreases with a decreasing pixel size. Therefore, the pixel sizes are chosen in the region where no more improvement may be gained. This is similar to what is described in Bettens, van Dyck, den Dekker, Sijbers, and van den Bos (1999), den Dekker, Sijbers, and van Dyck (1999), and van Aert, den Dekker, van Dyck, and van den Bos (2002a). The number of pixels $K$ and $L$, defining the FOV for given pixel sizes $\Delta x$ and $\Delta y$, is chosen fixed, but large enough so as to guarantee that the tails of the electron probe are collected in the FOV.

2. Numerical Results

In this section, the experimental designs will be numerically evaluated and optimized in terms of the attainable precision with which atom column positions can be estimated. This section will be divided into four parts. First, general comments, which should be kept in mind during the reading of this section, will be given, including an overview of the original, non-optimized microscope settings and of the structure parameters. Second, the optimal experimental designs for isolated atom columns will be computed.
Third, the influence of neighboring atom columns on the optimal experimental design will be discussed. Finally, simulation experiments will be carried out to find out if the maximum likelihood estimator attains the CRLB and if it is unbiased. If so, this justifies the choice of the CRLB as optimality criterion.

a. General Comments. In this section, general comments will be given, which should be kept in mind during further reading. An overview of the original microscope settings and the structure parameters of the objects under study will be given.

i. Original and Optimal Microscope Settings. In what follows, the values for the original, non-optimized microscope settings are given in Table 15, unless otherwise mentioned. These are typical values used in today's

STEM experiments.

TABLE 15
Original Microscope Settings

Microscope setting                        Value
$E_0$ (keV)                               300
$\lambda$ (Å)                             0.02
$B_r$ (A m$^{-2}$ sr$^{-1}$ V$^{-1}$)
$C_s$ (mm)                                0.5
$\Delta x$ (Å)                            0.2
$\Delta y$ (Å)                            0.2
$K$                                       100
$L$                                       100
$t$ (s)

Furthermore, in the conventional approach, which is based on direct visual interpretability, the Scherzer conditions for incoherent imaging are usually applied (Pennycook and Jesson, 1991; Scherzer, 1949). Under these conditions, the objective aperture radius and defocus are given by:

$$g_{ap} = \frac{1}{\lambda} \left( \frac{4\lambda}{C_s} \right)^{1/4}, \qquad \varepsilon = -\left( C_s \lambda \right)^{1/2}, \tag{156}$$

respectively. For the microscope settings as given in Table 15, this corresponds to $g_{ap} = 0.56$ Å$^{-1}$ and $\varepsilon = -320$ Å. Moreover, the outer collection radius of an axial detector or the inner collection radius of an annular detector is usually taken much smaller or larger than the objective aperture radius, respectively (Nellist and Pennycook, 2000). To the authors' knowledge, explicit expressions for these radii do not exist. One of the guidelines that has been found in the literature is, for example, that the inner collection radius $g_{det}$ of an annular detector should be at least three times the objective aperture radius (Hartel, Rose, and Dinges, 1996). Therefore, if $g_{ap}$ is equal to 0.56 Å$^{-1}$, this corresponds to a value of $g_{det}$ larger than 1.68 Å$^{-1}$. Other researchers propose a value of two times the objective aperture radius, which is representative for a typical Crewe detector (Pennycook, Jesson, Chisholm, Browning, McGibbon, and McGibbon, 1995). For $g_{ap}$ equal to 0.56 Å$^{-1}$, this corresponds to $g_{det}$ equal to 1.12 Å$^{-1}$. It should be noticed that for such large values of the detector radius, thermal diffuse, inelastic scattering may be more important than elastic scattering. Consequently, the expectation model proposed in Section V.B, which only takes elastic scattering into account, is no longer valid. For example, the oscillation of the detected intensity as a function of thickness with a periodicity as given by Eq.
(133) is no longer observed using annular

detectors with a large inner detector radius. In Pennycook and Yan (2001), this oscillation as a function of thickness has been studied for a rhodium atom column, where the distance between successive atoms is equal to 2.7 Å. This has been done for a small, medium, and large detector radius corresponding to a value of 0.75 Å$^{-1}$, 1.50 Å$^{-1}$, and 2.25 Å$^{-1}$, respectively. From this study, it followed that the periodic oscillation as described by the model given in Section V.B applies for the small detector radius, whereas this oscillation is almost completely suppressed for the large detector radius. Therefore, in the present study, the evaluation of the inner detector radius of annular detectors will be restricted to small values. It will be shown that this constraint does not cause problems for the computation of the optimal detector radius in terms of attainable precision. The optimal value of the detector radius will be shown to be much smaller than the values usually taken. In the remainder of this section, the values for the microscope settings which are usually preferred in STEM experiments, as described above, will be compared to their optimal values in terms of attainable precision. These optimal values are found by optimizing the attainable precision for all microscope settings simultaneously. This corresponds to an iterative, numerical optimization procedure in the space of microscope settings, of which the dimension is equal to the number of microscope settings. It has been found that some of these microscope settings are strongly correlated. This implies that the optimization cannot be performed one setting at a time. For example, it will be shown that the optimal detector radius strongly depends on the aperture radius. Furthermore, the optimal defocus value strongly depends on the spherical aberration constant.
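For reference, the conventional Scherzer values of Eq. (156), against which the optimal settings are compared, can be reproduced from the Table 15 entries with a few lines of code. The sketch also evaluates the aberration phase of Eq. (130) at the aperture edge. The function names are illustrative, and all lengths are expressed in Å.

```python
import math


def scherzer_incoherent(cs, wavelength):
    """Eq. (156): aperture radius (1/Angstrom) and defocus (Angstrom)."""
    g_ap = (4.0 * wavelength / cs) ** 0.25 / wavelength
    defocus = -math.sqrt(cs * wavelength)
    return g_ap, defocus


def aberration_phase(g, defocus, cs, wavelength):
    """Eq. (130): chi(g) = pi*eps*lambda*g^2 + 0.5*pi*C_s*lambda^3*g^4."""
    return (math.pi * defocus * wavelength * g ** 2
            + 0.5 * math.pi * cs * wavelength ** 3 * g ** 4)


CS = 0.5e7         # 0.5 mm expressed in Angstrom (Table 15)
WAVELENGTH = 0.02  # Angstrom (Table 15)

g_ap, defocus = scherzer_incoherent(CS, WAVELENGTH)           # about 0.56 and -316
chi_edge = aberration_phase(g_ap, defocus, CS, WAVELENGTH)    # essentially zero
```

The computed values agree with the rounded figures quoted in the text, and they show how the conventional defocus is tied to the spherical aberration constant. Under these conditions the defocus and spherical aberration contributions to the phase cancel exactly at the aperture edge.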
In what follows, the results following from this simultaneous optimization procedure will be described setting by setting. The relation of each setting to other microscope settings will be mentioned if necessary. In what follows, the attainable precision will be computed as a function of the following microscope settings:

• Objective aperture radius
• Radius of annular and axial detectors
• Defocus
• Spherical aberration constant
• Reduced brightness of the electron source
• Width of the source image

For isolated atom columns, the width of the source image, which is determined by Eq. (135), will be kept constant in the following sense. The diameter $d_{I50}$ of the source image containing 50% of the current will be assumed to be determined by the objective aperture angle $\alpha_o$, following the relation (Barth and Kruit, 1996)

$$d_{I50} = \frac{0.54\,\lambda}{\alpha_o}. \tag{157}$$

The right-hand member of this equation is equal to the diameter of the diffraction-error disc containing 50% of the total intensity. Consequently, the contribution of the source image to the total probe size is rather small. If Eq. (157) is met, it follows from Eq. (149) that the probe current is constant and proportional to the reduced brightness $B_r$ (Barth and Kruit, 1996). This implies that, for a fixed recording time, the total number of incident electrons per square Å is constant as a function of the microscope settings. The reason why the diameter of the source image is assumed to be determined by the diffraction-error disc, instead of being tunable, is the following. For isolated atom columns, the optimal diameter $d_{I50}$ would be infinite, corresponding to an infinite probe current, as follows from Eq. (149). However, an infinite source image is not realistic, since neighboring atom columns would then strongly overlap. Therefore, the dependence of the precision on the tunable source diameter will be studied for neighboring atom columns only.

TABLE 16
Width of the 1s-State, Its Energy (Debye-Waller Factor = 0.6 Å² and E_0 = 300 keV), Interatomic Distance, and Atomic Number for Different Atom Columns
Column types: Si [100], Si [110], Sr [100], Sn [100], Cu [100], Au [100]
Structure parameters: a_n (Å), E_1s,n (eV), d (Å), Z

ii. Structure Parameters. The evaluation and optimization of the attainable precision as a function of the microscope settings will be done for different types of atom columns. The atom columns that will be considered are given in Table 16, together with the corresponding width $a_n$ of the 1s-state, its energy $E_{1s,n}$, the interatomic distance $d$, that is, the distance between successive
atoms along a column, and the atomic number $Z$ of these atoms. The other structure parameters of the object under study, such as the atom column positions and the object thickness, will be given in the following parts.

b. Isolated Atom Columns

i. Structure Parameters. For isolated atom columns, the atom column positions and the object thickness are given in Table 17. The object thickness is equal to half the extinction distance, which is given by Eq. (133). From the model proposed in Section V.B, it follows that at this thickness, and at thicknesses equal to odd multiples of half the extinction distance, the electrons are strongly localized at the atom column positions.

ii. Optimality Criterion. The optimal statistical experimental design will be described by the microscope settings resulting in the highest attainable precision with which the position coordinates $\beta = (\beta_x\ \beta_y)^T$ of the atom column can be estimated. This attainable precision (in terms of the variance) is represented by the diagonal elements $\sigma^2_{\beta_x}$ and $\sigma^2_{\beta_y}$ of the CRLB. These elements are theoretical lower bounds on the variance with which the position coordinates can be estimated without bias. An expression for them will be derived in the following paragraph. This derivation is completely analogous to the one presented in Section IV.C.2 for CTEM and may therefore be skipped by the reader who is already familiar with it. For an isolated atom column, the CRLB is equal to the inverse of the 2 × 2 Fisher information matrix $F$ associated with the position coordinates. The $(r, s)$th element of $F$ is defined by Eq. (12):

$$F_{rs} = \sum_{k=1}^{K} \sum_{l=1}^{L} \frac{1}{\lambda_{kl}} \frac{\partial \lambda_{kl}}{\partial \beta_r} \frac{\partial \lambda_{kl}}{\partial \beta_s} \tag{158}$$

with $\lambda_{kl}$ the expected number of electrons at the pixel $(k, l)$. An expression for the elements $F_{rs}$ is found by substituting the expectation model given by Eq. (151), as derived in Section V.B, and its derivatives with respect to the position coordinates into Eq. (158).
Explicit numbers for these elements are obtained by substituting the values of a given set of microscope settings and structure parameters of the object into the obtained expression for $F_{rs}$.

TABLE 17
Structure Parameters of an Isolated Atom Column

Structure parameter    Value
β_x (Å)                0
β_y (Å)                0
z (Å)                  E_0 λ / |E_1s,n|
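The substitution described above can be sketched numerically. The sketch below evaluates Eq. (158) and inverts the resulting Fisher information matrix for a radially symmetric Gaussian expectation model. This Gaussian peak is an assumption made for illustration and is not the STEM expectation model of Eq. (151); the function names, the number of electrons, the peak width, and the pixel size are likewise hypothetical.

```python
import numpy as np

def expected_counts(beta, x, y, n_electrons, width, pixel):
    """Expected counts per pixel for a Gaussian peak at beta (assumed model)."""
    r2 = (x - beta[0]) ** 2 + (y - beta[1]) ** 2
    peak = np.exp(-r2 / (2.0 * width ** 2)) / (2.0 * np.pi * width ** 2)
    return n_electrons * peak * pixel ** 2

def fisher_information(beta, x, y, n_electrons, width, pixel):
    """Fisher matrix of the position coordinates for Poisson counts, Eq. (158)."""
    lam = expected_counts(beta, x, y, n_electrons, width, pixel)
    # Analytical derivatives of the Gaussian peak with respect to beta_x, beta_y.
    grads = (lam * (x - beta[0]) / width ** 2,
             lam * (y - beta[1]) / width ** 2)
    fisher = np.empty((2, 2))
    for r in range(2):
        for s in range(2):
            fisher[r, s] = np.sum(grads[r] * grads[s] / lam)
    return fisher

pixel = 0.2                               # pixel size in Å (hypothetical)
axis = np.linspace(-6.0, 6.0, 61)         # field of view in Å (hypothetical)
x, y = np.meshgrid(axis, axis, indexing="ij")

fisher = fisher_information((0.0, 0.0), x, y, 1.0e4, 0.8, pixel)
crlb = np.linalg.inv(fisher)              # Cramér-Rao lower bound
sigma_bx, sigma_by = np.sqrt(np.diag(crlb))
print("lower bound on standard deviation (Å):", sigma_bx, sigma_by)
```

For this particular Gaussian model the bound can be checked by hand: the diagonal elements of $F$ approach $N/w^2$, with $N$ the number of electrons and $w$ the peak width, so that the lower bound on the standard deviation approaches $w/\sqrt{N}$.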

For the radially symmetrical expectation model used, the diagonal elements of the Fisher information matrix are equal to one another. Moreover, since the Fisher information matrix is symmetric, the diagonal elements of its inverse, that is, of the CRLB, are also equal to one another:

$$\sigma^2_{\beta_x} = \sigma^2_{\beta_y} = \left[F^{-1}\right]_{11} \tag{159}$$

with $[F^{-1}]_{11}$ the $(1, 1)$th element of the CRLB, that is, of $F^{-1}$. In what follows, the precision will be represented by the lower bound on the standard deviations $\sigma_{\beta_x}$ and $\sigma_{\beta_y}$, that is, the square root of the right-hand member of Eq. (159). It will be used as optimality criterion for the evaluation and optimization of the experimental design. This optimality criterion will therefore be calculated for various types of atom columns as a function of the objective aperture radius, the radius of annular and axial detectors, the defocus, the spherical aberration constant, and the reduced brightness of the electron source. In this evaluation and optimization procedure, the relevant physical constraints are taken into consideration. The relevant constraint is either the radiation sensitivity of the object under study or the specimen drift. Therefore, either the incident electron dose per square Å or the recording time has to be kept within the constraints.

iii. Optimal Objective Aperture Radius. First, the dependence of the precision on the objective aperture radius $g_{ap}$ is studied. Recall that the objective aperture radius is directly related to the objective aperture semiangle $\alpha_o$ according to the formula $\alpha_o = g_{ap}\lambda$. The precision, which is represented by the square root of the right-hand member of Eq. (159), has been evaluated as a function of the objective aperture radius for annular as well as for axial detectors and for different atom column types.
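As a small worked example of the relation $\alpha_o = g_{ap}\lambda$ and of Eq. (157), the sketch below computes the relativistic electron wavelength at 300 kV, the aperture semiangle for a given aperture radius, and the corresponding source image diameter. The function names are introduced here for illustration, and the aperture radius of 0.28 Å⁻¹ is simply the value reported further on in this section for a silicon [100] column.

```python
import math

# CODATA constants in SI units.
PLANCK = 6.62607015e-34           # J s
ELECTRON_MASS = 9.1093837015e-31  # kg
CHARGE = 1.602176634e-19          # C
LIGHT_SPEED = 299792458.0         # m / s

def electron_wavelength(voltage):
    """Relativistic electron wavelength in Å for an accelerating voltage in V."""
    energy = CHARGE * voltage
    momentum = math.sqrt(2.0 * ELECTRON_MASS * energy
                         * (1.0 + energy / (2.0 * ELECTRON_MASS * LIGHT_SPEED ** 2)))
    return PLANCK / momentum * 1.0e10

def aperture_semiangle(g_ap, wavelength):
    """Objective aperture semiangle in rad, alpha_o = g_ap * lambda."""
    return g_ap * wavelength

def source_image_diameter(wavelength, alpha_o):
    """Diameter d_I50 in Å of the source image according to Eq. (157)."""
    return 0.54 * wavelength / alpha_o

wavelength = electron_wavelength(300.0e3)
alpha_o = aperture_semiangle(0.28, wavelength)
d_i50 = source_image_diameter(wavelength, alpha_o)
print("wavelength (Å):", wavelength)
print("aperture semiangle (mrad):", 1.0e3 * alpha_o)
print("source image diameter d_I50 (Å):", d_i50)
```

Combining the two relations shows that $d_{I50} = 0.54/g_{ap}$, so that under the assumption of Eq. (157) the source image diameter depends on the aperture radius only and not on the wavelength.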
From this evaluation, it is found that the optimal aperture radius is mainly determined by the atom column type under study and that it is the same for annular and axial detectors. The optimal aperture radius turns out to be proportional to the width of the function $F_{1s,n}(g)$, that is, the Fourier transform of the 1s-state $\phi_{1s,n}(r)$ as given by Eq. (90). Figure 37 compares the optimal aperture radius with the width of $F_{1s,n}(g)$. This width is equal to $1/(4\pi a_n)$, where $a_n$ is the width of the 1s-state $\phi_{1s,n}(r)$ as given by Eq. (88). The optimal aperture radii are plotted as a function of $(d^2/Z + 0.276B)^{-1/2}$, since this term is more or less proportional to the width of the function $F_{1s,n}(g)$, as shown in van Dyck and Chen (1999a). For a given atom column, $d$ represents the interatomic distance, $Z$ the atomic number, and $B$ the Debye-Waller factor. From Figure 37, it is clear that the influence of the object on the optimal objective aperture radius is substantial. In contrast to what one might
expect, the resulting probe in the optimal design is not as narrow as possible. Its main lobe is even broader than the 1s-state $\phi_{1s,n}(r)$. This is shown in Figure 38, where both the 1s-state and the amplitude of the optimal probe are shown for a silicon [100] and a gold [100] atom column.

Figure 37. Comparison of the optimal aperture radius for $C_s$ equal to 0.5 mm with the width of the Fourier transformed 1s-state $F_{1s,n}(g)$ for different atom column types. The width of $F_{1s,n}(g)$ is proportional to $(d^2/Z + 0.276B)^{-1/2}$.

Furthermore, for heavy atom columns such as gold [100], an increase of the spherical aberration constant results in a decrease of the optimal aperture radius and vice versa. For this column, the optimal aperture radius is equal to 0.75 Å⁻¹ for $C_s$ equal to 0 mm, whereas it is equal to 0.50 Å⁻¹ for $C_s$ equal to 0.5 mm. For lighter atom columns such as silicon [100], the optimal aperture radius is independent of the spherical aberration constant. For a silicon [100] atom column, the optimal aperture radius is equal to 0.28 Å⁻¹ for $C_s$ equal to both 0 mm and 0.5 mm.

It should be noticed that the foregoing analysis was done for object thicknesses equal to half the extinction distance, as follows from Eq. (133) and Table 17. However, the conclusions remain the same for thicknesses different from half the extinction distance. Also, it should be mentioned that these conclusions do not depend on the relevant physical constraint, which is either the radiation sensitivity of the object under study or the specimen drift. From the discussion given above, it follows that there is a fundamental difference between the optimal aperture radius in terms of the attainable precision and the aperture radius as given by Eq. (156), which is assumed to
be optimal in terms of direct visual interpretability. The former depends more on the width of the 1s-state of the column under study than on the spherical aberration constant. The latter depends on the spherical aberration constant, but is independent of any structure parameter.

Figure 38. The dashed curves in the left- and right-hand figures represent the 1s-state $\phi_{1s,n}(r)$ for a silicon [100] and a gold [100] atom column, respectively. The solid curves represent the amplitude of their associated optimal electron probes, that is, $p(r)$, for $C_s$ equal to 0.5 mm.

iv. Optimal Detector Configuration. Next, the optimal detector configuration in terms of precision is described. In Figure 39, the precision with which the position coordinates of an isolated silicon [100] atom column can be estimated is evaluated as a function of the ratio of the detector radius to the aperture radius, that is, $g_{det}/g_{ap}$. For annular detectors, $g_{det}$ represents the inner collection radius of the detector, whereas for axial detectors, it represents the outer collection radius (see Figure 36). The objective aperture radius and the defocus are set to their optimal values of 0.28 Å⁻¹ and −80 Å, respectively. From this figure, the following conclusions may be derived:

• For an annular detector, the optimal detector radius equals the optimal aperture radius.
• For an axial detector, the optimal detector radius is slightly smaller than the optimal aperture radius.
• An annular detector results in higher precisions than an axial detector when operating at the optimal conditions.

Figure 39. The lower bound on the standard deviation of the position coordinates of an isolated silicon [100] atom column as a function of the detector-to-aperture radius for an annular and an axial detector. The objective aperture radius and the defocus are set to their optimal values of 0.28 Å⁻¹ and −80 Å, respectively.

It should be mentioned that in Figure 39, the optimal detector radius of the annular detector is of the same order as the aperture radius. For such detector radii, thermal diffuse scattering is unimportant. As mentioned earlier in this section, thermal diffuse scattering is not included in the expectation model given in Section V.B. The reader may wonder if the precision would be higher with a large detector radius, for which thermal diffuse scattering is dominant. This is not to be expected, since the lower bound on the standard deviation is inversely proportional to the square root of the total number of detected electrons, that is, to the signal-to-noise ratio, which in turn is inversely proportional to the detector radius. It is unlikely that the decrease of the total number of detected electrons caused by a large detector radius would be compensated by the fact that only thermal diffuse scattered electrons are detected.

Furthermore, it should be mentioned that the conclusions obtained from Figure 39 do not depend on the relevant physical constraint, which is either the radiation sensitivity of the object under study or the specimen drift. In Figure 39, the recording time as well as the number of incident electrons per square Å are fixed. The optimal detector settings do not change if, for example, longer recording times or more incident electrons per square Å would be allowed.
For different values of the recording time or of the number of incident electrons per square Å, only the actual values of the standard deviation shown in Figure 39 would be different, whereas the optimal detector settings would be the same.
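This invariance follows from the scaling of the lower bound with the number of electrons and can be sketched as follows. The curve `sigma_unit_dose` is a hypothetical shape chosen for illustration and does not reproduce Figure 39; the only property taken from the text is that the lower bound on the standard deviation is inversely proportional to the square root of the number of electrons.

```python
import math

def sigma_unit_dose(ratio):
    """Hypothetical lower bound at unit dose versus detector-to-aperture ratio."""
    return 1.0 + (ratio - 1.0) ** 2

def sigma(ratio, dose):
    """Lower bound at a given dose, scaling as one over the square root of it."""
    return sigma_unit_dose(ratio) / math.sqrt(dose)

ratios = [0.5 + 0.05 * i for i in range(31)]
best_ratios = []
for dose in (1.0e2, 1.0e4, 1.0e6):
    best = min(ratios, key=lambda r: sigma(r, dose))
    best_ratios.append(best)
    print("dose", dose, "optimal ratio", best, "lower bound", sigma(best, dose))
```

Changing the dose rescales the whole curve by a common factor, so the location of its minimum stays where it is and only the attained value of the lower bound changes.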

As mentioned earlier, the detector radius is usually taken much smaller or much larger than the objective aperture radius for an axial or an annular detector, respectively, thus aiming at optimal direct visual interpretability. However, this is typically not what is found if the attainable precision is used as optimality criterion. Then, the optimal detector radius is almost equal to the aperture radius. This has to do with the fact that the signal-to-noise ratio decreases with decreasing radius of an axial detector and with increasing radius of an annular detector. The finding that the optimal detector radius of an annular detector equals the optimal aperture radius is in agreement with the result found in Rose (1975). In that paper, the annular detector was optimized in terms of signal-to-noise ratio. Thus far, however, this guideline is usually not followed in practice, since one seems to prefer direct visual interpretability over precision, even if this visual interpretability is accompanied by a low signal-to-noise ratio.

v. Optimal Defocus Value. Subsequently, the dependence of the precision on the defocus is studied, as well as the dependence of the optimal defocus on the spherical aberration constant, the electron wavelength, and the optimal objective aperture radius. In Figure 40, the precision is evaluated for a silicon [100] atom column as a function of the defocus $\varepsilon$ and the spherical aberration constant $C_s$, for a given electron wavelength $\lambda$ and with the objective aperture radius $g_{ap}$ and the detector radius $g_{det}$ adjusted to their optimal values, both corresponding to 0.28 Å⁻¹. This evaluation is done for an annular and an axial detector. The solid white curves shown in Figure 40 are described by the relation

$$\varepsilon = -\frac{1}{2} C_s \lambda^2 g_{ap}^2. \tag{160}$$

The dotted white curves describe the numerically found optimal defocus values as a function of the considered spherical aberration constants.
From the comparison of the solid and dotted white curves, it follows that the defocus value described by Eq. (160) is close to the optimal defocus value in terms of precision. Moreover, for a given spherical aberration constant, the precision that is gained by operating at the corresponding optimal defocus instead of at the defocus given by Eq. (160) is hardly significant. Therefore, the optimal defocus value, as a function of the spherical aberration constant, the electron wavelength, and the optimal objective aperture radius, is approximately given by the empirical relation described by Eq. (160). At this defocus value, the transfer function is flattened in the sense that it is nearly equal to one over the whole angular range of the objective aperture. The optimal transfer function for a silicon [100] atom column and for a spherical aberration constant equal to 0.5 mm
is presented in Figure 41, where the arrow represents the optimal objective aperture radius.

Figure 40. The lower bound on the standard deviation of the position coordinates of an isolated silicon [100] atom column as a function of the spherical aberration constant and the defocus. The left- and right-hand figures represent the results for an annular and an axial detector, respectively. The objective aperture radius and detector radius are adjusted to their optimal values, both corresponding to 0.28 Å⁻¹. The solid white curves are described by Eq. (160) and the dotted white curves describe the numerically found optimal defocus values as a function of the considered spherical aberration constants.

Equation (160) is derived from Eq. (130) by setting the phase shift $\chi(g)$ exactly to zero for $g = g_{ap}$, with $g_{ap}$ the optimal objective aperture radius. These findings do not depend on whether the radiation sensitivity of the object under study or the specimen drift determines the relevant physical constraint. From now on, the defocus will be adjusted to the value given by Eq. (160). From the comparison of the optimal defocus in terms of direct visual interpretability as given by Eq. (156) with the optimal defocus in terms of precision as given by Eq. (160), it follows that the relation between the defocus and the objective aperture radius is the same for both optimality criteria. Nevertheless, the explicit numbers for the defocus are different, since the optimal aperture radius is different for both optimality criteria.
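The defocus of Eq. (160) can be evaluated for the silicon [100] example used above. Since Eq. (130) is not reproduced in this section, the sketch below assumes the commonly used form $\chi(g) = \pi \varepsilon \lambda g^2 + \frac{1}{2}\pi C_s \lambda^3 g^4$ for the phase shift, in which the defocus that cancels the spherical aberration at $g_{ap}$ is negative. Both this form and the function names are assumptions made for illustration.

```python
import math

# CODATA constants in SI units.
PLANCK = 6.62607015e-34           # J s
ELECTRON_MASS = 9.1093837015e-31  # kg
CHARGE = 1.602176634e-19          # C
LIGHT_SPEED = 299792458.0         # m / s

def electron_wavelength(voltage):
    """Relativistic electron wavelength in Å for an accelerating voltage in V."""
    energy = CHARGE * voltage
    momentum = math.sqrt(2.0 * ELECTRON_MASS * energy
                         * (1.0 + energy / (2.0 * ELECTRON_MASS * LIGHT_SPEED ** 2)))
    return PLANCK / momentum * 1.0e10

def phase_shift(g, defocus, cs, wavelength):
    """Assumed phase shift chi(g); lengths in Å, g in 1/Å (form is an assumption)."""
    return (math.pi * defocus * wavelength * g ** 2
            + 0.5 * math.pi * cs * wavelength ** 3 * g ** 4)

def defocus_eq160(cs, wavelength, g_ap):
    """Defocus in Å for which the assumed chi(g) vanishes at g = g_ap."""
    return -0.5 * cs * wavelength ** 2 * g_ap ** 2

wavelength = electron_wavelength(300.0e3)   # 300 kV
cs = 0.5e7                                  # 0.5 mm expressed in Å
g_ap = 0.28                                 # optimal aperture radius in 1/Å
defocus = defocus_eq160(cs, wavelength, g_ap)
print("defocus from Eq. (160) (Å):", defocus)
print("phase shift at g_ap (rad):", phase_shift(g_ap, defocus, cs, wavelength))
```

Under these assumptions the magnitude of the resulting defocus is close to the 80 Å quoted for the silicon [100] column, which is consistent with the statement that Eq. (160) approximates the numerically found optimum.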
