2016 Midterm Exam (100 points total)    Name:
Astronomy/Planetary Sciences 518/418    Profs. Hinz/Rieke
If you need more space, use the reverse side of the relevant page, but let us know by writing (over) at the lower right. Cell phone for questions: 520-245-8339.

1. Concept questions; define in a sentence or two the following concepts (4 points per question, total of 28) (full credit for five correct for Ast. 418):

a. The Poisson distribution, including an example.
The Poisson distribution is an approximation to the binomial distribution valid in the limiting case of a large number of events, each with a small probability of success. Each event must be a "yes" or a "no" - none in between. The equation is

P(n) = mu^n e^(-mu) / n!

where mu is the mean number of events expected. (Example: the number of photons detected from a faint source in a fixed time interval.)

b. A prior distribution in Bayesian analysis.
A prior distribution is the starting point for Bayesian analysis - it is the information available before the new data are brought in. It can take results from previous experiments, from related analyses, or from anywhere that gives a reasonable starting point.

c. Bootstrap Analysis
This is a way of testing the validity of a statistical analysis. One draws sets of data randomly from the whole collection of data and analyzes them separately to be sure that they yield consistent results independently, and results consistent with those from the whole sample.

d. Principal Component Analysis.
This is a technique in which linear algebra is used to determine the minimum number of independent inputs needed to fit the data - an input might be some kind of spectral template of a suspected source component, for example.

e. Exit Pupil
This is where there is an image of the entrance pupil in an optical system. In the case of a telescope, it is where there is an image of the primary mirror downstream optically.

f. detective quantum efficiency (DQE)
The DQE is the ratio of the square of the actual signal to noise out of a detector system to the square of the intrinsic signal to noise in the input photon stream.
The latter is typically the square root of the number of photons received in a given time interval. g. fully depleted CCD A fully depleted CCD is one built with a thick absorbing layer between the back side, where the photons enter, and the gates. This thick layer enhances the red quantum efficiency. To get the photoelectrons to the gates, a transparent contact is usually put on the back side and a voltage established across the absorbing layer.
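The Poisson limit of the binomial distribution described in (1a) can be checked numerically; the sketch below compares the two for many trials with a small per-trial probability (the pixel/cosmic-ray scenario and all numbers are invented for illustration):

```python
import math

def poisson_pmf(n, mu):
    # P(n) = mu^n e^(-mu) / n!  - probability of n events when mu are expected
    return mu**n * math.exp(-mu) / math.factorial(n)

def binomial_pmf(n, N, p):
    # exact binomial probability of n successes in N yes/no trials
    return math.comb(N, n) * p**n * (1 - p)**(N - n)

# hypothetical example: cosmic-ray hits on N = 10000 pixels,
# each pixel hit with small probability p = 0.0005 in one exposure
N, p = 10000, 0.0005
mu = N * p   # expected number of hits = 5
for n in range(10):
    print(n, binomial_pmf(n, N, p), poisson_pmf(n, mu))
```

For these values the two distributions agree to a few parts in 10^4, illustrating why the Poisson form is a safe replacement when N is large and p is small.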
2. Give short answers to the following questions, but more detail than the quick ones in (1.). (10 points each = 30 total).

a. Describe the central limit theorem and why it is so useful for statistical analysis. Be sure to describe the statistical distribution it predicts.
The central limit theorem states that statistical samples that are large enough tend toward a Gaussian distribution. More formally, if your data follow some given probability distribution (not necessarily Gaussian) and you draw subsets of m numbers from them and take the averages of these subsets, the averages will tend toward a Gaussian distribution if m is large enough. This is a general result, no matter what the initial probability distribution is. It is therefore very useful because it provides the justification for using Gaussian distributions for many phenomena, without doing elaborate analyses to prove that they are the one and only appropriate choice. Furthermore, Gaussian distributions have unique numerical properties that make them relatively easy to use when combining multiple results.

b. (extra credit for 418) Describe the use of the chi-squared statistic in fitting a model to a particular data set. Be sure to incorporate the impact of the number of data values, and model parameters on the result.
Chi-square is defined as the sum of the square of (observed value - expected value) divided by the expected value:

chi^2 = sum_i (o_i - e_i)^2 / e_i

where o_i is the observed and e_i the expected value. Thus, if one has a model, it can be used to generate the expected value for each observation. One expects chi-square to grow in proportion to the number of observations, n. A correction to this expectation is that any parameter that is optimized does not contribute to chi-square - for example, if computing an average, setting the average from the o_i reduces the number of "degrees of freedom" in the fit by one.
This equation is frequently modified to use chi-square as a measure of the goodness of fit by substituting the estimated measurement errors squared for the e_i's in the denominator. Then chi-square divided by the number of degrees of freedom - the number of points minus the number of fitted parameters - is expected to be close to one. There is an expected distribution of chi-square for a given number of points and number of free parameters; knowing these values, one can use the distribution to determine the probability that the model fits the data within the errors.
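A minimal numerical sketch of this reduced chi-square, fitting a constant to repeated measurements (the data values and error bars are hypothetical):

```python
def chi_square(observed, expected, errors):
    # chi-square with the measurement errors squared in the denominator
    return sum((o - e)**2 / s**2 for o, e, s in zip(observed, expected, errors))

# hypothetical: five measurements of a constant quantity, each with error 0.3
observed = [9.8, 10.3, 10.1, 9.7, 10.2]
errors   = [0.3] * 5

mean = sum(observed) / len(observed)   # the one fitted parameter
expected = [mean] * len(observed)

chi2 = chi_square(observed, expected, errors)
dof = len(observed) - 1                # fitting the mean removed one degree of freedom
reduced = chi2 / dof
```

For these made-up numbers the reduced chi-square comes out near 0.74, i.e. close to one, as expected when the scatter is consistent with the stated errors.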
c. What is the modulation transfer function (MTF)? Describe what it does that is more useful than simpler methods of measuring the resolution of an optical system. What computational technique is particularly useful in computing and manipulating MTFs?
The MTF is a general description of the response of an imaging system (optics and/or detector) as a function of the spatial frequency of the input image. It provides a more general description of the imaging characteristics of a system than can be provided by simpler metrics such as line pairs per millimeter, full width at half maximum of the image of a point source, or minimum separation of equal brightness sources needed to resolve them. This is because the MTF of an optical system is the product of the MTFs of its components. The MTF can be readily computed by taking the Fourier transform of the image provided by the optical element. The image from a series of elements is the convolution of the images from each one, so the power of the MTF lies in the convolution theorem for Fourier transforms, which states that the transform of the convolution of two functions is the product of the transforms of the two functions.
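The convolution theorem that makes MTFs multiplicative can be verified directly; the sketch below uses a naive discrete Fourier transform and two made-up 1-D response functions (all values are hypothetical, standing in for the point-spread functions of two optical elements):

```python
import cmath

def dft(x):
    # naive discrete Fourier transform - fine for a short demonstration
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def circular_convolve(a, b):
    # circular convolution of two equal-length sequences
    N = len(a)
    return [sum(a[m] * b[(n - m) % N] for m in range(N)) for n in range(N)]

# two hypothetical 1-D "point spread functions", zero-padded
a = [1.0, 2.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0]
b = [0.5, 1.0, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0]

lhs = dft(circular_convolve(a, b))              # transform of the convolution
rhs = [A * B for A, B in zip(dft(a), dft(b))]   # product of the transforms
# lhs and rhs agree to rounding error - the convolution theorem
```

This is exactly why the system MTF is the product of component MTFs: convolving images in the spatial domain becomes multiplication in the spatial-frequency domain.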
3. (20 points). Imagine you are interested in identifying Active Galactic Nuclei (AGN) in galaxies through spectroscopic observations. You would like to identify a sample of at least five AGN. How many galaxies should you survey in order to have high confidence you will identify at least five AGN, if the expected frequency of AGN is 3% of the total galaxy population? (4 points for the answer) By way of justifying your answer, make sure you identify the following (worth 4 points each, in addition to the answer itself):
i. The choice of distribution
ii. a reasonable confidence limit and explanation for your choice.
iii. connection of the confidence limit to your calculation.
iv. An explanation of why your answer is correct (imagine convincing a proposal reviewer).

To address this situation, the correct framework is the binomial distribution. We might want a probability higher than 95% (P > 95%) that we would obtain 5 or more AGN with our observations (this can vary as long as you are consistent in your use and justify why you chose a particular number). For the actual calculation, you start from the information that the probability of a single successful detection is expected to be 3% (so p = 0.03). Then, the probability of exactly n successes, given N observations, is:

P(n; N, p) = [N! / (n! (N - n)!)] p^n (1 - p)^(N - n)

and the probability of at least 5 successes is 1 minus the sum of this expression over n = 0 through 4. In principle, one can simply plug in different values of N until this probability exceeds 0.95. However, you'll notice that this is computationally difficult (300! is difficult for most calculators, for example). You'll remember that the Poisson distribution is a good approximation to the binomial distribution for large N and low p. In fact, the Gaussian distribution is in turn a good approximation to the Poisson distribution, thanks to the central limit theorem. For the Gaussian distribution, we know that a value more than 2 sigma from the mean has <5% chance of occurrence. For the binomial distribution, mean = Np
and sigma = sqrt(Np(1-p)) ~ sqrt(Np) for small p. So, to reach our confidence limit, we want mean - 2*sigma > 5. This is satisfied when N >~ 400. A survey sample of 400 gives high confidence that we will obtain the required sample size. We might consider requesting as few as 300, using similar arguments, but below that we would risk the data not being useful (if n >= 5 is essential).
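The exact binomial sum that these approximations stand in for is easy to evaluate by computer; this sketch checks the recommended survey sizes from the answer above:

```python
import math

def prob_at_least(n_min, N, p):
    # exact binomial probability of at least n_min successes in N trials:
    # 1 minus the sum of P(n) for n = 0 .. n_min-1
    return 1.0 - sum(math.comb(N, n) * p**n * (1 - p)**(N - n)
                     for n in range(n_min))

p = 0.03          # expected AGN frequency
p400 = prob_at_least(5, 400, p)   # probability of >= 5 AGN in 400 galaxies
p300 = prob_at_least(5, 300, p)   # probability of >= 5 AGN in 300 galaxies
```

The exact calculation gives about 0.99 for N = 400 and just under 0.95 for N = 300, consistent with the mean - 2*sigma argument: 400 comfortably clears the 95% bar, 300 is marginal.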
4. (22 points) Consider the simple integrating amplifier to the right; assume R_D = 10^19 ohms, C_S = 30 X 10^-15 farads, and the amplifier and detector are held at a temperature of 120 K (as might be appropriate for a CMOS detector). Suppose the amplifier is read out as in (a) - after signal has been integrated for 1000 seconds, the signal is sampled at t_1, when the maximum signal has been integrated onto the gate of the FET; then the reset switch is closed to get rid of the integrated charge, the switch is opened, and another sample is taken at t_2. The signal is determined as the difference of the sample at t_1 and that at t_2. What noise would you expect (in electrons)? How would your answer change if instead the scheme in (b) is used, in which the signal is the difference between the sample at t_2 and that at t_1, and the reset switch is closed only after t_2?
----------------------------------------------------
In the first case, the time constant of the RC circuit controlling the charge on the gate of the FET is set by the resistance of the FET when conducting and the capacitance C_S, so it is very short and one will get the full kTC noise on the signal:

V_rms = sqrt(kT/C_S) = sqrt(1.38 X 10^-23 X 120 / 30 X 10^-15) = 0.000235 V

Since Q = VC, the charge is 0.000235 X 30 X 10^-15 = 7 X 10^-18 C, or 44 electrons rms.
In the second case, the relevant time constant is R_D C_S = 3 X 10^5 seconds. The response will therefore be 1 - e^(-1000/(3 X 10^5)), or 0.0033, of the full response - 0.0033 of 44 electrons, or about 0.15 electrons rms. This is clearly a very small value and is unlikely to be realized in a real circuit, where other noise sources will be larger - so the kTC noise is reduced to a negligible value by the second sampling pattern.
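The numbers in (4) can be checked with a short script, using the constants from the equation sheet and the 1 - e^(-t/RC) response fraction derived above:

```python
import math

k = 1.38e-23      # Boltzmann constant, J/K
q = 1.60e-19      # electron charge, C
T = 120.0         # detector temperature, K
C_S = 30e-15      # gate capacitance, F
R_D = 1e19        # detector resistance, ohms
t_int = 1000.0    # integration time, s

# case (a): reset between the two samples -> full kTC noise appears
v_rms = math.sqrt(k * T / C_S)          # ~2.35e-4 V
n_electrons = v_rms * C_S / q           # ~44 electrons rms

# case (b): no reset between samples -> noise suppressed because t_int << R_D*C_S
tau = R_D * C_S                         # 3e5 s
fraction = 1.0 - math.exp(-t_int / tau) # ~0.0033 of the full response
n_reduced = fraction * n_electrons      # ~0.15 electrons rms
```

Running this reproduces the 44 electrons rms for scheme (a) and roughly 0.15 electrons rms for scheme (b).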
Possibly Useful Equations and constants:
k = 1.38 X 10^-23 J K^-1
q = 1.60 X 10^-19 C
h = 6.626 X 10^-34 J s
c = 3.00 X 10^8 m s^-1