Ph.D. Preliminary Examination
Statistics
June, 04

NOTES:
1. The exam is worth 00 points.
2. Partial credit may be given for partial answers if possible.
3. There are 5 pages in this exam paper.

I have neither given nor received aid on this examination.

Name (print):
Student ID:
Signature, Date:
===============================================================
1. ( points) Suppose $X_1, X_2, X_3, X_4, X_5$, and $X_6$ are samples from a standard normal distribution $N(0, 1)$.

(a) Determine a value of $c$ so that the following random variable has a t distribution:

$$c \, \frac{X_1 + X_2}{\sqrt{X_3^2 + X_4^2 + X_5^2}}$$

(b) Let

$$Y = (X_1 + X_2 + X_3)^2 + (X_4 + X_5 + X_6)^2.$$

Determine a value of $d$ so that the random variable $dY$ will have a $\chi^2$ distribution.

2. (5 points) Consider Bernoulli trials whose success probability for each trial is $p$. Let $X_i$ be the result (1 = success; 0 = failure) of the $i$-th trial. Then the $X_i$ are Bernoulli random variables. Note that the number of successes in $n$ trials can be found by adding the values of $X_1, X_2, \ldots, X_n$. Regarding it as a sum of $n$ Bernoulli random variables suggests that it can be approximated by a Gaussian random variable with the same expected value and variance. Use this approximation to solve the following problem:

A student has passed a final exam by supplying correct answers for 0 out of 50 multiple-choice questions. For each question, there was a choice of three possible answers, of which only one was correct. The student claims not to have learned anything in the course and not to have studied for the exam, and says that his correct answers are the product of guesswork. Please determine whether you should believe him.

3. (5 points) (0 points) The Michigan Department of Transportation wishes to survey state residents to determine what proportion of the population would like to increase the highway speed limit to 75 mph from 65 mph. How many residents do they need to survey if they want to be at least 99% confident that the sample proportion is within 0.05 of the true proportion?
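The Gaussian approximation described in the Bernoulli-trials problem, and the sample-size calculation in the survey problem, can be sketched roughly as follows. This is an illustrative sketch only, not an official solution: the function names are hypothetical, the continuity correction is an optional refinement that the exam does not mention, and the score of 25 correct answers is a made-up value chosen only to exercise the code. The margin 0.05 and the 99% confidence level are taken as they appear in the survey problem.

```python
import math
from statistics import NormalDist


def normal_approx_upper_tail(k, n, p, continuity=True):
    """Approximate P(S >= k) for S ~ Binomial(n, p) using a Gaussian
    with the same mean (n*p) and variance (n*p*(1-p))."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    # Continuity correction (assumption, not required by the exam):
    # treat the integer event {S >= k} as {S >= k - 0.5}.
    cutoff = k - 0.5 if continuity else k
    return 1.0 - NormalDist(mean, sd).cdf(cutoff)


def sample_size_for_proportion(margin, confidence, p=0.5):
    """Smallest n with z * sqrt(p*(1-p)/n) <= margin.
    p = 0.5 maximizes p*(1-p), so it is the conservative choice."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z * z * p * (1 - p) / margin ** 2)


# Hypothetical score: 25 correct out of 50, three choices per question.
tail = normal_approx_upper_tail(k=25, n=50, p=1 / 3)
print(f"P(at least 25 correct by pure guessing) is about {tail:.4f}")

# Survey problem with the margin and confidence level as printed.
n_needed = sample_size_for_proportion(margin=0.05, confidence=0.99)
print(f"Residents to survey: {n_needed}")
```

A small tail probability would indicate that the observed score is hard to explain by guessing alone. Using p = 0.5 in the sample-size function gives an upper bound on n that holds whatever the true proportion is.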
4. (0 points) The monthly production of an auto company is uniformly distributed on the interval 0 to $a$. The company wants to estimate the unknown parameter $a$ using the observed production of $n$ months.

Definition of a tighter estimator: an estimator $\hat{\theta}_1$ is considered a tighter estimator for $\theta$ than the estimator $\hat{\theta}_2$ if, for any value of $\theta$ and any sequence of samples $X_1, X_2, \ldots, X_n$, $|\hat{\theta}_1 - \theta| \le |\hat{\theta}_2 - \theta|$.

(a) Please find the moment estimator $\hat{a}_1$ for the parameter $a$. Is $\hat{a}_1$ biased?
(b) Please find the maximum likelihood estimator $\hat{a}_2$ for the parameter $a$. Is $\hat{a}_2$ biased?
(c) Let aˆ 5. Is â tighter than â? Is â tighter than â?
(d) Please comment on $\hat{a}_1$, $\hat{a}_2$, and $\hat{a}_3$. Which one is preferred? Why?

5. (5 points) In 1976, the Department of Labor conducted a randomized controlled experiment to study the effect of an income supplement on prisoners released from certain prisons in Texas and Georgia. This problem is modelled on that experiment, but with simplified data. The control group received no supplement, and the intervention group received an income equivalent to unemployment insurance. The rates of return to prison for subsequent crimes were similar in the two groups, and the investigators suspected that the income supplement might have reduced the amount of time prisoners worked after release.

In the first year after release, 6 participants assigned to the treatment group averaged 5 hours of paid work per week, with standard deviation 6.0. The participants assigned to the control group worked on average 6.8 hours per week, with standard deviation .0. Did income support reduce the amount that ex-prisoners worked? Please conduct a hypothesis test to answer the question.

6. ( points) An experimenter wants to study the relationship between product life ($y$) and standardized material strength ($x_1$). He first considers a linear regression model as follows:

$$y = \beta_0 + x_1 \beta_1 + \varepsilon$$

The scatterplot below has a dashed vertical line at x, a solid line for the least squares fit, and three points labeled A, B, and C.
Question
a) Which point has the largest residual, A, B, or C? Explain.
b) Which point has the largest Cook's D, A, B, or C? Explain.
c) Which point has the largest leverage, A, B, or C? Explain.

Then the experimenter decides to add another independent variable $x_2$ (manufacturing processing time) and fits the following second model:

$$y = \beta_0 + x_1 \beta_1 + x_2 \beta_2 + \varepsilon$$

The experimenter obtained the following results:

Predictor   Coef   SE Coef   T     P
Constant    0.46   .5        ?     ?
X1          .5     -         7.5   -
X2          4.7    .6        -     -

R-Sq = ?    R-Sq_adj = ?

Analysis of Variance table:

Source           DF   SS    MS   F   P
Regression       ?    78    ?    ?   ?
Residual Error   ?    ?     ?
Total            6    605

a) Please complete the missing values marked as ? (ignore the missing values marked as -).
b) What conclusion can you draw about the significance of the regression?
c) What conclusion can you draw about the relationship among product life, standardized material strength, and manufacturing processing time?
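The relationships among the entries of a regression output of this kind can be sketched as below. This is a hedged illustration rather than the intended answer: the helper names are hypothetical, and the inputs (regression SS of 80, total SS of 100, 23 observations, 2 predictors, coefficient 3.0 with standard error 1.5) are invented values, not the figures from the exam table. The p-values would additionally require the F and t distributions, which are left out here.

```python
def t_statistic(coef, se_coef):
    """T column of the coefficient table: estimate divided by its standard error."""
    return coef / se_coef


def complete_anova(ss_regression, ss_total, n_obs, n_predictors):
    """Fill in the ANOVA table, R-Sq and adjusted R-Sq for a linear model
    with an intercept and n_predictors independent variables."""
    df_reg = n_predictors
    df_total = n_obs - 1
    df_err = df_total - df_reg
    ss_err = ss_total - ss_regression
    ms_reg = ss_regression / df_reg
    ms_err = ss_err / df_err
    return {
        "df": (df_reg, df_err, df_total),
        "ss": (ss_regression, ss_err, ss_total),
        "ms": (ms_reg, ms_err),
        "F": ms_reg / ms_err,
        "r_sq": ss_regression / ss_total,
        # Adjusted R-Sq compares mean squares instead of sums of squares.
        "r_sq_adj": 1 - ms_err / (ss_total / df_total),
    }


# Invented inputs, for illustration only.
table = complete_anova(ss_regression=80.0, ss_total=100.0, n_obs=23, n_predictors=2)
t_value = t_statistic(coef=3.0, se_coef=1.5)

print("DF (regression, error, total):", table["df"])
print("F statistic:", table["F"])
print("R-Sq:", table["r_sq"], "R-Sq_adj:", round(table["r_sq_adj"], 4))
print("T:", t_value)
```

The same identities work in reverse: a reported T and coefficient determine the missing standard error, and the total degrees of freedom together with the number of predictors determine the remaining DF entries.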
Question
The experimenter collected more data and fit the second model again. Four residual plots are given below. Identify potential problems (there can be more than one) from these plots and suggest a solution to each problem.