ORF 245 Fundamentals of Engineering Statistics. Final Exam


Princeton University
Department of Operations Research and Financial Engineering

ORF 245 Fundamentals of Engineering Statistics
Final Exam
May 22, 2008, 7:30pm-10:30pm

PLEASE DO NOT TURN THIS PAGE AND START THE EXAM UNTIL YOU ARE TOLD TO DO SO.

Instructions: This exam is open book and open notes. Calculators are allowed, but not computers or the use of statistical software packages. Write all your work in the space provided after each question. There are questions on both sides of each page. Explain as thoroughly and as clearly as possible all your steps in answering each question. Full or partial credit can only be granted if intermediate steps are clearly indicated.

Name:

Pledge: I pledge my honor that I have not violated the honor code during this examination.

Signature:

 1: (12)     6: (15)    11: (12)
 2: (06)     7: (20)    12: (10)
 3: (10)     8: (10)    13: (10)
 4: (05)     9: (20)    14: (12)
 5: (05)    10: (08)    15: (20)

Total: (175)

Descriptive Statistics:

1) Let $\bar{x}_n$ and $s_n^2$ denote the sample mean and variance for the sample $x_1, \ldots, x_n$, and let $\bar{x}_{n+1}$ and $s_{n+1}^2$ denote these quantities when an additional observation $x_{n+1}$ is added to the sample.

a) (4 pts.) Show how $\bar{x}_{n+1}$ can be computed from $\bar{x}_n$ and $x_{n+1}$.

b) (8 pts.) Show that

$$n s_{n+1}^2 = (n-1) s_n^2 + \frac{n}{n+1} (x_{n+1} - \bar{x}_n)^2$$

so that $s_{n+1}^2$ can be computed from $x_{n+1}$, $\bar{x}_n$, and $s_n^2$.
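As a numerical sanity check on problem 1, the sketch below applies a one-observation update for the mean together with the variance identity from part b) and compares the result with a direct recomputation. The function name `update_mean_var` and the data values are made up for illustration; the sketch assumes the sample variance uses the n - 1 divisor and that n is at least 2.

```python
import statistics


def update_mean_var(mean_n, var_n, n, x_new):
    """Return (mean, variance) of n + 1 observations from the statistics of the first n.

    Assumes var_n is the sample variance with the n - 1 divisor and n >= 2.
    """
    mean_next = (n * mean_n + x_new) / (n + 1)
    # Identity from part b): n * s_{n+1}^2 = (n - 1) * s_n^2 + n/(n+1) * (x_{n+1} - mean_n)^2
    var_next = ((n - 1) * var_n + n / (n + 1) * (x_new - mean_n) ** 2) / n
    return mean_next, var_next


# Hypothetical data, used only to exercise the formulas.
data = [2.0, 4.0, 4.0, 5.0, 7.0]
x_new = 9.0

mean_next, var_next = update_mean_var(
    statistics.mean(data), statistics.variance(data), len(data), x_new
)
print(mean_next, var_next)
```

A numerical agreement of this kind does not replace the algebraic proof the question asks for; it only guards against sign or index slips.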

2) Consider the following histogram that shows the time in months that articles submitted to a certain scientific journal in 2002 took to be reviewed for publication.

[Figure: histogram of review times in months; not reproduced here.]

a) (3 pts.) Which class interval contains the median review time?

b) (3 pts.) Which class interval contains the third quartile of the review times?

Probability:

3) Items are inspected for flaws by two quality inspectors. If a flaw is present, it will be detected by the first inspector with probability 0.9, and by the second inspector with probability 0.7. Assume that the inspectors function independently.

a) (4 pts.) If an item has a flaw, what is the probability that it will be found by at least one of the inspectors?

b) (6 pts.) Assume that both inspectors inspect every item and that if an item has no flaw, then neither inspector will detect a flaw. Assume also that the probability that an item has a flaw is 0.10. If an item is passed by both inspectors, what is the probability that it actually has a flaw?
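One way to check an answer to problem 3 is to write the complement rule and Bayes' rule out directly. The sketch below uses only the probabilities stated in the problem; the variable names are illustrative.

```python
p_first = 0.9   # P(first inspector detects | flaw)
p_second = 0.7  # P(second inspector detects | flaw)
p_flaw = 0.10   # P(item has a flaw)

# Part a): complement of "both inspectors miss the flaw" (independent inspectors).
p_missed_by_both = (1 - p_first) * (1 - p_second)
p_found = 1 - p_missed_by_both

# Part b): Bayes' rule. An item without a flaw is always passed by both inspectors.
p_pass = p_flaw * p_missed_by_both + (1 - p_flaw) * 1.0
p_flaw_given_pass = p_flaw * p_missed_by_both / p_pass

print(round(p_found, 4), round(p_flaw_given_pass, 5))
```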

4) (5 pts.) An urn contains 3 red balls and 7 black balls. Players A and B withdraw balls from the urn consecutively until a red ball is selected. Namely, A draws the first ball, then B draws the second one, then A again, and so on, until one of them draws a red ball. If there is no replacement of the drawn balls, find the probability that A selects the red ball.

Random Variables:

5) (5 pts.) Two types of coins are produced at a factory: a fair coin and a biased one that comes up heads 55 percent of the time. We have a coin from this factory but do not know whether it is a fair coin or a biased one. In order to ascertain which type of coin we have, we will perform the following statistical test: we will toss the coin 1000 times. If the coin lands on heads 525 or more times, then we will conclude that it is a biased coin, whereas if it lands on heads fewer than 525 times, then we will conclude that it is the fair coin. If the coin is actually fair, what is the probability that we will reach a false conclusion? [Hint: use the Normal approximation with continuity correction.]
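Problems 4 and 5 can both be checked numerically. The sketch below walks through the draws of problem 4 with exact fractions, and for problem 5 it compares the continuity-corrected Normal approximation suggested in the hint with the exact binomial tail. The function name `prob_a_draws_first_red` is hypothetical, and the exact binomial sum is an extra check that the exam does not ask for.

```python
from fractions import Fraction
from math import comb, sqrt
from statistics import NormalDist


def prob_a_draws_first_red(red=3, black=7):
    """Probability that player A (draws 1, 3, 5, ...) gets the first red ball."""
    p_a = Fraction(0)
    p_no_red_so_far = Fraction(1)
    total = red + black
    for draw in range(black + 1):  # the first red appears by draw black + 1 at the latest
        p_red_now = Fraction(red, total - draw)
        if draw % 2 == 0:  # 0-based even index = A's turn
            p_a += p_no_red_so_far * p_red_now
        p_no_red_so_far *= 1 - p_red_now
    return p_a


print(prob_a_draws_first_red())

# Problem 5: a fair coin is wrongly declared biased when heads >= 525 out of 1000.
n, p_fair, cutoff = 1000, 0.5, 525
mu = n * p_fair
sigma = sqrt(n * p_fair * (1 - p_fair))
p_false_normal = 1 - NormalDist(mu, sigma).cdf(cutoff - 0.5)  # continuity correction
p_false_exact = sum(comb(n, k) for k in range(cutoff, n + 1)) / 2 ** n
print(round(p_false_normal, 4), round(p_false_exact, 4))
```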

6) (15 pts.) A bus travels between two cities A and B, which are 100 miles apart. If the bus has a breakdown, the distance from the breakdown to city A has a uniform distribution over (0, 100). There is a bus service station in city A, in B, and in the center of the route between A and B. It is suggested that it would be more efficient to have the three stations located 25, 50, and 75 miles, respectively, from A. Do you agree? Why? [Hint: compare the expected distance that the bus would have to be towed, from the breakdown point to the nearest service station.]
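The comparison in the hint to problem 6 can be approximated numerically. The sketch below averages the distance to the nearest station over a fine grid of breakdown points, which is a midpoint-rule integral against the uniform density. The function name `expected_tow_distance` and the grid size are choices made for this illustration; an exam answer would integrate by hand.

```python
def expected_tow_distance(stations, route_length=100.0, cells=100_000):
    """Approximate E[distance to nearest station] for a uniform breakdown point."""
    width = route_length / cells
    total = 0.0
    for i in range(cells):
        x = (i + 0.5) * width  # midpoint of the i-th cell
        total += min(abs(x - s) for s in stations)
    return total * width / route_length


current = expected_tow_distance([0, 50, 100])
proposed = expected_tow_distance([25, 50, 75])
print(round(current, 3), round(proposed, 3))
```

The smaller of the two expected distances identifies the more efficient layout under this criterion.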

Joint Probability Distributions:

7) Choose a number X at random from the set of numbers {1, 2, 3, 4, 5}. Now choose a number at random from the subset no larger than X, that is, from {1, ..., X}. Call this second number Y.

a) (10 pts.) Find the joint probability mass function of X and Y.

b) (7 pts.) Find the expected value and the variance of Y.

c) (3 pts.) Are X and Y independent? Explain.
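For problem 7, the joint distribution is small enough to tabulate exactly. The sketch below assumes "at random" means uniformly at random at both stages, builds the joint pmf with exact fractions, and derives the marginals, the moments of Y, and a factorization check for independence. All names are illustrative.

```python
from fractions import Fraction

values = range(1, 6)

# P(X = x, Y = y) = P(X = x) * P(Y = y | X = x) = (1/5) * (1/x) for 1 <= y <= x <= 5.
joint = {
    (x, y): Fraction(1, 5) * Fraction(1, x)
    for x in values
    for y in range(1, x + 1)
}

marginal_x = {x: sum(p for (xx, _), p in joint.items() if xx == x) for x in values}
marginal_y = {y: sum(p for (_, yy), p in joint.items() if yy == y) for y in values}

mean_y = sum(y * p for y, p in marginal_y.items())
var_y = sum(y * y * p for y, p in marginal_y.items()) - mean_y ** 2

# Independence requires the joint pmf to factor for every pair (x, y).
independent = all(
    joint.get((x, y), Fraction(0)) == marginal_x[x] * marginal_y[y]
    for x in values
    for y in values
)
print(mean_y, var_y, independent)
```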

Statistical Estimation:

8) (10 pts.) Maximum likelihood estimates possess the property of functional invariance, which means that if $\hat{\theta}$ is the MLE of $\theta$, and $h(\theta)$ is any function of $\theta$, then $h(\hat{\theta})$ is the MLE of $h(\theta)$. Given a random sample $X_1, \ldots, X_n$ from a geometric distribution with parameter $p$, find the MLE of the odds ratio $p / (1 - p)$.
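A possible outline of the derivation for problem 8 is sketched below. It assumes the geometric distribution counts the number of trials up to and including the first success, so that $P(X = x) = p(1-p)^{x-1}$ for $x = 1, 2, \ldots$; under the other common convention (counting failures before the first success) the algebra changes accordingly.

```latex
% Assumption: P(X = x) = p (1 - p)^{x - 1}, x = 1, 2, \ldots
\begin{align*}
\ell(p) &= n \ln p + \Bigl(\sum_{i=1}^{n} x_i - n\Bigr) \ln(1 - p), \\
\ell'(p) &= \frac{n}{p} - \frac{\sum_{i=1}^{n} x_i - n}{1 - p} = 0
  \quad\Longrightarrow\quad
  \hat{p} = \frac{n}{\sum_{i=1}^{n} x_i} = \frac{1}{\bar{x}}, \\
\frac{\hat{p}}{1 - \hat{p}} &= \frac{1/\bar{x}}{1 - 1/\bar{x}} = \frac{1}{\bar{x} - 1}
  \qquad (\bar{x} > 1).
\end{align*}
```

The last line is where functional invariance is used: the MLE of $p/(1-p)$ is obtained by plugging $\hat{p}$ into the function.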

Confidence Intervals:

9) Let $X$ represent the number of events that are observed to occur in $n$ units of time or space, and assume that $X \sim \text{Poisson}(n\lambda)$, where $\lambda$ is the mean number of events that occur in one unit of time or space. Assume that $X$ is large, so that $X \sim N(n\lambda, n\lambda)$. A suitable estimator of $\lambda$ is given by $\hat{\lambda} = X/n$, with standard error $SE(\hat{\lambda}) = \sqrt{\lambda/n}$.

a) (4 pts.) Assuming that $X$ is large, what is the distribution of $\hat{\lambda}$? (Name the distribution and tell the values of its parameters.)

b) (4 pts.) Use the distribution found in the previous item and the fact that $SE(\hat{\lambda}) \approx \sqrt{\hat{\lambda}/n}$ to derive an expression for the $100(1-\alpha)\%$ confidence interval for $\lambda$.

c) (4 pts.) A 5 ml sample of a certain suspension is found to contain 300 particles. The mean number of particles per ml in the suspension is ____, give or take ____.

d) (4 pts.) After 4 minutes, a geologist counted 256 particles emitted from a certain radioactive rock. Find a 95% confidence interval for the rate of emissions in units of particles per minute.
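Parts c) and d) of problem 9 apply the estimator and standard error stated in the problem. A minimal sketch, assuming the large-sample Normal interval (estimate plus or minus z times the estimated standard error) is what part b) leads to; the function name `poisson_rate_ci` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def poisson_rate_ci(count, units, confidence=0.95):
    """Estimate, standard error, and Normal-approximation CI for a Poisson rate."""
    rate_hat = count / units
    std_err = sqrt(rate_hat / units)  # SE(lambda-hat) is approximately sqrt(lambda-hat / n)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return rate_hat, std_err, (rate_hat - z * std_err, rate_hat + z * std_err)


rate_c, se_c, _ = poisson_rate_ci(300, 5)      # part c): particles per ml
rate_d, se_d, ci_d = poisson_rate_ci(256, 4)   # part d): particles per minute
print(rate_c, round(se_c, 2))
print(rate_d, se_d, tuple(round(v, 2) for v in ci_d))
```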

e) (4 pts.) For how many minutes should particles be counted so that the 95% confidence interval specifies the rate to within ± 1 particle per minute?

10) A sample of seven concrete blocks had their compressive strength measured in MPa. The results were 1367.6, 1411.5, 1318.7, 1193.6, 1406.2, 1425.7, and 1572.4. Ten thousand bootstrap samples were generated from these data, and the bootstrap sample means were arranged in order. Refer to the smallest mean as $Y_1$, the second smallest as $Y_2$, and so on, with the largest being $Y_{10000}$. Assume that $Y_{50} = 1283.4$, $Y_{51} = 1283.4$, $Y_{100} = 1291.5$, $Y_{101} = 1291.5$, $Y_{250} = 1305.5$, $Y_{251} = 1305.5$, $Y_{500} = 1318.5$, $Y_{501} = 1318.5$, $Y_{9500} = 1449.7$, $Y_{9501} = 1449.7$, $Y_{9750} = 1462.1$, $Y_{9751} = 1462.1$, $Y_{9900} = 1476.2$, $Y_{9901} = 1476.2$, $Y_{9950} = 1483.8$, and $Y_{9951} = 1483.8$.

a) (4 pts.) Compute the 95% bootstrap confidence interval for the mean compressive strength.

b) (4 pts.) Was this a parametric or a nonparametric bootstrap procedure? Explain.
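The procedure described in problem 10 can be sketched as follows: resample the seven observations with replacement, sort the resampled means, and read the interval limits from the ordered means. The sketch assumes the percentile convention in which each limit is the average of two adjacent order statistics (for example $Y_{250}$ and $Y_{251}$ for the lower 95% limit). The function name and the seed are arbitrary, so the printed limits will not reproduce the exam's tabulated values exactly.

```python
import random
from statistics import fmean

strengths = [1367.6, 1411.5, 1318.7, 1193.6, 1406.2, 1425.7, 1572.4]  # MPa, from the problem


def bootstrap_percentile_ci(data, n_boot=10_000, confidence=0.95, seed=0):
    """Percentile bootstrap CI for the mean, resampling the data with replacement."""
    rng = random.Random(seed)
    means = sorted(fmean(rng.choices(data, k=len(data))) for _ in range(n_boot))
    tail = (1 - confidence) / 2
    lo_rank = round(n_boot * tail)        # 250 when n_boot = 10,000 and confidence = 0.95
    hi_rank = round(n_boot * (1 - tail))  # 9750
    # Ranks are 1-based in the problem statement, list indices are 0-based.
    lower = (means[lo_rank - 1] + means[lo_rank]) / 2
    upper = (means[hi_rank - 1] + means[hi_rank]) / 2
    return lower, upper


lower, upper = bootstrap_percentile_ci(strengths)
print(round(lower, 1), round(upper, 1))
```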

Tests of Hypothesis:

11) An article by Abdel-Aty et al. in the Journal of Transportation Engineering presents a tabulation of types of car crashes by the age of the driver over a three-year period in Florida. Here is the table:

Age of drivers                  15-24 years    25-64 years
Total # of accidents                 82,486        219,170
# of accidents in driveways           4,243         10,701

a) (4 pts.) The difference between the proportions of driveway accidents for drivers aged 15-24 and drivers aged 25-64 is ____ %, give or take ____ %.

b) (4 pts.) Can you conclude that driveway accidents among 15-24 year-olds in FL are indeed likely to be proportionately higher than driveway accidents among 25-64 year-old Floridians? State the hypotheses clearly and answer this question using the P-value.

c) (4 pts.) Assuming that young drivers in Florida do present a higher proportion of driveway accidents than older drivers, does this mean that younger Floridian drivers should be required to take a special course on how to drive on driveways, but not older drivers? Explain.
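The arithmetic behind parts a) and b) of problem 11 can be sketched with a large-sample two-proportion comparison. The choices below are assumptions of this illustration rather than something the exam prescribes: the unpooled standard error is used for the "give or take" figure, and the pooled standard error is used for a one-sided z test of equal proportions.

```python
from math import sqrt
from statistics import NormalDist

n_young, n_older = 82_486, 219_170       # total accidents per age group
drive_young, drive_older = 4_243, 10_701  # driveway accidents per age group

p_young = drive_young / n_young
p_older = drive_older / n_older
diff = p_young - p_older

# Part a): unpooled standard error of the difference in proportions.
se_diff = sqrt(p_young * (1 - p_young) / n_young + p_older * (1 - p_older) / n_older)

# Part b): H0: p_young = p_older versus H1: p_young > p_older, pooled standard error.
p_pooled = (drive_young + drive_older) / (n_young + n_older)
se_pooled = sqrt(p_pooled * (1 - p_pooled) * (1 / n_young + 1 / n_older))
z = diff / se_pooled
p_value = 1 - NormalDist().cdf(z)

print(f"difference = {100 * diff:.2f}%, give or take {100 * se_diff:.2f}%")
print(f"z = {z:.2f}, one-sided P-value = {p_value:.4f}")
```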

12) An engineer claims that a new type of hard disk for laptops lasts longer than the old type. Independent random samples of 75 of each of the two types are chosen, and the sample means and standard deviations of their lifetimes are computed:

New: $\bar{X}_1 = 4387$ h, $s_1 = 252$ h
Old: $\bar{X}_2 = 4260$ h, $s_2 = 231$ h

a) (4 pts.) Can you conclude that the mean lifetime of new hard disks is greater than that of the old hard disks? State the hypotheses clearly and answer this question at the 1% significance level.

b) (4 pts.) If the new hard disks indeed have a mean lifetime 40 h longer than the old ones, what is the probability ($\beta$) that the test performed in the previous item will result in a type II error (that is, failing to reject $H_0$)?

c) (2 pts.) Recompute the probability of a type II error for the case of the new hard disks having a mean lifetime 80 h longer than the old ones.
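A sketch of the calculations for problem 12, assuming a large-sample one-sided z test in which the sample standard deviations stand in for the population values. The helper name `type_ii_error` is made up for this illustration.

```python
from math import sqrt
from statistics import NormalDist

n = 75
mean_new, sd_new = 4387.0, 252.0
mean_old, sd_old = 4260.0, 231.0
alpha = 0.01

# Part a): H0: mu_new - mu_old <= 0 versus H1: mu_new - mu_old > 0.
se = sqrt(sd_new ** 2 / n + sd_old ** 2 / n)
z = (mean_new - mean_old) / se
z_crit = NormalDist().inv_cdf(1 - alpha)
reject_h0 = z > z_crit


def type_ii_error(true_diff):
    """P(fail to reject H0) when the true mean difference is true_diff hours."""
    return NormalDist().cdf(z_crit - true_diff / se)


beta_40 = type_ii_error(40)  # part b)
beta_80 = type_ii_error(80)  # part c)
print(round(z, 2), round(z_crit, 3), reject_h0)
print(round(beta_40, 3), round(beta_80, 3))
```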

Correlation and Linear Regression:

13) A chemical engineer is studying the effect of temperature and stirring rate on the yield of a certain product. The process is run 16 times, at the settings indicated in the following table. The units for yield are percent of a theoretical maximum.

[Table of settings and yields not reproduced here.]

The matrix of sample correlation coefficients among the variables in question is as follows:

[Correlation matrix not reproduced here.]

a) (5 pts.) Based on the analysis of sample correlation above, would you try to fit a multiple linear regression model in which the yield is the response variable and temperature and stirring rate are the covariates? Explain.

b) (5 pts.) Find the 95% confidence interval for the coefficient of correlation between the stirring rate and the yield. What assumptions did you make in order to compute this confidence interval?

14) The chemical engineer from the previous question has decided to calibrate a simple linear regression model with the yield as the response variable (Y) and stirring rate as the covariate (X). The results of the calibration obtained through Excel are:

[Excel regression output not reproduced here.]

a) (2 pts.) What proportion of the observed variation in yield can be attributed to the simple linear regression relationship between yield and stirring rate?

b) (5 pts.) Can you say that an increase of 10 rpm in the stirring rate will produce an increase in yield of at least 2%? State the hypotheses clearly and answer this question at the 5% significance level.

c) (5 pts.) Construct the 95% confidence interval for the prediction of the yield percentage that corresponds to a stirring rate of 55 rpm. In order to compute this interval, you may need the following additional information:

[Additional information not reproduced here.]

Multiple Linear Regression:

15) A study was made in which data were obtained to relate $y$ = specific surface area (cm$^3$/g) to $x_1$ = % NaOH used as a pretreatment chemical and $x_2$ = treatment time (min) for a batch of pulp. The following R output resulted from a request to fit the model $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$.

[R output not reproduced here.]

a) (6 pts.) Fill in the blanks in the tables above by computing the following values: the coefficients of determination (regular and adjusted), the regression sum of squares, the mean squares for regression and residuals, and the value of the F statistic. Show your computations.

b) (2 pts.) What proportion of observed variation in specific surface area can be explained by the model relationship?

c) (4 pts.) Does the chosen model appear to specify a useful relationship between the response and the covariates? Explain.

d) (4 pts.) Provided that % NaOH remains in the model, would you suggest that the covariate treatment time be eliminated? Explain.

e) (4 pts.) Calculate a 95% confidence interval for the expected change in specific surface area associated with an increase of 1% in NaOH when treatment time is held fixed.