
Midterm Exam: Midterm Review. AMS-UCSC, May 6th, 2015. Spring 2015, Session 1 (Midterm Review), AMS-5.

Topics
We will talk about:
1. Review

The histogram: Drawing a Histogram
Once the distribution table is available, the next step is to draw a horizontal axis specifying the class intervals. Then we draw the blocks, remembering that in a histogram the areas of the blocks represent percentages. When class intervals do not have the same length, it is a mistake to set the heights of the blocks equal to the percentages in the table. To figure out the height of a block, divide the percentage by the length of the interval.
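The height rule above can be sketched in a few lines of Python. The distribution table here is invented purely for illustration and does not come from the review; the function name `block_heights` is likewise hypothetical.

```python
# Hypothetical distribution table: (left endpoint, right endpoint, percentage).
# These numbers are made up for illustration only.
table = [(0, 10, 20.0), (10, 20, 30.0), (20, 50, 45.0), (50, 100, 5.0)]

def block_heights(rows):
    """Height of each block: percentage divided by the length of the interval."""
    return [percent / (right - left) for left, right, percent in rows]

heights = block_heights(table)

# Multiplying each height by its interval length recovers the percentage,
# which is what "areas represent percentages" means.
areas = [h * (right - left) for h, (left, right, _) in zip(heights, table)]

for (left, right, percent), h in zip(table, heights):
    print(f"{left}-{right}: {percent}% -> height {h:.2f}% per unit")
```

Note that the widest interval (50 to 100) gets the shortest block even though its percentage is not zero, because its area is spread over a long stretch of the horizontal axis.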

The histogram: The meaning of the vertical scale
Remember that the areas of the blocks are proportional to the percentages. A tall block means that a large chunk of area accumulates over a small portion of the horizontal scale. This implies that the density of the data is high in the intervals where the height is large. In other words, the data are more crowded in those intervals.

Average and Standard Deviation
Average: the average of a list of numbers equals their sum, divided by how many there are.
Standard Deviation (SD): the SD of a list of numbers measures how far away they are from their average. Thus a large SD implies that many observations are far from the overall average.

Average and Standard Deviation: The Standard Deviation
We can quantify the statement above as follows:
Roughly 68% of the observations are within one SD of the average.
Roughly 95% of the observations are within two SDs of the average.
Roughly 99.7% of the observations are within three SDs of the average.
These statements are more accurate when the distribution is symmetric.
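As a rough illustration of the definitions above, the following Python sketch computes the average and SD of a short list and counts how many entries fall within one, two, and three SDs. The list is invented for illustration, and the use of `pstdev` (which divides by the number of entries) is an assumption about the convention used in the course.

```python
from statistics import fmean, pstdev

# Hypothetical list of numbers, invented for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

average = fmean(data)
sd = pstdev(data)  # assumption: SD computed by dividing by the number of entries

def percent_within(values, center, spread, k):
    """Percentage of values that lie within k spreads of the center."""
    inside = [v for v in values if abs(v - center) <= k * spread]
    return 100 * len(inside) / len(values)

print(f"average = {average}, SD = {sd}")
for k in (1, 2, 3):
    print(f"within {k} SD: {percent_within(data, average, sd, k):.0f}%")
```

With so few numbers the percentages will not match 68%, 95%, and 99.7% closely; the rule of thumb is meant for larger lists whose histogram is roughly symmetric and bell-shaped.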

The Normal Density
The Gaussian or normal curve corresponds to the following formula:
$y = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$, where $e = 2.71828\ldots$
[Figure: graph of the normal curve]
The area below the curve is equal to one. We observe that the curve is symmetric around zero and that most of the area is concentrated between -4 and 4. The probability of an interval is the corresponding area under the curve.
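The claims about the curve can be checked numerically. The sketch below evaluates the formula directly and approximates areas with a simple midpoint rule; the function names and the number of steps are arbitrary choices made for this example.

```python
import math

def normal_density(x):
    """Height of the standard normal curve at x."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def area_under_curve(a, b, steps=10_000):
    """Approximate the area under the curve between a and b (midpoint rule)."""
    width = (b - a) / steps
    total = sum(normal_density(a + (i + 0.5) * width) for i in range(steps))
    return total * width

print(f"height at 0: {normal_density(0):.4f}")
print(f"area between -4 and 4: {area_under_curve(-4, 4):.5f}")
print(f"area between -1 and 1: {area_under_curve(-1, 1):.4f}")
```

The area between -4 and 4 comes out just short of one, which is the sense in which "most of the area" lies in that range.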

The Normal Density
Doing calculations with the normal curve requires the use of a table. Tables are available for the standard normal curve, and they require that observations be transformed to standard units.
Standard Units: given a list of numbers, we convert to standard units by subtracting the average and dividing by the SD.
Useful identities for positive z and x (with z < x in the last one):
$P((0, z)) = \tfrac{1}{2} P((-z, z))$
$P((-z, x)) = P((-z, 0)) + P((0, x))$
$P(> z) = \tfrac{1}{2} \big( P(< -z) + P(> z) \big)$
$P(< -z) + P(> z) = 1 - P((-z, z))$
$P(< z) = P(< 0) + P((0, z))$
$P((z, x)) = \tfrac{1}{2} \big( P((-x, x)) - P((-z, z)) \big)$
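The following Python sketch converts a value to standard units and checks the symmetry identities numerically. It uses `statistics.NormalDist` from the standard library in place of a printed table, which is a substitution made for this example; the values z = 1 and x = 2 are arbitrary.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal curve: average 0, SD 1

def standard_units(value, average, sd):
    """Subtract the average and divide by the SD."""
    return (value - average) / sd

def between(a, b):
    """Area under the standard normal curve between a and b."""
    return Z.cdf(b) - Z.cdf(a)

def left(a):
    """Area to the left of a."""
    return Z.cdf(a)

def right(a):
    """Area to the right of a."""
    return 1 - Z.cdf(a)

z, x = 1.0, 2.0  # arbitrary positive values with z < x

# Each pair holds the two sides of one identity from the slide.
identities = [
    (between(0, z), between(-z, z) / 2),
    (between(-z, x), between(-z, 0) + between(0, x)),
    (right(z), (left(-z) + right(z)) / 2),
    (left(-z) + right(z), 1 - between(-z, z)),
    (left(z), left(0) + between(0, z)),
    (between(z, x), (between(-x, x) - between(-z, z)) / 2),
]
for lhs, rhs in identities:
    print(f"{lhs:.4f} = {rhs:.4f}")
```

All six identities follow from the symmetry of the curve around zero and from the total area being one, so they let a table of central areas P((-z, z)) answer any question about tails or intervals.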

Correlation: Correlation Coefficient
The correlation coefficient gives a measure of the linear association of two variables. It is usually denoted by r and takes values between -1 and 1.
The correlation is not affected when the two variables are interchanged.
The correlation is not changed if the same number is added to all the values of one of the variables.
The correlation is not changed if all the values of one of the variables are multiplied by the same positive number. It changes sign if the number is negative.
The correlation coefficient is 1 if the variables have perfect positive linear association and -1 if they have perfect negative linear association.

Correlation: Computing the correlation coefficient
The procedure to compute the correlation coefficient is the following:
1. Convert each variable to standard units.
2. Calculate the average of the products.
The result is the correlation coefficient. The formula is given by
$r = \text{average of } \big[ (x \text{ in standard units}) \times (y \text{ in standard units}) \big]$
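The two-step procedure can be sketched in Python as follows. The five data pairs are illustrative numbers, not data from the review, and the SD is computed by dividing by the number of entries, which is an assumption about the course's convention.

```python
from statistics import fmean, pstdev

def to_standard_units(values):
    """Step 1: subtract the average and divide by the SD."""
    average, sd = fmean(values), pstdev(values)
    return [(v - average) / sd for v in values]

def correlation(xs, ys):
    """Step 2: average of the products of the standard units."""
    zx, zy = to_standard_units(xs), to_standard_units(ys)
    return fmean([a * b for a, b in zip(zx, zy)])

# Illustrative data, invented for this example.
xs = [1, 3, 4, 5, 7]
ys = [5, 9, 7, 1, 13]

r = correlation(xs, ys)
print(f"r = {r:.2f}")

# Properties of r stated in the review:
print(correlation(ys, xs))                    # interchanging the variables
print(correlation([x + 10 for x in xs], ys))  # adding a constant
print(correlation([3 * x for x in xs], ys))   # multiplying by a positive number
print(correlation([-3 * x for x in xs], ys))  # multiplying by a negative number
```

The first three checks print the same value as r, and the last one prints the same value with the opposite sign.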

Regression: The Regression Line
The regression line for y on x estimates the average value of y corresponding to each value of x. Associated with an increase of one SD in x there is an increase of r SDs in y, on average.
error = actual value of y - predicted value of y
RMS error = $\sqrt{1 - r^2} \times \text{SD of } y$

Regression: Estimate Percentile Ranks
We can use the regression method and the normal curve to produce estimates of percentile ranks.
Percentile Rank: a percentile is a score; for example, the 95th percentile is a score of 700. A percentile rank is a percent; if you score 700, you have a percentile rank of 95%.
Given a percentile rank for the x variable, find the corresponding z score in the normal table. This score gives the number of SDs above the average of the x variable. Using the regression method, find the number of SDs above the average of the y variable. Finally, use the normal table to convert this value back to a percentile rank for y.

Regression
The average of the residuals is 0, and the regression line for the residuals is horizontal.
The formula for the slope of a regression line is
slope = $r \times \frac{\text{SD of } y}{\text{SD of } x}$

Regression
The intercept of the regression line is the predicted value of y for x = 0. The intercept formula is given by
intercept = average of y - slope $\times$ average of x
Among all possible lines through a cloud, the regression line is the one that has the smallest RMS error in predicting y from x.
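The slope, intercept, and RMS error formulas can be collected into a short Python sketch that works from the five summary statistics. The summary numbers below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

def regression_line(avg_x, sd_x, avg_y, sd_y, r):
    """Return (slope, intercept) of the regression line for y on x."""
    slope = r * sd_y / sd_x
    intercept = avg_y - slope * avg_x
    return slope, intercept

def rms_error(sd_y, r):
    """RMS error of the regression line: sqrt(1 - r^2) times the SD of y."""
    return math.sqrt(1 - r ** 2) * sd_y

# Hypothetical summary statistics, invented for illustration.
avg_x, sd_x = 70, 10
avg_y, sd_y = 75, 8
r = 0.6

slope, intercept = regression_line(avg_x, sd_x, avg_y, sd_y, r)
error = rms_error(sd_y, r)

# Predicting y at x = 80 from the line ...
prediction_from_line = intercept + slope * 80
# ... agrees with the "one SD in x goes with r SDs in y" description.
prediction_from_sds = avg_y + r * ((80 - avg_x) / sd_x) * sd_y

print(f"slope = {slope:.2f}, intercept = {intercept:.1f}, RMS error = {error:.1f}")
print(f"prediction at x = 80: {prediction_from_line:.1f}")
```

Both routes to the prediction give the same number, since the line passes through the point of averages with slope r times SD of y over SD of x.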

Problem 1
Among freshmen at a certain university, scores on the Math SAT followed the normal curve, with an average of 550 and an SD of 100.
a) Find the percentile corresponding to a score of 400 on the Math SAT.
b) Find the score corresponding to the 75th percentile of the distribution.

Problem 1 (Cont.)
a) First calculate the standard units for this score: (400 - 550)/100 = -1.5, so 400 is 1.5 SDs below average. The area to the left of -1.5 under the normal curve is about 7%, so this student is in the 7th percentile of the score distribution.
b) The 75th percentile of the normal curve is at about 0.7 standard units, so the student needs to score about 0.7 SDs above the average. This is about 550 + 0.7 × 100 = 620 on the Math SAT exam.
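The two answers can be cross-checked in Python. This sketch uses `statistics.NormalDist` instead of a printed normal table, so it works with unrounded z values; that substitution is an assumption of the example, not part of the original solution.

```python
from statistics import NormalDist

# Math SAT scores: normal curve with average 550 and SD 100 (from the problem).
sat = NormalDist(mu=550, sigma=100)

# a) Percentile corresponding to a score of 400.
z_400 = (400 - 550) / 100
percentile_400 = 100 * sat.cdf(400)

# b) Score corresponding to the 75th percentile.
z_75 = NormalDist().inv_cdf(0.75)
score_75 = sat.inv_cdf(0.75)

print(f"z for 400: {z_400}, percentile: {percentile_400:.1f}%")
print(f"z for 75th percentile: {z_75:.2f}, score: {score_75:.0f}")
```

The unrounded z value for the 75th percentile is slightly below 0.7, so the computed score lands a few points under the 620 obtained with the rounded value; both are reasonable answers at table precision.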

Problem 2
A statistical analysis is made of the midterm and final scores in a large class. The results are:
average midterm score = 60, SD = 15
average final score = 65, SD = 20
r = 0.50
1. Using the normal approximation, about what percentage of the students scored over 80 on the midterm?
80 points on the midterm corresponds to (80 - 60)/15 ≈ 1.33 standard units. Using the normal curve, we obtain that approximately 9% of the students scored over 80 on the midterm.
2. What is the RMS error?
$\sqrt{1 - 0.5^2} \times 20 \approx 17.32$

Problem 2 (Cont.)
3. What is the slope of the regression line?
0.5 × 20/15 ≈ 0.67
4. What is the predicted final score for a student who scored 80 on the midterm?
80 points on the midterm is 1.33 SDs above average. This corresponds to 1.33 × 0.5 ≈ 0.67 SDs above average on the final, which is 0.67 × 20 = 13.4 points over the average on the final. So the students who scored 80 on the midterm scored, on average, 65 + 13.4 = 78.4 on the final.

Problem 2 (Cont.)
5. Of the students who scored 80 on the midterm, about what percentage scored over 80 on the final?
For these students the predicted final score is 78.4 and the RMS error is 17.32. In standard units we have (80 - 78.4)/17.32 ≈ 0.09, and there is an area of about 46% to the right of this value under the normal curve.
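All parts of Problem 2 can be reproduced in a short Python sketch. It uses `statistics.NormalDist` in place of the normal table and keeps unrounded intermediate values, so the results may differ from the slide answers in the last digit.

```python
import math
from statistics import NormalDist

# Summary statistics from the problem.
avg_mid, sd_mid = 60, 15
avg_fin, sd_fin = 65, 20
r = 0.5
Z = NormalDist()  # standard normal curve, used instead of a printed table

# Percentage of students scoring over 80 on the midterm.
z_mid = (80 - avg_mid) / sd_mid
pct_over_80_mid = 100 * (1 - Z.cdf(z_mid))

# RMS error of the regression line for final on midterm.
rms_error = math.sqrt(1 - r ** 2) * sd_fin

# Slope of the regression line.
slope = r * sd_fin / sd_mid

# Predicted final score for a midterm score of 80.
predicted_final = avg_fin + r * z_mid * sd_fin

# Among students with midterm 80, percentage scoring over 80 on the final:
# new average = predicted final, new SD = RMS error.
z_fin = (80 - predicted_final) / rms_error
pct_over_80_fin = 100 * (1 - Z.cdf(z_fin))

print(f"over 80 on midterm: {pct_over_80_mid:.1f}%")
print(f"RMS error: {rms_error:.2f}, slope: {slope:.2f}")
print(f"predicted final: {predicted_final:.1f}")
print(f"over 80 on final, given midterm 80: {pct_over_80_fin:.1f}%")
```

The last step treats the final scores of students with a given midterm score as roughly normal, centered at the regression prediction with spread equal to the RMS error, which is the same approximation the slide uses.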

Problem 3 (Problem 1b, Chapter 10, Section C)
Average of midterm exam = 60, SD of midterm exam = 15
Average of final exam = 60, SD of final exam = 15
r = 0.5
Predict the final exam score for a student whose midterm score is 30.

Problem 3 (Problem 1b, Chapter 10, Section C), continued
1. Get standard units for x = 30 (x is the midterm score): z = (30 - 60)/15 = -2.
2. Get standard units in y using the regression method (y is the final score): -2 × r = -2 × 0.5 = -1.
3. Convert the standard units in y back to points: -1 × 15 = -15. The student's predicted score on the final is 15 points below the average.
4. Final score: 60 - 15 = 45.
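The four steps above map directly onto a small Python function. The function name `predict_y` is a hypothetical helper introduced for this example.

```python
def predict_y(x, avg_x, sd_x, avg_y, sd_y, r):
    """Regression method: predict y from x using the five summary statistics."""
    z_x = (x - avg_x) / sd_x   # step 1: standard units for x
    z_y = r * z_x              # step 2: standard units for y
    points = z_y * sd_y        # step 3: convert back to points
    return avg_y + points      # step 4: add the average of y

predicted = predict_y(30, avg_x=60, sd_x=15, avg_y=60, sd_y=15, r=0.5)
print(predicted)  # 45.0
```

Because r is 0.5, a midterm score 2 SDs below average leads to a predicted final score only 1 SD below average.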

Problem 4 (Problem 2b, Chapter 10, Section C)
The correlation between SAT scores and first-year GPA is r = 0.60. A student has a percentile rank of 30% on the SAT. Predict the corresponding percentile rank for first-year GPA.

Problem 4 (Problem 2b, Chapter 10, Section C), continued
1. You need the z score with an area of 30% to its left. By symmetry, this is the negative of the z value whose central area in the normal table is 40%: z = -0.53.
2. Use the regression method to predict standard units: -0.53 × 0.60 = -0.318.
3. The area to the left of this value is the predicted percentile rank. The central area between -0.318 and 0.318 is about 25%, so the area to the left is (1 - 0.25)/2 = 0.75/2 = 0.375, or about 38%.
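The percentile-rank method can be sketched in Python as below. `statistics.NormalDist` stands in for the normal table, and `predicted_percentile_rank` is a hypothetical helper name; the unrounded z value differs slightly from the table value of -0.53.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal curve, used instead of a printed table

def predicted_percentile_rank(rank_x, r):
    """Predict the percentile rank of y from the percentile rank of x."""
    z_x = Z.inv_cdf(rank_x / 100)  # step 1: z score with rank_x% to its left
    z_y = r * z_x                  # step 2: regression method in standard units
    return 100 * Z.cdf(z_y)        # step 3: area to the left of z_y

rank_gpa = predicted_percentile_rank(30, 0.60)
print(f"predicted percentile rank: {rank_gpa:.0f}%")
```

The predicted rank is closer to 50% than the starting rank of 30%, which is the regression effect: with r below 1, predictions move toward the average.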

Good luck on your midterm exam!