bx, which takes in a value of the explanatory variable and spits out the log of the predicted response.

Similar documents
3: Linear Systems. Examples. [1.] Solve. The first equation is in blue; the second is in red. Here's the graph: The solution is ( 0.8,3.4 ).

UNIT 12 ~ More About Regression

Transforming to Achieve Linearity

18.02SC Multivariable Calculus, Fall 2010 Transcript Recitation 34, Integration in Polar Coordinates

Chapter 12 Summarizing Bivariate Data Linear Regression and Correlation

Lesson 6: Algebra. Chapter 2, Video 1: "Variables"

Transforming with Powers and Roots

MITOCW Investigation 4, Part 3

Nonlinear Regression Section 3 Quadratic Modeling

Dialog on Simple Derivatives

MITOCW big_picture_derivatives_512kb-mp4

MITOCW MIT18_02SCF10Rec_61_300k

AP Statistics. Chapter 9 Re-Expressing data: Get it Straight

HOLLOMAN S AP STATISTICS BVD CHAPTER 08, PAGE 1 OF 11. Figure 1 - Variation in the Response Variable

MITOCW MITRES18_005S10_DiffEqnsGrowth_300k_512kb-mp4

Activity Overview. Background. Concepts. Kepler s Third Law

MITOCW ocw feb k

Scatterplots and Correlation

MITOCW watch?v=0usje5vtiks

MITOCW Lec 15 MIT 6.042J Mathematics for Computer Science, Fall 2010

Logarithmic and Exponential Equations and Change-of-Base

Section 3.3. How Can We Predict the Outcome of a Variable? Agresti/Franklin Statistics, 1of 18

Chapter 7 Linear Regression

7.4* General logarithmic and exponential functions

MITOCW ocw f99-lec09_300k

Mean-Motion Resonance and Formation of Kirkwood Gaps

Astronomy Cast Episode 1: Pluto's Planetary Identity Crisis

MITOCW 2. Harmonic Oscillators with Damping

MITOCW watch?v=poho4pztw78

Since the logs have the same base, I can set the arguments equal and solve: x 2 30 = x x 2 x 30 = 0

MITOCW MITRES18_006F10_26_0501_300k-mp4

MITOCW watch?v=vjzv6wjttnc

MITOCW ocw-5_60-s08-lec19_300k

MITOCW ocw f07-lec39_300k

MITOCW Investigation 6, Part 6

MITOCW MITRES18_005S10_DerivOfSinXCosX_300k_512kb-mp4

Astronomy Test Review. 3 rd Grade

6: Polynomials and Polynomial Functions

Astronomy Cast Episode 3: Hot Jupiters and Pulsar Planets

MITOCW MITRES18_005S10_DiffEqnsMotion_300k_512kb-mp4

MITOCW ocw f99-lec01_300k

MITOCW watch?v=ko0vmalkgj8

PROFESSOR: WELCOME BACK TO THE LAST LECTURE OF THE SEMESTER. PLANNING TO DO TODAY WAS FINISH THE BOOK. FINISH SECTION 6.5

MITOCW ocw f99-lec16_300k

MITOCW 18. Quiz Review From Optional Problem Set 8

MITOCW ocw f99-lec23_300k

The topic is a special kind of differential equation, which occurs a lot. It's one in which the right-hand side doesn't

MITOCW R11. Double Pendulum System

Regression, part II. I. What does it all mean? A) Notice that so far all we ve done is math.

INFERENCE FOR REGRESSION

Nonlinear Regression Curve Fitting and Regression (Statcrunch) Answers to selected problems

Lecture 48 Sections Mon, Nov 16, 2009

MITOCW watch?v=fxlzy2l1-4w

MITOCW ocw f99-lec30_300k

Chapter 5: Data Transformation

The Inductive Proof Template

IN THE ALMIGHTY GOD NAME Through the Mother of God mediation I do this research

Conditions for Regression Inference:

MITOCW ocw f99-lec05_300k

Regression Analysis: Exploring relationships between variables. Stat 251

Chapter 2: Looking at Data Relationships (Part 3)

The Sun. - this is the visible surface of the Sun. The gases here are very still hot, but much cooler than inside about 6,000 C.

Phys 1120, CAPA #11 solutions.

MITOCW Investigation 3, Part 1

MITOCW MIT18_01SCF10Rec_24_300k

28. SIMPLE LINEAR REGRESSION III

MITOCW watch?v=kkn_dk3e2yw

Correlation and Regression Notes. Categorical / Categorical Relationship (Chi-Squared Independence Test)

USING THE EXCEL CHART WIZARD TO CREATE CURVE FITS (DATA ANALYSIS).

MITOCW watch?v=y6ma-zn4olk

MITOCW 2. Introduction to General Relativity

Modeling Data with Functions

The following formulas related to this topic are provided on the formula sheet:

MITOCW 5. Quantum Mechanics: Free Particle and Particle in 1D Box

ASTROMATH 101: BEGINNING MATHEMATICS IN ASTRONOMY

MITOCW 1. History of Dynamics; Motion in Moving Reference Frames

MITOCW watch?v=pqkyqu11eta

Regression, Part I. - In correlation, it would be irrelevant if we changed the axes on our graph.

12.2 Transforming to Achieve Linearity

MITOCW MIT20_020S09_06_inverter

Note: Please use the actual date you accessed this material in your citation.

I'm not going to tell you what differential equations are, or what modeling is. If you still are uncertain about those

MITOCW watch?v=4x0sggrxdii

MITOCW watch?v=4q0t9c7jotw

Asteroids, Comets, and Meteoroids

Document Guidelines by Daniel Browning

MITOCW ocw f07-lec36_300k

OUR SOLAR SYSTEM. By Kyle Blasi

MITOCW 19. Introduction to Mechanical Vibration

MITOCW watch?v=ws1mx-c2v9w

MITOCW watch?v=vu_of9tcjaa

Business Statistics. Lecture 9: Simple Regression

MITOCW watch?v=hdyabia-dny

MITOCW watch?v=rf5sefhttwo

MITOCW R9. Generalized Forces

DEAN: HEY THERE STARGAZERS, I'M DEAN REGAS, ASTRONOMER FOR THE CINCINNATI OBSERVATORY.

Conceptual Explanations: Modeling Data with Functions

PHYSICS 107. Lecture 3 Numbers and Units

What's Up In Space? In the Center. Around the Sun. Around Earth. Space Facts! Places in Space

Chapter 3: Examining Relationships

Transcription:

Transforming the Data We are focusing on simple linear regression however, not all bivariate relationships are linear. Some are curved we will now look at how to straighten out two large families of curves. The Exponential Transformation Exponential functions are seen quite a bit out in the world well, actually they only do a good job of modeling a part of what we see; in reality, nothing can follow an exactly exponential function. But I digress An exponential function is of the form y = a x. The variable a is called the base, and is typically restricted to be a positive real number. Notice that x is the exponent. Below are a few examples of exponential function graphs. Figure 1 - Exponential Example 1 Figure 2 - Exponential Example 2 The essential feature of an exponential graph is that it rises quite sharply on one end, and flattens out completely on the other. Every point on an exponential function is of the form (x, a x ). The simplest line would have points of the form (x, x). What could we do to change (x, a x ) into (x, x)? How do we get the x out of the exponent? Logarithms! Specifically, take the logarithm of the y-coordinate (the response variable). Now, when I say logarithm, I mean natural logarithm, since that's the most important kind. It is probably the case that you have been taught to use common logarithms which will work, but they're just so common If we take the logarithm of the response variable, then plot the new y-coordinates against the original x-coordinates, then the points will be of the form (x, x ln(a)). This is akin to the graph with coordinates (x, ax), which is a line (with slope a). If this new plot using the transformed response variable looks linear, then we can perform linear regression on it. The resulting regression equation will be ln ( y ˆ) = a+ bx, which takes in a value of the explanatory variable and spits out the log of the predicted response. HOLLOMAN S AP STATISTICS APS NOTES 04, PAGE 1 OF 7

For example, perhaps you are told that a regression of y vs. x was performed, and the least squares line is yˆ = 5.218 0. 197x. What is the prediction when x = 10? Plugging in 10 for x produces a value of 3.428, but this isn't the predicted response it's the square root of the predicted response! You've got to square that to get the actual prediction of 10.550. An Example [1.] A classic example the length of a year for a planet, based on its distance from the sun. Here are the data: Table 1 - Orbital Data Distance (millions of miles) Year (# of Earth-years) 36 0.24 67 0.61 93 1 142 1.88 484 11.86 887 29.46 1784 84.07 2796 164.82 3666 247.68 First of all, let's see that the data have a curved relationship. Solar System Year Lengths Year Length (Earth Years) 0 50 100 200 0 1000 2000 3000 Distance from Sol (millions of miles) Figure 5 - Scatterplot of Orbital Data Definitely curved. Don't believe me? Look at the residuals. HOLLOMAN S AP STATISTICS APS NOTES 04, PAGE 3 OF 7

Residuals -20-10 0 10 20 Residual Plot 0 50 100 150 200 Figure 6 - Residual Plot of Orbital Data Predicted Year Length Yeah that's bad. But what transformation should be applied? This can't bottom off that would force us to accept negative distances. Thus, the power transformation is indicated. Here is the new plot: Power Transformation Ln(Years) -1 1 2 3 4 5 Figure 7 - Power Transformation Scatterplot 4 5 6 7 8 Ln(Distance) Looks pretty good. Linear, with strong positive association. There appears to be a gap between 5 and 6 (the asteroid belt!). The correlation is 1! 100% of the variation in the natural log of year length can be explained by the linear regression with the natural log of distance. Check that fit! HOLLOMAN S AP STATISTICS APS NOTES 04, PAGE 4 OF 7

Power Transformation Ln(Years) -1 0 1 2 3 4 5 4 5 6 7 8 Ln(Distance) Figure 8 - Power Transformation Least Squares Line OK, we'd better check the residuals. One residual plot, coming up. Residual Plot - Power Transformation Residuals -0.001 0.001-1 0 1 2 3 4 5 Ln(Predicted Year) Figure 9 - Power Transformation Residual Plot Nice and random. Now for a check of normality in the residuals. HOLLOMAN S AP STATISTICS APS NOTES 04, PAGE 5 OF 7

Residual Distribution Frequency 0 1 2 3 4-0.002-0.001 0.000 0.001 0.002 0.003 Residuals Figure 10 - Histogram of Residuals for Power Transformation Hmmm perhaps I'll look at the normal probability plot, too. Normal Probability Plot Sample Quantiles -0.001 0.001-1.5-1.0-0.5 0.0 0.5 1.0 1.5 Theoretical Quantiles Figure 11 - Normal Probability Plot for Power Transformation Not bad. Looks like we've got a model! ln( yˆ) = 6.8046+ 1.5008 ln( x), where ŷ is the predicted year length, and x is the distance from Sol. So let's use this model to predict the year length of a planet that doesn't exist. The halfway point between Mars and Jupiter is around 313 million miles from Sol. What will this model predict for a year length if a planet occupied this position? Letting x = 313, we get ln( y ˆ) = 6.8046+ 1.5008 ln(313) ln( y ˆ) = 6.8046+ 1.5008 5.7462 ln( y ˆ) = 6.8046+ 8.6238 ln( y ˆ) = 1.8192 HOLLOMAN S AP STATISTICS APS NOTES 04, PAGE 6 OF 7

ˆ 8192 1. y = e = 6.167 So our model predict a year length of 6.167 years. (in fact, the asteroid Ceres orbits at a distance of about 418 million miles, and has a period of 4.6 years oops? Well, Ceres isn't a planet). HOLLOMAN S AP STATISTICS APS NOTES 04, PAGE 7 OF 7