MATH CRASH COURSE GRA6020 SPRING 2012

STEFFEN GRØNNEBERG
Date: March 4, 2012.

Contents
1. Basic stuff concerning equations and functions
2. Sums, with the Greek letter Sigma (Σ)
2.1. Why sums are so important to us
2.2. Assignments leading up to understanding Σ
3. Log, exp and their relations

As can be seen from the above table of contents, this note concerns three themes. The first theme, on absolute basics, is partly outsourced to a book chapter that's uploaded to Its Learning. The second and third sections include some motivating text and concrete math assignments. I have also uploaded solutions for these problems to Its Learning, so if you're really stuck, it's probably best to look at the solution for help.

As I said in my first lecture: this is not a math course, but we will use some basic math, and you will need to understand enough to, for example, not go blank when you see the formula for covariance. We have started working towards understanding the general SEM model, which is a really useful and nice extension of linear regression. The covariance matrix is the basic building block of fitting SEMs, and you will need to understand what the covariance matrix is and how one would go about calculating it. The second section is dedicated to achieving this goal.

You will also need to be able to do very basic calculations concerning log and the number e. When we worked with logistic regression, we defined the model in terms of $\operatorname{logit}(p) = Z$, where

$$\operatorname{logit}(p) = \log \frac{p}{1 - p}.$$

We showed that this means that

$$p = \frac{1}{1 + e^{-Z}}.$$

Understanding how these quantities are calculated is very close to understanding what the above symbols mean, and in order to understand this, you will need to do some basic assignments with log and e if this is not part of your previous math education.

The original meaning of a crash course was a learning-by-doing approach for when you absolutely needed to learn something right away in an emergency. For example, if a woman cannot make it to the hospital to have a baby, her husband might be in an emergency situation where he must deliver the baby himself. With help from a 911 operator, the husband may suddenly find himself taking a crash course in obstetrics.

We'll need the math described above right now, and the two hours we'll spend going through these topics together will simply not be enough to replace a good math education. A math education should (and can!) be stimulating, fun and entertaining. Unfortunately, very few would describe their own math education as fun and entertaining, and math classes are often quite boring. However, we have a huge advantage over high school courses: we want to use basic math for some really cool applications. All the themes we have in GRA6020 are extremely useful and are used in core business decisions all over the world. They are also used in important scientific investigations in a variety of fields, including managerial, political and marketing research. Try to keep this in mind when you are struggling with how the summation sign works!

In order to be time-effective, we must prioritize going through the absolute essentials, which means focusing on somewhat tedious computations. We do not have the time required to discover the rather surprising joy of doing mathematics. However, I will try as hard as I can to make this crash course more pleasant than an improvised childbirth! Finally, please skip the assignments that try to teach you stuff you already know.

1. Basic stuff concerning equations and functions

I've uploaded Appendix B from the book Even You Can Learn Statistics by Levine & Stephan to Its Learning.

(A) Do the Assessment Quiz.
(B) The answers to the assessment quiz are on the final page. Check your score, and see if you can figure out how to do the questions you didn't get.
(C) If you managed all or most of the points, you can now go to the next section of this note. If not, read very carefully through the remaining parts of the appendix (the whole thing!), and pay extra attention to the sections describing points that you did not get.
(D) Try to do the quiz again. If there are still questions you cannot get, you must figure out how to do them. Either ask a more math-savvy friend, or pick up a math book. If you're quite scared of math, try to google for books that you may like and that are written in a friendly manner. If you know Norwegian, I can recommend Mattesprettboka.
(E) Check out Khan Academy (http://www.khanacademy.org/). The site has a rather full section on math tutorials and seems to cover all the stuff we will work with below.

Assignment 1. The equation for a straight line is given by

$$y = a + bx.$$

This relation can also be written as

$$f(x) = a + bx.$$

A function is said to pass through a point $(x_1, y_1)$ if $f(x_1) = y_1$. For example, the straight line $f(x) = 1 + 2x$ goes through the point $(1, 3)$. This is because $f(1) = 1 + 2 \cdot 1 = 3$. In this example, we have $a = 1$, $b = 2$ and $x_1 = 1$, $y_1 = 3$.
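If you know a little programming, the definition of "passing through a point" can be checked mechanically. The following is a minimal Python sketch using the example line $f(x) = 1 + 2x$ from the text. The helper names `line` and `passes_through` are made up for this illustration and are not part of the course material.

```python
def line(a, b):
    """Return the function f(x) = a + b*x."""
    def f(x):
        return a + b * x
    return f


def passes_through(f, x1, y1, tol=1e-9):
    """Check the definition: f passes through (x1, y1) if f(x1) equals y1."""
    return abs(f(x1) - y1) <= tol


f = line(1, 2)  # the example line f(x) = 1 + 2x

print(f(1))                     # value of the line at x = 1
print(passes_through(f, 1, 3))  # the point (1, 3) from the text
print(passes_through(f, 1, 4))  # a point that is not on the line
```

The small tolerance `tol` is only there because computers store decimal numbers approximately; for whole numbers like the ones above, an exact comparison would work just as well.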

(A) Does $f(x) = 1 + 2x$ go through the point $(0, 1)$?
(B) Does $f(x) = 1 + 2x$ go through the point $(-5, -11)$ or $(-5, -9)$?
(C) A straight line is completely determined by two points (try to imagine why this is so). In the two previous questions, you found two points that the line $f(x) = 1 + 2x$ goes through. Plot these two points on a graph, and connect them with a line. At the start of this assignment, I showed that the line goes through the point $(1, 3)$. Plot this point on your graph, and confirm this.
(D) In simple linear regression, one fits lines to data to describe linear patterns in datasets with one dependent and one independent variable. Suppose we have used SPSS to fit a straight line to data, but that we only see the line plotted together with the data in a scatter plot. We will here see how we can find the equation for the straight line just by looking at this plot. Suppose that we see that the line goes through $(-1, 4)$ and $(1, 5)$. Find the equation of the line.
(E) Suppose that we instead see that the line goes through $(-1, 0)$ and $(1, 10)$. Find the equation of the line.
(F) The two previous calculations involve unnecessarily much work! The steps involved are almost the same, so instead of doing the same steps every time, we can solve the problem once and for all. Suppose we see that at $x = x_1$, the line is at $y = y_1$, and at $x = x_2$, the line is at $y = y_2$. That is, our line must be characterized by the two equations

$$y_1 = a + bx_1 \quad \text{and} \quad y_2 = a + bx_2. \tag{1}$$

Show that $a$ and $b$ are given by

$$a = \frac{y_1 x_2 - y_2 x_1}{x_2 - x_1} \quad \text{and} \quad b = \frac{y_2 - y_1}{x_2 - x_1}.$$

That is, start with the equations in (1), and use the rules of basic algebra to find the above values for $a$ and $b$.
(G) Check that you get the same equations for the two lines in (D) and (E) by using the formula you found in (F).

2. Sums, with the Greek letter Sigma (Σ)

If the summation sign scares you, you should skip the following subsection and go directly to Section 2.2. You can then return to Section 2.1 afterwards.

2.1. Why sums are so important to us. We will very much care about the so-called covariance matrix in our efforts towards understanding Structural Equation Models. The covariance matrix is defined in terms of different sums of the data, and was given in the fifth lecture.

The least squares regression line in simple linear regression is the line

$$\hat{Y}_i = a + bX_i$$

where $a$ and $b$ are the values that make

$$\sum e_i^2 = \sum (Y_i - \hat{Y}_i)^2$$

the smallest. In order to find $a$ and $b$, it would appear that we need to go through every possible $a,b$-combination and check if it makes $\sum e_i^2$ the smallest. However, math saves the day, and shows that the answer is exactly

$$b = r \frac{s_y}{s_x}, \qquad a = \bar{Y} - b\bar{X}.$$

Here, $s_x$ and $s_y$ are the standard deviations of $X_1, X_2, \ldots, X_n$ and $Y_1, Y_2, \ldots, Y_n$ respectively. They are given by

$$s_x = \sqrt{\frac{1}{n-1} \sum (X_i - \bar{X})^2}, \qquad s_y = \sqrt{\frac{1}{n-1} \sum (Y_i - \bar{Y})^2}$$

where

$$\bar{X} = \frac{1}{n} \sum X_i, \qquad \bar{Y} = \frac{1}{n} \sum Y_i$$

are the averages of the X and Y observations. Also, $r$ is the correlation between X and Y, given by

$$r = \frac{1}{n-1} \sum \left( \frac{X_i - \bar{X}}{s_x} \right) \left( \frac{Y_i - \bar{Y}}{s_y} \right)$$

which is between $-1$ and $1$ and measures the degree and direction of linear dependence.

This means that if we know $r, s_y, s_x, \bar{Y}, \bar{X}$, we can put these numbers into the above expressions for $a$ and $b$. If we do not care about $a$, we only need to know $r, s_y, s_x$. These numbers can be found from the covariance matrix. In path models, the averages and the covariance matrix are also all that is required to estimate the model. Besides the relevance for linear regression, this was our motivation for studying covariance matrices: LISREL only requires a covariance table (and, if intercepts are to be included, the means) from the data to fit path models (and, it turns out, other SEMs).

2.2. Assignments leading up to understanding Σ. A central operation in statistics is to add numbers. Therefore, adding numbers has its own special symbol, namely the Greek capital letter Sigma, that is, $\sum$. This symbol simply means "add these numbers together". Given numbers $X_1, X_2, \ldots, X_n$, their sum can be written $\sum X_i$. That is,

$$\sum X_i = X_1 + X_2 + \cdots + X_n.$$

If one would like to specify exactly how many terms are involved in the sum, we can write $\sum_{i=1}^{n} X_i$, and we again have

$$\sum_{i=1}^{n} X_i = X_1 + X_2 + \cdots + X_n.$$
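For readers who know a little programming, the summation sign corresponds to a simple loop. The sketch below uses Python and a made-up list of numbers (not course data). The name `xs` is a hypothetical stand-in for $X_1, X_2, \ldots, X_n$.

```python
# Hypothetical numbers X_1, ..., X_n (chosen only for illustration).
xs = [4, 7, 1, 3]
n = len(xs)

# Sum over all i: X_1 + X_2 + ... + X_n, written as an explicit loop.
total = 0
for x in xs:
    total += x

# Python's built-in sum does the same thing.
assert total == sum(xs)

# Sum of only the first k terms, i.e. the sum from i = 1 to k.
# Python lists start counting at 0, so X_1 is xs[0].
k = 2
first_k = sum(xs[:k])

# The average is the sum divided by the number of terms.
mean = total / n

print(total, first_k, mean)
```

Writing the upper limit explicitly, as in $\sum_{i=1}^{k} X_i$, plays the same role as the slice `xs[:k]`: it says how many terms to include.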

Assignment 2.
(A) Suppose we have three numbers we would like to add. They are 1, 2, 5. We know that their sum is 1 + 2 + 5 = 3 + 5 = 8. However, the summation symbol can only be expressed in terms of symbols together with some index $i$. Let's write

$$X_1 = 1, \quad X_2 = 2, \quad X_3 = 5.$$

Use this encoding to write the sum of 1, 2, 5 using the summation symbol.
(B) Suppose that all you really want is to write the sum of 1 and 2. Write this using the summation symbol and the same encoding as in the previous point by explicitly stating the number of terms to include.

Assignment 3. The summation symbol is often used not directly on the original numbers $X_1, X_2, \ldots, X_n$, but on operations on these numbers. For example, if we want to calculate the sum of the squares of some numbers $X_1, X_2, \ldots, X_n$, we could either write

$$X_1^2 + X_2^2 + X_3^2 + X_4^2 + \cdots + X_n^2$$

but we could also simply write

$$\sum X_i^2.$$

The formula $\sum X_i^2$ is understood as "after squaring all the X-numbers, sum the result".
(A) Suppose $X_1 = 1$, $X_2 = 2$, $X_3 = 5$. Calculate

$$\sum X_i^2 = X_1^2 + X_2^2 + X_3^2.$$

(B) Suppose that $X_i = i^2$. This means that $X_1 = 1^2 = 1$, $X_2 = 2^2 = 4$, $X_3 = 3^2 = 9$ and so on. Find

$$\sum_{i=1}^{3} X_i^2.$$

(C) We could really do any type of operation inside a summation symbol. For example,

$$\sum \sqrt{X_i}$$

means "after taking the square root of all the X-numbers, sum the result". Suppose that $X_i = i^2$. Find

$$\sum_{i=1}^{5} \sqrt{X_i}.$$

Assignment 4. The standard deviation of observations $X_1, X_2, \ldots, X_n$ has the special symbol $s_x$ and is given by the formula

$$s_x = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2}.$$

It can be calculated as follows:
i. First find the average, $\bar{X}$.
ii. Then calculate $(X_i - \bar{X})^2$, that is, first subtract the mean from each observation and then square the result.
iii. Sum each calculated $(X_i - \bar{X})^2$ and divide by $n - 1$. The standard deviation is then the square root of this number.
Now suppose $X_1 = 1.5$, $X_2 = 3.5$, $X_3 = 1.1$, $X_4 = 0.1$. Calculate the standard deviation of these observations by following the above description.

Assignment 5. The correlation between two types of observations $X_1, X_2, \ldots, X_n$ and $Y_1, Y_2, \ldots, Y_n$ has the special symbol $r$ and is given by the formula

$$r = \frac{1}{n-1} \sum \left( \frac{X_i - \bar{X}}{s_x} \right) \left( \frac{Y_i - \bar{Y}}{s_y} \right).$$

It can be calculated as follows:
i. First find the averages $\bar{X}, \bar{Y}$ and the standard deviations $s_x, s_y$.
ii. Then calculate

$$\left( \frac{X_i - \bar{X}}{s_x} \right) \left( \frac{Y_i - \bar{Y}}{s_y} \right)$$

for $i = 1, 2, \ldots, n$.
iii. Sum each calculated

$$\left( \frac{X_i - \bar{X}}{s_x} \right) \left( \frac{Y_i - \bar{Y}}{s_y} \right).$$

The correlation is then this sum divided by $n - 1$.
Now suppose $X_1 = 1.5$, $X_2 = 3.5$, $X_3 = 1.1$, $X_4 = 0.1$ and $Y_1 = 2.4$, $Y_2 = 1.2$, $Y_3 = 1.5$, $Y_4 = 2.6$.
(A) Calculate the correlation between these two sets of observations by following the above description.
(B) The covariance between these two sets of observations is given by $s_{x,y} = r s_x s_y$. Find the covariance between the X-observations and the Y-observations.
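The step-by-step recipes in Assignments 4 and 5 translate almost line by line into code. The following Python sketch is one way to write them down. It uses made-up observations rather than the numbers from the assignments, and the function names `mean`, `std` and `correlation` are chosen for this illustration only. The last lines also plug the results into the expressions for $b$ and $a$ from Section 2.1.

```python
import math


def mean(values):
    """The average: the sum divided by the number of observations."""
    return sum(values) / len(values)


def std(values):
    """Sample standard deviation, following steps i-iii of Assignment 4."""
    m = mean(values)                                   # i. find the average
    squared_dev = [(v - m) ** 2 for v in values]       # ii. subtract mean, square
    return math.sqrt(sum(squared_dev) / (len(values) - 1))  # iii. sum, divide, root


def correlation(xs, ys):
    """Correlation r, following steps i-iii of Assignment 5."""
    mx, my = mean(xs), mean(ys)                        # i. averages ...
    sx, sy = std(xs), std(ys)                          #    ... and standard deviations
    products = [((x - mx) / sx) * ((y - my) / sy)      # ii. one product per i
                for x, y in zip(xs, ys)]
    return sum(products) / (len(xs) - 1)               # iii. sum, divide by n - 1


# Made-up observations, NOT the numbers from the assignments.
xs = [1.0, 2.0, 4.0, 5.0]
ys = [1.0, 3.0, 3.0, 5.0]

r = correlation(xs, ys)
cov = r * std(xs) * std(ys)    # covariance s_{x,y} = r * s_x * s_y
b = r * std(ys) / std(xs)      # least squares slope, as in Section 2.1
a = mean(ys) - b * mean(xs)    # least squares intercept, as in Section 2.1

print(r, cov, a, b)
```

Doing the assignments by hand first and only then comparing against a sketch like this is probably the best way to make sure each step of the formula is understood.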

Assignment 6. After a freak accident, the world is in a post-apocalyptic state, and all computers are gone. The manager of a newly founded and somewhat opportunistic black-market company has enslaved you to do tedious calculations. The advertising figures and sales figures for each year since the apocalypse (in £1000) are tabulated below.

Year number    Advert cost    Sales income
1              3.3            18
2              2.5            16.5
3              1.8            14.5
4              4.2            22
5              2.9            19
Table 1. Campaign data

(A) Find the average advert cost and the average sales income for all of the five years.
(B) Imagine that you were given the table below instead of the table with the data. Write down exactly the same expression you started the calculation with in (A), except that you now use symbols instead of numbers.

Year number    Advert cost    Sales income
1              X_1            Y_1
2              X_2            Y_2
3              X_3            Y_3
4              X_4            Y_4
5              X_5            Y_5
Table 2. Campaign data

(C) Show that the expression you found in (B) for the average advert cost is the same as

$$\bar{X} = \frac{1}{5} \sum X_i.$$

(D) Find the profit for each year (sales income minus advert cost; your boss uses slaves, so he doesn't have any other expenses). Using these numbers, calculate the mean profit for all years. Explain why you could also find the mean profit for all five years by $\bar{Y} - \bar{X}$, and verify that this indeed becomes the same number.
(E) The boss would now like to know the summed income of all the five years, but he wants to hear it directly in pounds, and not in thousands of pounds. Why can you just multiply the number you calculated in (A) by 5000 to find this?
(F) How much do the advertising cost and sales income vary? Find the variance of the yearly advertising costs and sales incomes.
(G) Find the correlation between advertising cost and sales income. How would you interpret this value?

(H) Draw a plot of the data by hand and draw the straight line that you think fits the observations best.
(I) Your boss is quite simply not satisfied with the line you found, and requires you to find the OLS line given by

$$\hat{Y} = a + bX$$

where

$$b = r \frac{s_y}{s_x}, \qquad a = \bar{Y} - b\bar{X}.$$

Draw this line on the plot with the data and your manually drawn line.

Assignment 7. In this assignment we will show that the following rules are obeyed by the summation sign.
i. For a number $a$ and numbers $X_1, X_2, \ldots, X_n$, we have

$$\sum (aX_i) = a \sum X_i.$$

That is, one can move constants within the summation sign to the outside. Here, "constants" means numbers that do not change with $i$.
ii. For numbers $X_1, X_2, \ldots, X_n$ and $Y_1, Y_2, \ldots, Y_n$, we have that

$$\sum (X_i + Y_i) = \left( \sum X_i \right) + \left( \sum Y_i \right).$$

This is a way of splitting (and also combining!) summation signs.
iii. For two numbers $a$ and $b$, and numbers $X_1, X_2, \ldots, X_n$ and $Y_1, Y_2, \ldots, Y_n$, we have that

$$\sum (aX_i + bY_i) = a \sum X_i + b \sum Y_i.$$

This is a combination of the two rules above.

To illustrate what I want you to do, I will show the first rule here. We know that

$$\sum (aX_i) = aX_1 + aX_2 + \cdots + aX_n.$$

By the basic rules of algebra, we can factor $a$ outside of the sum, which gives

$$aX_1 + aX_2 + \cdots + aX_n = a (X_1 + X_2 + \cdots + X_n).$$

As $X_1 + X_2 + \cdots + X_n$ can be written as $\sum X_i$, we conclude that

$$\sum (aX_i) = a \sum X_i.$$

(A) Show rule (ii), with no more detail than in my illustration above.
(B) Show rule (iii), with no more detail than in my illustration above.

Assignment 8. Go through the notes of the fifth lecture, and make sure you understand all the steps when we showed that

$$r = \frac{s_{x,y}}{s_x s_y}$$

and $s_{x,y} = s_{y,x}$.

3. Log, exp and their relations

A good introduction to the exponential function and the logarithm can be found in the book chapter from Calculus by Freilich & Greenleaf uploaded to Its Learning.

Assignment 9. Find (solve for) $x$ for each of the following equations.
(A) $2^x = 8$
(B) $2^x = 32$
(C) $4^x = 256$
(D) $3^{5x - 10} = 1$
(E) $e^x = 10$

(F) $e^{2x} = 20$
(G) $\log x = 1$
(H) $\log 3x = 5$
(I) $4^{1/x} = 2$
(J) $(4.5)^{2x} = 18$

Assignment 10. A researcher is interested in how the variables GRE (Graduate Record Exam scores) and GPA (Grade Point Average) affect admission into a certain graduate school. The response variable, admit/don't admit, is a 0/1 variable, where 1 signifies admission. A logistic regression model was fitted in SPSS using GRE and GPA as covariates. Figure 1 gives part of the SPSS output.

Figure 1. SPSS output for the reduced model

(A) Find the estimated probability for an individual with GRE = 400 and GPA = 2.6 of getting admission into the graduate school.
(B) Find the estimated probability for an individual with GRE = 200 and GPA = 2.6 of getting admission into the graduate school.
(C) The best possible GPA is 4.0. Based on the fitted model, is it possible to have at least a 50% admission probability when GRE = 200?
(D) How large must GRE be to be able to get at least a 50% admission probability with the best possible GPA?

Department of Economics, BI Norwegian School of Management, Nydalsveien 37, 0484 Oslo, Norway
E-mail address: Steffen.Gronneberg@bi.no
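As a supplement to Section 3, the relation between logit, log and e can be sketched in a few lines of Python. The SPSS output of Figure 1 is not reproduced in this text, so the coefficients below are hypothetical placeholders and not the fitted values. The names `logit` and `inverse_logit` are chosen for this illustration, and log is assumed to mean the natural logarithm, as in the definition of logit.

```python
import math


def logit(p):
    """logit(p) = log(p / (1 - p)), with log the natural logarithm."""
    return math.log(p / (1 - p))


def inverse_logit(z):
    """Solve logit(p) = z for p, which gives p = 1 / (1 + e^(-z))."""
    return 1 / (1 + math.exp(-z))


# Hypothetical coefficients, NOT the values from the SPSS output in Figure 1.
intercept = -4.0
coef_gre = 0.003
coef_gpa = 0.8

# A made-up applicant, different from the ones in Assignment 10.
gre, gpa = 500, 3.0
z = intercept + coef_gre * gre + coef_gpa * gpa
p = inverse_logit(z)
print(p)

# log and exp undo each other: if e^x = c, then x = log(c).
c = 5.0
x = math.log(c)
print(x, math.exp(x))
```

Note that `inverse_logit(0)` is exactly one half, so an admission probability of at least 50% corresponds to $Z \geq 0$.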