MATH CRASH COURSE
GRA6020 SPRING 2012

STEFFEN GRØNNEBERG

Contents
1. Basic stuff concerning equations and functions
2. Sums, with the Greek letter Sigma (Σ)
2.1. Why sums are so important to us
2.2. Assignments leading up to understanding Σ
3. Log, exp and their relations

As can be seen from the above table of contents, this note concerns three themes. The first theme, on absolute basics, is partly outsourced to a book chapter that's uploaded to Its Learning. The second and third sections include some motivating text and concrete math assignments. I have also uploaded solutions to these problems on Its Learning, so if you're really stuck, it's probably best to just look at the solution for help.

As I said in my first lecture: This is not a math course, but we will use some basic math, and you will need to understand enough to, for example, not go blank if you see the formula for covariance. We have started working towards understanding the general SEM model, which is a really useful and nice extension of linear regression. The covariance matrix is the basic building block of fitting SEMs, and you will need to understand what the covariance matrix is, and how one would go about calculating it. The second section is dedicated to achieving this goal.

You will also need to be able to do very basic calculations concerning log and the number $e$. When we worked with logistic regression, we defined the model in terms of $\operatorname{logit}(p) = Z$, where

$$\operatorname{logit}(p) = \log \frac{p}{1 - p}.$$

We showed that this means that

$$p = \frac{1}{1 + e^{-Z}}.$$

Understanding how these quantities are calculated is very close to understanding what the above symbols mean, and in order to understand this, you will need to do some basic assignments with log and $e$ if this is not part of your previous math education.

The original meaning of a crash course was a learning-by-doing approach when you absolutely needed to learn something right away in emergency situations.
For example, if a woman cannot make it to the hospital to have a baby, her husband might be in an emergency situation where he must deliver the baby himself. With help from a 911 operator, the husband may suddenly find himself taking a crash course in obstetrics.

We'll need the math described above right now, and the two hours we'll spend going through these topics together will simply not be enough to replace a good math education. A math education should (and can!) be stimulating, fun and entertaining. Unfortunately, very few would describe their own math education as fun and entertaining, and math classes are often quite boring. However, we have a huge advantage over high school courses: We want to use basic math for some really cool applications. All the themes we have in GRA6020 are extremely useful and are used in core business decisions all over the world. They are also used in important scientific investigations in a variety of fields, including managerial, political and marketing research. Try to keep this in mind when you are struggling with how the summation sign works!

In order to be time-effective, we must prioritize going through the absolute essentials, which means focusing on somewhat tedious computations. We do not have the time required to discover the rather surprising joy of doing mathematics. However, I will try as hard as I can to make this crash course more pleasant than an improvised childbirth!

Finally, please skip the assignments that try to teach you stuff you already know.

Date: March 4, 2012.

1. Basic stuff concerning equations and functions

I've uploaded Appendix B from the book Even You Can Learn Statistics by Levine & Stephan to Its Learning.

(A) Do the Assessment Quiz.
(B) The answers to the assessment quiz are on the final page. Check your score, and see if you can figure out how to do the points you didn't get.
(C) If you managed all or most of the points, you can now go to the next section of this note. If not, read very carefully through the remaining parts of the appendix (the whole thing!), and pay extra attention to the sections describing points that you did not get.
(D) Try to do the quiz again. If there are still questions you cannot get, you must figure out how to do them. Either ask a more math-savvy friend, or pick up a math book. If you're quite scared of math, try to google for books that you may like and which are written in a friendly manner. If you know Norwegian, I can recommend Mattesprettboka.
(E) Check out Khan Academy (http://www.khanacademy.org/). The site has a rather full section on math tutorials and seems to cover all the stuff we will work with below.

Assignment 1. The equation for a straight line is given by $y = a + bx$. This relation can also be written as $f(x) = a + bx$. A function is said to pass through a point $(x_1, y_1)$ if $f(x_1) = y_1$. For example, the straight line $f(x) = 1 + 2x$ goes through the point $(1, 3)$. This is because $f(1) = 1 + 2 \cdot 1 = 3$. In this example, we have $a = 1$, $b = 2$ and $x_1 = 1$, $y_1 = 3$.
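The definition of "passing through a point" can be checked numerically. The following is a minimal Python sketch using the example from the text; the function names `line` and `passes_through` and the rounding tolerance `tol` are illustrative assumptions and not part of the course material.

```python
def line(x, a, b):
    """Evaluate the straight line f(x) = a + b*x."""
    return a + b * x


def passes_through(a, b, x1, y1, tol=1e-12):
    """Return True if f(x1) equals y1, up to a small rounding tolerance (assumed)."""
    return abs(line(x1, a, b) - y1) <= tol


# The example from the text: f(x) = 1 + 2x and the point (1, 3).
print(line(1, a=1, b=2))           # f(1) = 1 + 2*1 = 3
print(passes_through(1, 2, 1, 3))  # True: (1, 3) lies on the line
print(passes_through(1, 2, 1, 4))  # False: (1, 4) does not
```

The tolerance only matters when the numbers are not whole; with decimals, computers round slightly, so an exact equality test can fail even when the point is on the line.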
(A) Does $f(x) = 1 + 2x$ go through the point $(0, 1)$?
(B) Does $f(x) = 1 + 2x$ go through the point $(-5, -11)$ or $(-5, -9)$?
(C) A straight line is completely determined by two points (try to imagine why this is so). In the two previous questions, you found two points that the line $f(x) = 1 + 2x$ goes through. Plot these two points on a graph, and connect them with a line. At the start of this assignment, I showed that the line goes through the point $(1, 3)$. Plot this point on your graph, and confirm that it lies on the line.
(D) In simple linear regression, one fits lines to data to describe linear patterns in datasets with one dependent and one independent variable. Suppose we have used SPSS to fit a straight line to data, but that we only see the line plotted together with the data in a scatter plot. We will here see how we can find the equation for the straight line just by looking at this plot. Suppose that we see that the line goes through $(-1, 4)$ and $(1, 5)$. Find the equation of the line.
(E) Suppose that we instead see that the line goes through $(-1, 0)$ and $(1, 10)$. Find the equation of the line.
(F) The two previous calculations involve unnecessarily much work! The steps involved are almost the same, so instead of doing the same steps every time, we can solve the problem once and for all. Suppose we see that at $x = x_1$, the line is at $y = y_1$, and at $x = x_2$, the line is at $y = y_2$. That is, our line must be characterized by the two equations

$$y_1 = a + b x_1 \quad\text{and}\quad y_2 = a + b x_2. \tag{1}$$

Show that $a$ and $b$ are given by

$$a = \frac{y_1 x_2 - y_2 x_1}{x_2 - x_1} \quad\text{and}\quad b = \frac{y_2 - y_1}{x_2 - x_1}.$$

That is, start with the equations in (1), and use the rules of basic algebra in order to find the above values for $a$ and $b$.
(G) Check that you get the same equations for the two lines in (D) and (E) by using the formula you found in (F).

2. Sums, with the Greek letter Sigma (Σ)

If the summation sign scares you, you should skip the following subsection, and go directly to Section 2.2. You can then return to Section 2.1 afterwards.

2.1. Why sums are so important to us. The least squares regression line in simple linear regression is the line $\hat{Y}_i = a + b X_i$, where $a$ and $b$ are the values that make

$$\sum e_i^2 = \sum (Y_i - \hat{Y}_i)^2$$
the smallest. Indeed, in order to find $a$ and $b$, it would appear that we need to try every possible combination of $a$ and $b$ and check whether it makes $\sum e_i^2$ the smallest. However, math saves the day, and shows that the answer is exactly

$$b = r \frac{s_y}{s_x}, \qquad a = \bar{Y} - b \bar{X}.$$

Here, $s_x$ and $s_y$ are the standard deviations of $X_1, X_2, \ldots, X_n$ and $Y_1, Y_2, \ldots, Y_n$ respectively. They are given by

$$s_x = \sqrt{\frac{1}{n-1} \sum (X_i - \bar{X})^2}, \qquad s_y = \sqrt{\frac{1}{n-1} \sum (Y_i - \bar{Y})^2},$$

where

$$\bar{X} = \frac{1}{n} \sum X_i, \qquad \bar{Y} = \frac{1}{n} \sum Y_i$$

are the averages of the X and Y observations. Also, $r$ is the correlation between X and Y, given by

$$r = \frac{1}{n-1} \sum \left(\frac{X_i - \bar{X}}{s_x}\right) \left(\frac{Y_i - \bar{Y}}{s_y}\right),$$

which is between $-1$ and $1$ and measures the degree and direction of linear dependence.

This means that if we know $r$, $s_y$, $s_x$, $\bar{Y}$ and $\bar{X}$, we can put these numbers into the above expressions for $a$ and $b$. If we do not care about $a$, we only need to know $r$, $s_y$ and $s_x$. These numbers can be found from the covariance matrix.

We will care very much about the so-called covariance matrix in our efforts towards understanding Structural Equation Models. The covariance matrix is defined in terms of different sums of the data, and was given in the fifth lecture. In path models, the averages and the covariance matrix are also all that is required to estimate the model. Besides the relevance for linear regression, this was our motivation for studying covariance matrices: Lisrel only requires a covariance table (and, if intercepts are to be included, the means) from the data to fit path models (and, it turns out, other SEMs).

2.2. Assignments leading up to understanding Σ. A central operation in statistics is to add numbers. Therefore, adding numbers has its own special symbol, namely Σ, the Greek capital letter Sigma. This symbol simply means "add these numbers together". Given numbers $X_1, X_2, \ldots, X_n$, their sum can be written $\sum X_i$. That is,

$$\sum X_i = X_1 + X_2 + \cdots + X_n.$$

If one would like to specify exactly how many terms are involved in the sum, one can write $\sum_{i=1}^{n} X_i$, and we again have

$$\sum_{i=1}^{n} X_i = X_1 + X_2 + \cdots + X_n.$$
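For readers who find code easier to read than symbols, the summation sign can be thought of as a simple loop. The following is a minimal Python sketch; the function name `sigma`, its parameter `n_terms` and the numbers in `X` are hypothetical and chosen only for illustration.

```python
def sigma(values, n_terms=None):
    """Add the first n_terms numbers in values; add all of them if n_terms is None."""
    if n_terms is None:
        n_terms = len(values)
    total = 0
    for i in range(1, n_terms + 1):  # i = 1, 2, ..., n_terms, as in the formula
        total += values[i - 1]       # Python lists start at index 0, so X_i is values[i - 1]
    return total


X = [4, 7, 10, 2]   # hypothetical numbers X_1, X_2, X_3, X_4
print(sigma(X))     # X_1 + X_2 + X_3 + X_4 = 23
print(sigma(X, 2))  # only the first two terms: X_1 + X_2 = 11
```

Python's built-in `sum(X)` does the same job as `sigma(X)`; the loop is written out here only to show what the symbol asks you to do.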
Assignment 2.
(A) Suppose we have three numbers we would like to add. They are 1, 2, 5. We know that their sum is $1 + 2 + 5 = 3 + 5 = 8$. However, the summation symbol can only be expressed in terms of symbols together with some index $i$. Let's write

$$X_1 = 1, \quad X_2 = 2, \quad X_3 = 5.$$

Use this encoding to write the sum of 1, 2, 5 using the summation symbol.
(B) Suppose that all you really want is to write the sum of 1 and 2. Write this using the summation symbol and the same encoding as in the previous point by explicitly stating the number of terms to include.

Assignment 3. The summation symbol is often used not directly on the original numbers $X_1, X_2, \ldots, X_n$, but on operations on these numbers. For example, if we want to calculate the sum of the squares of some numbers $X_1, X_2, \ldots, X_n$, we could either write

$$X_1^2 + X_2^2 + X_3^2 + X_4^2 + \cdots + X_n^2,$$

but we could also simply write

$$\sum X_i^2.$$

The formula $\sum X_i^2$ is understood as "after squaring all the X-numbers, sum the result".
(A) Suppose

$$X_1 = 1, \quad X_2 = 2, \quad X_3 = 5,$$

calculate

$$\sum X_i^2 = X_1^2 + X_2^2 + X_3^2.$$

(B) Suppose that $X_i = i^2$. This means that

$$X_1 = 1^2 = 1, \quad X_2 = 2^2 = 4, \quad X_3 = 3^2 = 9$$

and so on. Find

$$\sum_{i=1}^{3} X_i^2.$$

(C) We could really do any type of operation inside a summation symbol. For example,

$$\sum \sqrt{X_i}$$
means "after taking the square root of all the X-numbers, sum the result". Suppose that $X_i = i^2$. Find

$$\sum_{i=1}^{5} \sqrt{X_i}.$$

Assignment 4. The standard deviation of observations $X_1, X_2, \ldots, X_n$ has the special symbol $s_x$ and is given by the formula

$$s_x = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2}.$$

It can be calculated as follows:
i. First find the average, $\bar{X}$.
ii. Then calculate $(X_i - \bar{X})^2$ for each observation, that is, first subtract the mean from each observation and then square the result.
iii. Sum each calculated $(X_i - \bar{X})^2$ and divide by $n - 1$. The standard deviation is then the square root of the result.

Now suppose

$$X_1 = 1.5, \quad X_2 = 3.5, \quad X_3 = 1.1, \quad X_4 = 0.1.$$

Calculate the standard deviation of these observations by following the above description.

Assignment 5. The correlation between two types of observations $X_1, X_2, \ldots, X_n$ and $Y_1, Y_2, \ldots, Y_n$ has the special symbol $r$ and is given by the formula

$$r = \frac{1}{n-1} \sum \left(\frac{X_i - \bar{X}}{s_x}\right) \left(\frac{Y_i - \bar{Y}}{s_y}\right).$$

It can be calculated as follows:
i. First find the averages $\bar{X}$, $\bar{Y}$ and the standard deviations $s_x$, $s_y$.
ii. Then calculate

$$\left(\frac{X_i - \bar{X}}{s_x}\right) \left(\frac{Y_i - \bar{Y}}{s_y}\right)$$

for $i = 1, 2, \ldots, n$.
iii. Sum each calculated

$$\left(\frac{X_i - \bar{X}}{s_x}\right) \left(\frac{Y_i - \bar{Y}}{s_y}\right).$$

The correlation is then this sum divided by $n - 1$.

Now suppose

$$X_1 = 1.5, \quad X_2 = 3.5, \quad X_3 = 1.1, \quad X_4 = 0.1$$
$$Y_1 = 2.4, \quad Y_2 = 1.2, \quad Y_3 = 1.5, \quad Y_4 = 2.6.$$

(A) Calculate the correlation between these two sets of observations by following the above description.
(B) The covariance between these two sets of observations is given by $s_{x,y} = r s_x s_y$. Find the covariance between the X-observations and the Y-observations.
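The step-by-step recipes in Assignments 4 and 5 translate almost line by line into code. The following is a minimal Python sketch under stated assumptions: the function names and the small data set are hypothetical (the data are deliberately not the numbers from the assignments), and the code is meant as a way to check hand calculations, not as a replacement for doing them.

```python
import math


def mean(values):
    """Average: the sum of the observations divided by how many there are."""
    return sum(values) / len(values)


def std_dev(values):
    """Standard deviation with the n - 1 divisor (steps i-iii of Assignment 4)."""
    x_bar = mean(values)                                 # step i
    squared_devs = [(x - x_bar) ** 2 for x in values]    # step ii
    return math.sqrt(sum(squared_devs) / (len(values) - 1))  # step iii


def correlation(xs, ys):
    """Correlation r (steps i-iii of Assignment 5)."""
    x_bar, y_bar = mean(xs), mean(ys)                    # step i
    s_x, s_y = std_dev(xs), std_dev(ys)
    products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys)]                 # step ii
    return sum(products) / (len(xs) - 1)                 # step iii


def covariance(xs, ys):
    """Covariance via the relation s_xy = r * s_x * s_y."""
    return correlation(xs, ys) * std_dev(xs) * std_dev(ys)


# Hypothetical data, chosen so the arithmetic is easy to follow by hand.
X = [1.0, 2.0, 3.0, 4.0]
Y = [2.0, 1.0, 4.0, 3.0]

print(std_dev(X))
print(correlation(X, Y))
print(covariance(X, Y))
```

For this made-up data set, both standard deviations equal the square root of 5/3, the correlation is 0.6 and the covariance is 1, up to rounding in the last decimals.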
Assignment 6. After a freak accident, the world is in a post-apocalyptic state, and all computers are gone. The manager of a newly founded and somewhat opportunistic black-market company has enslaved you to do tedious calculations. The advertising figures and sales figures for each year since the apocalypse (in £1000) are tabulated below.

| Year number | Advert cost | Sales income |
|-------------|-------------|--------------|
| 1           | 3.3         | 18           |
| 2           | 2.5         | 16.5         |
| 3           | 1.8         | 14.5         |
| 4           | 4.2         | 22           |
| 5           | 2.9         | 19           |

Table 1. Campaign data

(A) Find the average advert cost and the average sales income for all of the five years.
(B) Imagine that you were given the table below instead of the table with the data. Write down exactly the same expression you started the calculation with in (A), except that you now use symbols instead of numbers.

| Year number | Advert cost | Sales income |
|-------------|-------------|--------------|
| 1           | $X_1$       | $Y_1$        |
| 2           | $X_2$       | $Y_2$        |
| 3           | $X_3$       | $Y_3$        |
| 4           | $X_4$       | $Y_4$        |
| 5           | $X_5$       | $Y_5$        |

Table 2. Campaign data

(C) Show that the expression you found in (B) for the average advert cost is the same as

$$\bar{X} = \frac{1}{5} \sum X_i.$$

(D) Find the profit for each year (sales income minus advert cost; your boss uses slaves, so he doesn't have any other expenses). Using these numbers, calculate the mean profit for all years. Explain why you could also find the mean profit for all five years by $\bar{Y} - \bar{X}$, and verify that this indeed becomes the same number.
(E) The boss would now like to know the summed income of all the five years, but he wants to hear it directly in pounds, and not in thousands of pounds. Why can you just multiply the number you calculated in (A) by 5000 to find this?
(F) How much do the advertising cost and sales income vary? Find the variance of the yearly advertising costs and sales incomes.
(G) Find the correlation between advertising cost and sales income. How would you interpret this value?
(H) Draw a plot of the data by hand and draw the straight line that you think fits the observations best.
(I) Your boss is quite simply not satisfied with the line you found, and requires you to find the OLS line given by $\hat{Y} = a + bX$, where

$$b = r \frac{s_y}{s_x}, \qquad a = \bar{Y} - b \bar{X}.$$

Draw this line on the plot with the data and your manually drawn line.

Assignment 7. In this assignment we will show that the following rules are obeyed by the summation sign.
i. For a number $a$ and numbers $X_1, X_2, \ldots, X_n$, we have

$$\sum (a X_i) = a \sum X_i.$$

That is, one can move constants within the summation sign to the outside. Here, constants means numbers that do not change with $i$.
ii. For numbers $X_1, X_2, \ldots, X_n$ and $Y_1, Y_2, \ldots, Y_n$ we have that

$$\sum (X_i + Y_i) = \left(\sum X_i\right) + \left(\sum Y_i\right).$$

This is a way of splitting (and also combining!) summation signs.
iii. For two numbers $a$ and $b$, and numbers $X_1, X_2, \ldots, X_n$ and $Y_1, Y_2, \ldots, Y_n$, we have that

$$\sum (a X_i + b Y_i) = a \sum X_i + b \sum Y_i.$$

This is a combination of the two rules above.

To illustrate what I want you to do, I will show the first rule here. We know that

$$\sum (a X_i) = a X_1 + a X_2 + \cdots + a X_n.$$

By the basic rules of algebra, we can factor $a$ outside of the sum, which gives

$$a X_1 + a X_2 + \cdots + a X_n = a (X_1 + X_2 + \cdots + X_n).$$

As $X_1 + X_2 + \cdots + X_n$ can be written as $\sum X_i$, we conclude that

$$\sum (a X_i) = a \sum X_i.$$

(A) Show rule (ii), but not with more details than my illustration above.
(B) Show rule (iii), but not with more details than my illustration above.

Assignment 8. Go through the notes of the fifth lecture, and make sure you understand all the steps when we showed that

$$r = \frac{s_{x,y}}{s_x s_y}$$

and $s_{x,y} = s_{y,x}$.

3. Log, exp and their relations

A good introduction to the exponential function and the logarithm can be found in the book chapter from Calculus by Freilich & Greenleaf uploaded to Its Learning.

Assignment 9. Find (solve for) $x$ for each of the following equations.
(A) $2^x = 8$
(B) $2^x = 32$
(C) $4^x = 256$
(D) $3^{5x - 10} = 1$
(E) $e^x = 10$
(F) $e^{2x} = 20$
(G) $\log x = 1$
(H) $\log 3x = 5$
(I) $4^{1/x} = 2$
(J) $(4.5)^{2x} = 18$

Assignment 10. A researcher is interested in how the variables GRE (Graduate Record Exam scores) and GPA (Grade Point Average) affect admission into a certain graduate school. The response variable, admit/don't admit, is a 0/1-variable, where 1 signifies admission. A logistic regression model was fitted in SPSS using GRE and GPA as covariates. Figure 1 gives part of the SPSS output.

Figure 1. SPSS output for the reduced model

(A) Find the estimated probability for an individual with GRE = 400 and GPA = 2.6 of getting admission into the graduate school.
(B) Find the estimated probability for an individual with GRE = 200 and GPA = 2.6 of getting admission into the graduate school.
(C) The best possible GPA is 4.0. Based on the fitted model, is it possible to have at least a 50% admission probability when GRE = 200?
(D) How large must GRE be to be able to get at least a 50% admission probability with the best possible GPA?

Department of Economics, BI Norwegian School of Management, Nydalsveien 37, 0484 Oslo, Norway
E-mail address: Steffen.Gronneberg@bi.no