MATH10001 Mathematical Workshop
Graph Fitting Project part 2

Polynomial models

Modelling a set of data with a polynomial curve can be convenient because polynomial functions are particularly easy to differentiate and integrate.

Theorem (polynomial interpolation). Given a collection of data points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ where $x_i \neq x_j$ for all $i \neq j$, there exists a unique polynomial $p(x)$ of degree at most $n - 1$ such that $p(x_i) = y_i$ for all $i = 1, \ldots, n$.

This means that the graph of $y = p(x)$ passes through all the data points. Even though this creates a perfectly fitting model for our data points, the method does have its limitations.

Example. Find the quadratic function that passes through the points (1, 2), (2, 4) and (3, 5).

Let $p(x) = ax^2 + bx + c$ be the required quadratic. We have

$$a(1^2) + b(1) + c = 2$$
$$a(2^2) + b(2) + c = 4$$
$$a(3^2) + b(3) + c = 5.$$

Solving this system of simultaneous equations we get $a = -0.5$, $b = 3.5$ and $c = -1$, and so the required polynomial is $p(x) = -0.5x^2 + 3.5x - 1$.
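The simultaneous equations in the example can also be solved mechanically. The following is a minimal sketch in Python, using exact rational arithmetic from the standard library; the function names `solve_linear_system` and `interpolating_coefficients` are illustrative choices and are not part of the workshop material.

```python
from fractions import Fraction


def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    # Build the augmented matrix with exact rational entries.
    aug = [[Fraction(v) for v in row] + [Fraction(r)]
           for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and swap it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so that the pivot entry equals 1.
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def interpolating_coefficients(points):
    """Coefficients, highest power first, of the polynomial through the points."""
    n = len(points)
    # Each data point (x, y) gives one equation in the n unknown coefficients.
    matrix = [[Fraction(x) ** k for k in range(n - 1, -1, -1)]
              for x, _ in points]
    rhs = [y for _, y in points]
    return solve_linear_system(matrix, rhs)


points = [(1, 2), (2, 4), (3, 5)]
a, b, c = interpolating_coefficients(points)
print(a, b, c)  # -1/2 7/2 -1
```

The same function accepts any number of points with distinct x values, which is the situation covered by the theorem. The system grows with the number of points, so more data means more work.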
[Figure: graph of the quadratic model, x from 0 to 4.]

This model exactly fits the three data points. However, if we are given some extra data points they will not necessarily lie on this curve, and we would have to calculate a higher order polynomial to pass through all the points. This calculation would involve solving a larger system of simultaneous equations. Even though the model exactly fits the data points, it may give meaningless answers for points between the data points.

Example. The table below lists the area of seven countries and the total length of railway track:

  x = Area (1000s of square miles)    y = Railway (1000s of miles)
  1.4                                 2.7
  2.4                                 2.27
  7.1                                 3.31
  13.8                                3.39
  34.2                                3.81
  109.3                               4.88
  134                                 4.62

If we calculate the interpolation polynomial for this data we get a degree 6 polynomial $p(x)$. If we take an area of 50,000 square miles ($x = 50$), the model gives
y = 153320 miles! This shows there are problems with the model even though it fits the data perfectly.

Another approach is to use the method of least squares. This would involve finding a polynomial $p(x)$ that minimizes the expression

$$S = \sum_{i=1}^{n} (y_i - p(x_i))^2.$$

The advantage of this approach is that we can do the minimization for polynomials of different degrees and choose the best model.

Let $p(x) = ax^2 + bx + c$, and so

$$S = \sum_{i=1}^{n} (y_i - ax_i^2 - bx_i - c)^2.$$

We need to solve

$$\frac{\partial S}{\partial a} = \frac{\partial S}{\partial b} = \frac{\partial S}{\partial c} = 0.$$

This gives

$$\sum_{i=1}^{n} -2x_i^2 (y_i - ax_i^2 - bx_i - c) = 0$$
$$\sum_{i=1}^{n} -2x_i (y_i - ax_i^2 - bx_i - c) = 0$$
$$\sum_{i=1}^{n} -2 (y_i - ax_i^2 - bx_i - c) = 0.$$

Using the values of x and y above and solving we get $a = -0.0002$, $b = 0.0475$, $c = 2.6062$, and so our model is $p(x) = -0.0002x^2 + 0.0475x + 2.6062$.
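Dividing the three conditions by $-2$ and collecting terms gives a linear system in $a$, $b$ and $c$ (the normal equations), which can be solved numerically. The following is a minimal sketch in Python using only the standard library; the function names and the variable names `area` and `railway` are illustrative, and the data are the seven points from the railway table.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(map(float, row)) + [float(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Swap up the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Eliminate this column from the rows below the pivot.
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    # Back substitution, starting from the last row.
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][k] * solution[k] for k in range(r + 1, n))
        solution[r] = (aug[r][n] - known) / aug[r][r]
    return solution


def least_squares_polynomial(xs, ys, degree):
    """Coefficients, highest power first, minimising S = sum (y_i - p(x_i))^2."""
    powers = list(range(degree, -1, -1))
    # Normal equations: for each power p,
    #   sum over q of (sum_i x_i^(p+q)) * coeff_q = sum_i x_i^p * y_i.
    matrix = [[sum(x ** (p + q) for x in xs) for q in powers] for p in powers]
    rhs = [sum(x ** p * y for x, y in zip(xs, ys)) for p in powers]
    return solve_linear_system(matrix, rhs)


area = [1.4, 2.4, 7.1, 13.8, 34.2, 109.3, 134]          # 1000s of square miles
railway = [2.7, 2.27, 3.31, 3.39, 3.81, 4.88, 4.62]     # 1000s of miles
a, b, c = least_squares_polynomial(area, railway, 2)
print(round(a, 4), round(b, 4), round(c, 4))
```

Changing the `degree` argument repeats the minimization for a polynomial of a different degree, which is the flexibility described above. With `degree` equal to one less than the number of points, the fit coincides with the interpolation polynomial, up to rounding error.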
[Figure: graph of the least squares quadratic model, x from 0 to 140.]
Problem 3

(i) For the following data find a cubic polynomial model that exactly fits the data points.

  x   1   2   3   4
  y   2   3   2   4

(ii) For the same data produce a quadratic model using the least squares method.

(iii) Plot both models together with the data points (you can use MATLAB to do this). Give a reason why the least squares quadratic may be a better model than the interpolation polynomial.

Problem 4

For a cubic model using the least squares method we need to minimize

$$S = \sum_{i=1}^{n} [y_i - (ax_i^3 + bx_i^2 + cx_i + d)]^2.$$

This involves solving

$$\frac{\partial S}{\partial a} = \frac{\partial S}{\partial b} = \frac{\partial S}{\partial c} = \frac{\partial S}{\partial d} = 0.$$

Determine these partial derivatives and find the least squares cubic model for the following data.

  x   0   1   2   3   4
  y   2   0   1   1   1
Project Report

The assessment for this project is by an individual project report. The report should contain your solutions to the four problems and the homework from part 1. Your report should be well presented and the problem solutions should be clearly explained. Even though you have worked as a group on this project, the report should be all your own work. There are marks for the clarity as well as the correctness of your mathematical arguments.

Please hand in your report to your postgraduate facilitator at the start of your Workshop class in week 12 (week commencing 12th December). You should attach a cover sheet to your report and put your group number on the front page.

There are 30 marks for this project.

(a) 20 marks for the solutions to the problems.
(b) 3 marks for the homework.
(c) 2 marks for the presentation of your report (clear explanations and layout).
(d) 1/4 of the average mark for your group for (a), giving a mark out of 5.

Any student who does not attend one group session, without good reason, will get half the marks for (d). If a student misses both group sessions, without good reason, they will get 0 marks for (d). Please notify the School as soon as possible if you miss a session and fill in a Self Certification form, available from the Alan Turing Building reception.