Tutorial VII: Linear Regression
Last updated 5/8/06 by G.G. Botte

Department of Chemical Engineering
ChE-0: Approaches to Chemical Engineering Problem Solving
MATLAB Tutorial VII
Linear Regression Using Least Square Method
(last updated 5/8/06 by GGB)

Objectives: These tutorials are designed to show the introductory elements for any of the topics discussed. In almost all cases there are other ways to accomplish the same objective, or higher level features that can be added to the commands below. Any text below appearing after the double prompt (>>) can be entered in the Command Window directly or in an m-file.

The following topics are covered in this tutorial:
Introduction
Procedure to perform linear regression in Matlab
Solved Problem using Matlab (guided tour)
Solved Problem using Excel (guided tour)

Introduction:
Regression of data consists of getting a mathematical expression that best fits all the data. That is, given a set of experimental data in which the dependent variable y is a function of x, the intention of regression is to determine an expression for:

$$y = f(x) \tag{1}$$

For example, a set of experimental data could be predicted by using the following expression:

$$y = a x + b \tag{2}$$

The objective of regression is to determine the values for the parameters a and b. Notice that in this case the unknowns (the variables to calculate) are a and b. Because the expression is linear in the unknown coefficients, the determination of the coefficients is known as Linear Regression. There are different methods to perform linear regression; the most common one is known as the Least Square Method. As shown on the diagram below, the least squares method minimizes the sum of the squared vertical distances between the points and the fitted line.

[Diagram: data points and the fitted line, plotted against x]

Procedure to Perform Linear Regression in Matlab:
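The objective is to determine the m parameters $a_1$, $a_2$, $a_3$, etc. in the equation

$$y = a_1 x_1 + a_2 x_2 + a_3 x_3 + \dots + a_m$$

given a set of n data points $(x_1, x_2, x_3, \dots, x_{m-1}, y)$. This is done by writing out the equation for each data point. This results in a set of n equations in m unknowns $a_1, a_2, a_3, \dots, a_m$:

$$\begin{aligned}
a_1 x_{1,1} + a_2 x_{2,1} + a_3 x_{3,1} + \dots + a_m &= y_1 \\
a_1 x_{1,2} + a_2 x_{2,2} + a_3 x_{3,2} + \dots + a_m &= y_2 \\
a_1 x_{1,3} + a_2 x_{2,3} + a_3 x_{3,3} + \dots + a_m &= y_3 \\
&\;\;\vdots \\
a_1 x_{1,n} + a_2 x_{2,n} + a_3 x_{3,n} + \dots + a_m &= y_n
\end{aligned}$$

where the first subscript on x identifies the independent variable and the second subscript signifies the data point. In matrix notation this is expressed as:

$$\begin{bmatrix}
x_{1,1} & x_{2,1} & x_{3,1} & \cdots & 1 \\
x_{1,2} & x_{2,2} & x_{3,2} & \cdots & 1 \\
x_{1,3} & x_{2,3} & x_{3,3} & \cdots & 1 \\
\vdots & \vdots & \vdots & & \vdots \\
x_{1,n} & x_{2,n} & x_{3,n} & \cdots & 1
\end{bmatrix}
\begin{Bmatrix} a_1 \\ a_2 \\ a_3 \\ \vdots \\ a_m \end{Bmatrix}
=
\begin{Bmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_n \end{Bmatrix}$$

where the vector {a} contains the UNKNOWNS. That is,

$$[x]\{a\} = \{y\} \tag{3}$$

In order to perform linear regression in Matlab, the objective is to determine the vector {a} from Eq. (3). This is done by using the formula given below:

$$\{a\} = [x] \backslash \{y\} \tag{4}$$

Notice that what is called matrix [x] was built by combining each of the individual independent variable column vectors $\{x_1\}$, $\{x_2\}$, $\{x_3\}$ and a unit column vector (a vector whose constituents are all 1), as shown in the schematic representation given below: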
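$$\begin{aligned}
a_1 x_{1,1} + a_2 x_{2,1} + a_3 x_{3,1} + \dots + a_m &= y_1 \\
a_1 x_{1,2} + a_2 x_{2,2} + a_3 x_{3,2} + \dots + a_m &= y_2 \\
a_1 x_{1,3} + a_2 x_{2,3} + a_3 x_{3,3} + \dots + a_m &= y_3 \\
&\;\;\vdots \\
a_1 x_{1,n} + a_2 x_{2,n} + a_3 x_{3,n} + \dots + a_m &= y_n
\end{aligned}
\quad\Longrightarrow\quad
[x] = \begin{bmatrix}
x_{1,1} & x_{2,1} & x_{3,1} & \cdots & 1 \\
x_{1,2} & x_{2,2} & x_{3,2} & \cdots & 1 \\
x_{1,3} & x_{2,3} & x_{3,3} & \cdots & 1 \\
\vdots & \vdots & \vdots & & \vdots \\
x_{1,n} & x_{2,n} & x_{3,n} & \cdots & 1
\end{bmatrix}$$

The last column of [x], whose entries are all 1, is the unit vector.

The procedure to perform linear regression in Matlab is summarized below:
1. Input the experimental data in the m-file. For example, input the vectors {y}, {x1}, {x2}, {x3}, etc. Make sure that the vectors are column vectors. If you input the vectors as row vectors, use the transpose (See Tutorial III, p.6).
2. Create the unit column vector.
3. Create the matrix [x] by combining each of the individual column vectors and the unit vector (See Tutorial III, p. 4).
4. Apply Eq. (4) to calculate the coefficient vector {a}. These are the parameters for your equation.
5. Determine how good your fit is by:
   a. Calculate the predicted value
   b. Calculate the difference between the predicted value and the experimental value
   c. Make a table that shows the differences (experimental data, predicted value, and difference between experimental data and predicted value)
   d. Plot the experimental data (using plot, see Tutorial V.b)
   e. Plot the equation (using fplot, see Tutorial V.b)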
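Steps 1 through 5c above can be sketched in a few lines. This is only a minimal illustration under stated assumptions: the data are made up (generated exactly from y = 2*x1 + 3*x2 + 1, so the expected coefficients are known in advance), and the built-in function ones is used as one possible way to create the unit column vector. The plotting steps (5d and 5e) are left out.

```matlab
% Minimal sketch of steps 1-5c with made-up data (not from this tutorial).
% The data are generated exactly from y = 2*x1 + 3*x2 + 1, so the
% regression should recover a = [2; 3; 1] up to round-off.
x1 = [1; 2; 3; 4; 5];            % step 1: first independent variable (column vector)
x2 = [2; 1; 4; 3; 6];            % step 1: second independent variable (column vector)
y  = 2*x1 + 3*x2 + 1;            % step 1: dependent variable (column vector)

unit = ones(length(y), 1);       % step 2: unit column vector
x = [x1 x2 unit];                % step 3: matrix [x], unit vector in the last column
a = x\y;                         % step 4: Eq. (4), least-squares solution

yp  = x*a;                       % step 5a: predicted values
dev = y - yp;                    % step 5b: experimental minus predicted
disp([y yp dev]);                % step 5c: comparison table (one row per data point)
```

Because the unit vector is placed in the last column of [x], the last element of a is the constant term of the equation.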
Solved Problem using Matlab:
Develop a linear correlation to predict the final weight of an animal based on the initial weight and the amount of feed eaten.

(final weight) = a1*(initial weight) + a2*(feed eaten) + a3

The following data are given:

final weight (lb)   initial weight (lb)   feed eaten (lb)
       95                   42                  272
       77                   33                  226
       80                   33                  259
      100                   45                  292
       97                   39                  311
       70                   36                  183
       50                   32                  173
       80                   41                  236
       92                   40                  230
       84                   38                  235

Solution:
The m-file is shown below:

% This program shows an example of linear regression in Matlab
% Developed by Gerardine Botte
% Created on: 05/8/06
% Last modified on: 05/8/06
% Che-0 Spring 06
% Solution to Solved Problem, Tutorial VII
% The program calculates the best fit parameters for a correlation
% representing the final weight of animals given the initial weight
% and the amount of food eaten:
% fw=a1*initwgt+a2*feed+a3
%--------------------------------
clear;
clc;
fprintf('This program shows an example of linear regression in Matlab\n');
fprintf('Developed by Gerardine Botte\n');
fprintf('Created on: 05/8/06\n');
fprintf('Last modified on: 05/8/06\n');
fprintf('Che-0 Spring 06\n');
fprintf('Solution to Solved Problem, Tutorial VII\n');
fprintf('The program calculates the best fit parameters for a correlation\n');
fprintf('representing the final weight of animals given the initial weight\n');
fprintf('and the amount of food eaten\n');
fprintf('fw=a1*initwgt+a2*feed+a3\n');
%Step 1 of Procedure (see the procedure summary, Tutorial VII): input the data into vectors.
initwgt = [42 33 33 45 39 36 32 41 40 38]; % in lbs. (independent variable)
feed = [272 226 259 292 311 183 173 236 230 235]; % in lbs. (independent variable)
fw = [95; 77; 80; 100; 97; 70; 50; 80; 92; 84]; % in lbs (dependent variable).
%because the data is given as row vectors it needs to be transformed into column vectors
initwgt=initwgt';
feed=feed';
%Step 2 of Procedure (see the procedure summary, Tutorial VII): Create the unit column vector
for i=1:10
    unit(i)=1;
end
unit=unit';
%Step 3 of Procedure (see the procedure summary, Tutorial VII): Create the matrix [x] by combining each of
% the individual column vectors and the unit vector (See Tutorial III, p. 4)
x=[initwgt feed unit];
%Step 4 of Procedure (see the procedure summary, Tutorial VII): Apply Eq. (4) to calculate the coefficient
% vector {a}. These are the parameters for your equation.
a=x\fw;
%Make sure to bring all the vectors back into row vectors so that you can use for loops
%for printing and performing vector operations
%printing the parameters
initwgt=initwgt';
feed=feed';
a=a';
fw=fw';
fprintf('The coefficients for the regression are\n');
for i=1:3
    fprintf('a(%i)= %4.2f\n', i, a(i));
end
%you can also print the equation by using fprintf:
fprintf('fw = %4.2f * initwgt + %4.2f * feed + %4.2f\n', a(1), a(2), a(3));
%Calculating the numbers predicted by the equation and the difference
for i=1:10
    fwp(i)=initwgt(i)*a(1)+feed(i)*a(2)+a(3); %This is the predicted final weight, lbs
    dev(i)=fw(i)-fwp(i); %this is the deviation, lbs
end
%Making the comparison table:
fprintf(' \n');
fprintf('Experimental final weight   Predicted final weight   Deviation\n');
fprintf('          lbs                        lbs                 lbs\n');
fprintf(' \n');
for i=1:10
    fprintf('         %5.2f                     %5.2f              %5.2f\n', fw(i), fwp(i), dev(i));
end
fprintf(' \n');

This is what you will see on screen:

[Screenshot: Command Window output of the program]

Procedure to perform linear regression in Excel:
Excel can do single or multiple linear regression through the data analysis toolbox. This toolbox needs to be added as an add-in. To illustrate how to perform linear regression in Excel, let us solve the same problem:
1. Write your data into an Excel spreadsheet as shown below:

[Screenshot: Excel spreadsheet containing the data]
2. Load the data analysis toolbox:
   Click on Analysis ToolPak
   Press OK
3. Go to Data Analysis and find the Regression tool:
4. Click OK and you will be prompted to the Regression analysis:
   Select the Y range
   Select the X range: the range that contains all the independent variables simultaneously
5. Make the additional selections and press OK
6. This is what you will see on the screen:

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.9344
R Square             0.8732   (The closer this value is to 1, the better the fit is)
Adjusted R Square    0.8369
Standard Error       6.0508
Observations         10

ANOVA
              df    SS          MS         F         Significance F
Regression    2     1764.2157   882.1078   24.0934   0.000727
Residual      7     256.2843    36.6120
Total         9     2020.5

               Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept      -22.9932       17.7625          -1.2945   0.2366    -64.9949    19.0086     -64.9949      19.0086
X Variable 1   1.3957         0.5825           2.3958    0.0478    0.0182      2.7732      0.0182        2.7732
X Variable 2   0.2176         0.0578           3.7671    0.0070    0.0810      0.3542      0.0810        0.3542

The Coefficients column contains the fitting parameters.

RESIDUAL OUTPUT

Observation   Predicted Y   Residuals
1             94.82         0.18
2             72.24         4.76
3             79.43         0.57
4             103.36        -3.36
5             99.12         -2.12
6             67.07         2.93
7             59.32         -9.32
8             85.59         -5.59
9             82.88         9.12
10            81.18         2.82

The Predicted Y column contains the predicted values, and the Residuals column contains the difference between the experimental and the predicted value.

7. You will learn how to interpret more of the statistical results in the Experimental Design Course.
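The "R Square" and "Standard Error" entries of the Excel summary can be cross-checked in Matlab with the same backslash fit. The lines below are a sketch, not part of the original m-file: the variable names res, SSres, SStot, R2 and se are illustrative, the unit vector is created with the built-in function ones, and the standard error is taken as the square root of the residual sum of squares divided by (number of observations - number of parameters), which is the usual definition for a regression with an intercept.

```matlab
% Sketch: recompute two of the Excel summary statistics for the
% animal weight problem. Names SSres, SStot, R2, se are illustrative.
initwgt = [42 33 33 45 39 36 32 41 40 38]';            % initial weight, lb
feed    = [272 226 259 292 311 183 173 236 230 235]';  % feed eaten, lb
fw      = [95 77 80 100 97 70 50 80 92 84]';           % final weight, lb

n = length(fw);                  % number of observations
x = [initwgt feed ones(n,1)];    % matrix [x] with the unit vector as last column
a = x\fw;                        % Eq. (4): a(1), a(2) are slopes, a(3) is the intercept

res   = fw - x*a;                % residuals (experimental - predicted)
SSres = sum(res.^2);             % residual sum of squares
SStot = sum((fw - mean(fw)).^2); % total sum of squares
R2    = 1 - SSres/SStot;         % compare with "R Square"
se    = sqrt(SSres/(n - 3));     % compare with "Standard Error" (3 fitted parameters)

fprintf('R Square = %6.4f   Standard Error = %6.4f\n', R2, se);
```

In this ordering a(3) corresponds to Excel's "Intercept", a(1) to "X Variable 1" and a(2) to "X Variable 2".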