Ratio of Polynomials Search One Variable

Chapter 370

Introduction

This procedure searches through hundreds of candidate curves for the model that best fits your data. The procedure is heuristic, but it has performed well on the data we have tried.

A general class of models called the ratio of polynomials (see the previous chapter) provides a wide variety of curves to search. Normally, fitting these models is a slow, iterative process. However, a shortcut yields an approximate solution very quickly, so a large number of models can be searched in a short time. After the best-fitting model is found, use the procedure discussed in the Ratio of Polynomials Fit chapter to analyze it in detail.

For each model, various transformations of X and Y can be tried. This expands the number of models that may be tried to several hundred.

The general ratio of polynomials model fit is

$$ g(Y) = \frac{A_0 + A_1 f(X) + A_2 f(X)^2 + A_3 f(X)^3 + A_4 f(X)^4 + A_5 f(X)^5}{1 + B_1 f(X) + B_2 f(X)^2 + B_3 f(X)^3 + B_4 f(X)^4 + B_5 f(X)^5} + e $$

Here g(Y) and f(X) represent power transformations of Y and X such as LOG(X), SQRT(X), etc. The parameters A0, A1, A2, ..., B5 are constants that are estimated from the data. The value e represents the error, or residual, of an observation.

By setting some constants to zero, various simplified models are obtained. For example, if only A0 and A1 are nonzero, the familiar linear model, Y = A0 + A1X + e, is obtained.

A Shortcut

Consider the simple model

$$ Y = \frac{A_0 + A_1 X}{1 + B_1 X} + e $$

If you ignore e (set it to zero for a moment) and multiply both sides of this equation by (1 + B1X), you get Y + B1XY = A0 + A1X. Subtracting B1XY from both sides gives Y = A0 + A1X - B1XY. Finally, relabeling XY as Z gives Y = A + BX + CZ.
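The shortcut derived above can be sketched in a few lines of Python. This is a minimal illustration and not the program's own code: the function name `fit_ratio_1_1` and the noise-free test data are assumptions made for the example, and NumPy's least-squares solver stands in for "standard multiple regression". The coefficient of Z = XY is negated at the end, because the rearranged equation carries the term -B1XY.

```python
import numpy as np


def fit_ratio_1_1(x, y):
    """Approximate fit of Y = (A0 + A1*X) / (1 + B1*X) by linear least squares.

    Hypothetical helper: regress Y on [1, X, Z] with Z = X*Y, then map the
    linear coefficients (A, B, C) back to (A0, A1, B1) = (A, B, -C).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    z = x * y  # Z is a direct transformation of X and Y
    design = np.column_stack([np.ones_like(x), x, z])
    (a, b, c), *_ = np.linalg.lstsq(design, y, rcond=None)
    return a, b, -c


# Noise-free data generated from A0 = 2, A1 = 3, B1 = 0.5 (made-up values).
x = np.linspace(0.5, 5.0, 10)
y = (2.0 + 3.0 * x) / (1.0 + 0.5 * x)

a0, a1, b1 = fit_ratio_1_1(x, y)
print(a0, a1, b1)
```

With noise-free data the linearized equation holds exactly, so the original parameters are recovered up to rounding. With noisy data the estimates are only approximate, since Z contains Y and the error term was dropped.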

Note that the variable Z is a direct transformation of X and Y. This last equation is in standard linear form, so the parameters A, B, and C may be estimated using standard multiple regression. Matching terms with the original equation gives A0 = A, A1 = B, and B1 = -C.

One catch in using this procedure is that you have to assume that e is zero. When the model fits well, the values of e will be near zero. When the model does not fit well, the values of e will be relatively large and the method breaks down. However, the large values of e will warn us that the model has not fit well.

Parsimony

One of the main principles in model building is that you should never use three parameters when two will do. Hence, one of our tasks is to find a model with the fewest parameters.

A second principle in dealing with the ratio-of-polynomials model is that you should not fit a model whose numerator is of higher polynomial order than its denominator. The models tried by this program follow these rules.

A third rule is that all terms in a polynomial up to the desired order must be included. Hence, you would not use Y = A + CX^2. Instead, you would fit Y = A + BX + CX^2.

The program tries the five models having a fifth-order polynomial in the denominator. The numerator polynomials are A0 + A1X, A0 + A1X + A2X^2, ..., A0 + A1X + A2X^2 + A3X^3 + A4X^4 + A5X^5. Next, the four models having a fourth-order polynomial denominator are tried. This continues on down to the simple equation Y = (A0 + A1X)/(1 + B1X). This process is repeated for each combination of transformations specified for Y and X.

Goodness-of-Fit

The final issue is measuring how well a given model fits the data, so that the various models can be compared. This is difficult because the goodness-of-fit statistics you are familiar with (like R^2) do not have the same meaning in this setting. However, because no other general goodness-of-fit index is available, we have chosen to base our selection on the value of R^2. We justify this because this procedure is only an intermediate step in the modeling process. You must take several steps before making your final model selection.

Assumptions and Limitations

Usually, nonlinear regression is used to estimate the parameters in a nonlinear model without performing hypothesis tests. In this case, the usual assumption about the normality of the residuals is not needed. Instead, the main assumption needed is that the data may be well represented by the model.

Data Structure

The data are entered in two variables: one dependent variable and one independent variable.

Missing Values

Rows with missing values in the variables being analyzed are ignored in the calculations. When only the value of the dependent variable is missing, predicted values are generated.
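Returning to the search order described under Parsimony, the list of candidate models can be enumerated in a few lines. The sketch below is only a reading of that description, not the program's code: the function name `candidate_orders` and the transformation labels are assumptions, and the program may try forms beyond those listed in the description.

```python
from itertools import product

# Labels for the seven transformations offered for each variable
# (V stands for either Y or X; the labels are illustrative only).
TRANSFORM_LABELS = ["1/V^2", "1/V", "1/SQRT(V)", "LN(V)", "SQRT(V)", "V", "V^2"]


def candidate_orders(max_order=5):
    """Yield (N, D) pairs: denominator order from max_order down to 1,
    numerator order from 1 up to the denominator order."""
    for d in range(max_order, 0, -1):
        for n in range(1, d + 1):
            yield n, d


orders = list(candidate_orders())

# Repeat every (N, D) pair for each combination of Y and X transformations.
search = [
    (fy, fx, n, d)
    for fy, fx in product(TRANSFORM_LABELS, TRANSFORM_LABELS)
    for n, d in orders
]

print(len(orders))  # 5 + 4 + 3 + 2 + 1 = 15 model forms
print(len(search))  # 15 forms x 49 transformation pairs = 735
```

Under this reading, selecting all seven transformations for both variables gives 735 candidate fits, which is consistent with the "several hundred" models mentioned in the introduction.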

Procedure Options

This section describes the options available in this procedure.

Variables Tab

This panel specifies the variables used in the analysis.

Y (Dependent) Variable

Variable
Specifies a single dependent (Y) variable from the current database. This is the variable being predicted.

Y (Dependent) Variable: Select Y Transformations

1/Y^2, 1/Y, 1/SQRT(Y), LN(Y), SQRT(Y), Y, Y^2
Specifies whether this transformation of Y should be searched.

X (Independent) Variable

Variable
Specifies a single independent (X) variable. This is the variable used to predict Y.

X (Independent) Variable: Select X Transformations

1/X^2, 1/X, 1/SQRT(X), LN(X), SQRT(X), X, X^2
Specifies whether this transformation of X should be searched.

Zero Cutoff

Zero
This is the value used as zero by the algorithm. Because of rounding error, values lower than this value are reset to zero. If unexpected results are obtained, you might try using a smaller value, such as 1E-16. Note that 1E-5 is an abbreviation for the number 0.00001.

Reports Tab

The following options control the reports and plots output.

Select Reports

Models Reported
This option limits the number of models that are reported on. For example, if you select 20 here, the report shows the 20 best models.
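The seven transformations listed under Select Y Transformations and Select X Transformations can be written out explicitly. The following sketch is an illustration only: the dictionary `TRANSFORMS`, its keys, and the helper `transform` are hypothetical names, and the sample values are made up. Note that the logarithm, square root, and reciprocal forms are defined only for positive data.

```python
import numpy as np

# Map an illustrative label to a function of a NumPy array v,
# where v holds either the Y values or the X values.
TRANSFORMS = {
    "1/V^2": lambda v: 1.0 / v**2,
    "1/V": lambda v: 1.0 / v,
    "1/SQRT(V)": lambda v: 1.0 / np.sqrt(v),
    "LN(V)": np.log,  # natural logarithm
    "SQRT(V)": np.sqrt,
    "V": lambda v: v,  # no transformation
    "V^2": lambda v: v**2,
}


def transform(values, name):
    """Apply the named transformation to a sequence of positive values."""
    return TRANSFORMS[name](np.asarray(values, dtype=float))


sample = [1.0, 4.0]
for name in TRANSFORMS:
    print(name, transform(sample, name))
```

Each selected Y transformation is paired with each selected X transformation during the search, so the transformed columns would replace Y and X before a model is fit.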

Report Options

Precision
Specify the precision of numbers in the report. Single precision will display seven-place accuracy, while double precision will display thirteen-place accuracy. Note that all reports are formatted for single precision only.

Variable Names
Specify whether to use variable names or (the longer) variable labels in report headings.

Plots Tab

This section controls the plot(s) showing the data with the fitted function line. Click the plot format button to change the plot settings.

Select Plots

Function Plot with Actual Y
Specifies whether the plot in the actual scale of Y and X should be displayed.

Function Plot with Transformed Y
Specifies whether the plot in the transformed scale of Y and X should be displayed.

Models Plotted
This option specifies how many of the best models are plotted.

Example 1: Searching for the Best Ratio of Polynomials Model

This section presents an example of how to search for the best-fitting ratio of polynomials model. In this example, we will search for the best-fitting model using the variables Y and X of the FnReg3 dataset. We will also consider the log transformation of each variable in our search.

You may follow along here by making the appropriate entries, or load the completed template Example 1 by clicking on Open Example Template from the File menu of the procedure window.

1 Open the FnReg3 dataset.
   From the File menu of the NCSS Data window, select Open Example Data.
   Click on the file FnReg3.NCSS.
   Click Open.

2 Open the procedure window.
   Using the Analysis menu or the Procedure Navigator, find and select the Ratio of Polynomials Search One Variable procedure.
   On the menus, select File, then New Template. This will fill the procedure with the default template.

3 Specify the variables.
   In the procedure window, select the Variables tab.
   Double-click in the Y Variable box. This will bring up the variable selection window.
   Select Y from the list of variables and then click Ok.
   Double-click in the X Variable box. This will bring up the variable selection window.
   Select X from the list of variables and then click Ok.

4 Run the procedure.
   From the Run menu, select Run Procedure. Alternatively, just click the green Run button.

Search Summary Section

Model                            Current     Best        Percent
No.    F(Y)     F(X)     Model   R-Squared   R-Squared   of Best
1      y        x        3 / 4   0.990115    0.990115     100.00
2      y        x        1 / 4   0.989091    0.990115      99.90
3      y        x        2 / 4   0.989078    0.990115      99.90
4      LN(y)    x        3 / 4   0.988342    0.990115      99.82
5      y        x        0 / 5   0.987972    0.990115      99.78
6      y        x        4 / 4   0.984444    0.990115      99.43
7      LN(y)    x        0 / 5   0.984241    0.990115      99.41
8      LN(y)    x        2 / 4   0.984015    0.990115      99.38
9      LN(y)    x        1 / 4   0.983513    0.990115      99.33
10     y        x        0 / 4   0.983109    0.990115      99.29
11     LN(y)    x        0 / 4   0.977900    0.990115      98.77
12     LN(y)    LN(x)    4 / 5   0.975901    0.990115      98.56
13     y        x        1 / 5   0.975564    0.990115      98.53
14     LN(y)    x        5 / 0   0.974421    0.990115      98.41
15     y        x        2 / 5   0.972910    0.990115      98.26
16     LN(y)    x        1 / 5   0.970396    0.990115      98.01
17     LN(y)    x        4 / 0   0.967638    0.990115      97.73
18     y        LN(x)    4 / 5   0.956059    0.990115      96.56
19     y        x        5 / 0   0.948378    0.990115      95.78
20     LN(y)    LN(x)    3 / 3   0.945165    0.990115      95.46

This report displays one line for each reported model. The results are sorted by R-Squared so that the best model is displayed at the top. For this example, the best model is the ratio of a third-order numerator polynomial and a fourth-order denominator polynomial, with no transformations of Y or X needed. We would now fit this model using the Ratio of Polynomials Fit procedure.

Model No.
The ranking of the model displayed on this line.

F(Y)
The transformation (if any) applied to the Y (dependent) variable.

F(X)
The transformation (if any) applied to the X (independent) variable.

Model
The ratio of polynomials model whose results are displayed on this row. The syntax of the model statement is N/D, where N represents the order of the numerator polynomial and D represents the order of the denominator polynomial. If N or D is set to zero, that polynomial is ignored. For example, the model 1/2 means A0 + A1X in the numerator and 1 + B1X + B2X^2 in the denominator.
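To make the N/D notation concrete, the sketch below parses a model label such as "1 / 2" and evaluates the corresponding ratio of polynomials. The function names `parse_model` and `eval_ratio` and the coefficient values are hypothetical, chosen only for illustration. The sketch covers the case in which the numerator is present; how the program treats a label with N = 0 is not spelled out in the text, so that case is left aside.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def parse_model(label):
    """Split a label such as '3 / 4' into (numerator order, denominator order)."""
    n, d = (int(part) for part in label.split("/"))
    return n, d


def eval_ratio(x, a, b):
    """Evaluate (A0 + A1*x + ...) / (1 + B1*x + ...).

    a holds [A0, A1, ..., AN]; b holds [B1, ..., BD] and may be empty,
    in which case the denominator is the constant 1.
    """
    x = np.asarray(x, dtype=float)
    numerator = P.polyval(x, list(a))
    denominator = P.polyval(x, [1.0, *b])  # leading 1 is fixed, not estimated
    return numerator / denominator


n, d = parse_model("1 / 2")
a = [1.0, 2.0]   # made-up A0, A1
b = [0.5, 0.25]  # made-up B1, B2
value = eval_ratio(2.0, a, b)

print(n, d)     # numerator order 1, denominator order 2
print(value)    # (1 + 2*2) / (1 + 0.5*2 + 0.25*4) = 5/3
```

A model N/D written this way has (N + 1) + D estimated parameters, which is a convenient count to keep in mind when comparing rows of the Search Summary Section for parsimony.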

Current R-Squared
The value of pseudo R-Squared for this model and these transformations. There is no direct R-Squared defined for nonlinear regression. This is a pseudo R-Squared constructed to approximate the usual R-Squared value used in multiple regression. We use the following generalization of the usual R-Squared formula:

R-Squared = (ModelSS - MeanSS) / (TotalSS - MeanSS)

where MeanSS is the sum of squares due to the mean, ModelSS is the sum of squares due to the model, and TotalSS is the total (uncorrected) sum of squares of Y (the dependent variable).

This version of R-Squared tells you how well the model performs after removing the influence of the mean of Y. Since many nonlinear models do not explicitly include a parameter for the mean of Y, this R-Squared may be negative (in which case we set it to zero) or difficult to interpret. However, if you think of it as a direct extension of the R-Squared that you use in multiple regression, it will serve well for comparative purposes.

Best R-Squared
The pseudo R-Squared of the first (best) model.

Percent of Best
The pseudo R-Squared of this model expressed as a percentage of the pseudo R-Squared of the overall best model. Often you will be able to find models that are nearly as good as the best model but have many fewer parameters.

Function Plots

These plots show the best few models plotted in the original (on the left) and transformed (on the right) scales. They will help you determine which model (or models) you want to evaluate further using the Ratio of Polynomials Fit procedure.
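The pseudo R-Squared and Percent of Best columns defined in the report descriptions above can be computed as in the following sketch. It rests on one assumption that the text does not state: ModelSS is taken to be TotalSS minus the residual sum of squares. The function names and the small data set are hypothetical.

```python
import numpy as np


def pseudo_r_squared(y, predicted):
    """(ModelSS - MeanSS) / (TotalSS - MeanSS), with negative values set to zero.

    Assumption: ModelSS = TotalSS - sum of squared residuals.
    """
    y = np.asarray(y, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    total_ss = np.sum(y**2)              # uncorrected total sum of squares
    mean_ss = y.size * y.mean() ** 2     # sum of squares due to the mean
    error_ss = np.sum((y - predicted) ** 2)
    model_ss = total_ss - error_ss       # assumed definition of ModelSS
    r2 = (model_ss - mean_ss) / (total_ss - mean_ss)
    return max(0.0, float(r2))


def percent_of_best(current, best):
    """Express a model's pseudo R-Squared as a percentage of the best one."""
    return 100.0 * current / best


y = [1.0, 2.0, 3.0, 4.0]                              # made-up observations
perfect = pseudo_r_squared(y, y)                      # predictions equal the data
poor = pseudo_r_squared(y, [10.0, 10.0, 10.0, 10.0])  # far worse than the mean

print(perfect, poor)
print(round(percent_of_best(0.989091, 0.990115), 2))
```

The last line uses the Current and Best R-Squared values of model 2 in the Search Summary Section and reproduces the reported Percent of Best of 99.90.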