Analyzing and Interpreting Continuous Data Using JMP


Analyzing and Interpreting Continuous Data Using JMP: A Step-by-Step Guide
José G. Ramírez, Ph.D.
Brenda S. Ramírez, M.S.
Corrections to first printing.

The correct bibliographic citation for this manual is as follows: Ramírez, José G., and Brenda S. Ramírez. 2009. Analyzing and Interpreting Continuous Data Using JMP: A Step-by-Step Guide. Cary, NC: SAS Institute Inc.

Analyzing and Interpreting Continuous Data Using JMP: A Step-by-Step Guide. Copyright © 2009, SAS Institute Inc., Cary, NC, USA. ISBN 978-1-59994-488-3. All rights reserved. Produced in the United States of America.

For a hard-copy book: No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, or otherwise, without the prior written permission of the publisher, SAS Institute Inc.

For a Web download or e-book: Your use of this publication shall be governed by the terms established by the vendor at the time you acquire this publication.

U.S. Government Restricted Rights Notice: Use, duplication, or disclosure of this software and related documentation by the U.S. government is subject to the Agreement with SAS Institute and the restrictions set forth in FAR 52.227-19, Commercial Computer Software-Restricted Rights (June 1987). SAS Institute Inc., SAS Campus Drive, Cary, North Carolina 27513.

1st printing, August 2009. 2nd printing, October 2010.

SAS Publishing provides a complete selection of books and electronic products to help customers use SAS software to its fullest potential. For more information about our e-books, e-learning products, CDs, and hardcopy books, visit the SAS Publishing Web site at support.sas.com/publishing or call 1-800-727-3228.

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are registered trademarks or trademarks of their respective companies.

Chapter 2: Overview of Statistical Concepts and Ideas (p. 55)

Figure 2.17b Individual Measurements Chart for Resistance Data

In this section we wanted to provide a quick introduction to some useful graphs and summary statistics. In Chapter 3 we will show how to use them to characterize the measured performance of a material, process, or product, and in subsequent chapters we will continue to use them as appropriate. Every data analysis and formal test of significance should include a thorough exploratory data analysis (EDA) using the summary statistics and graphical displays discussed in this section. Apart from helping us identify unusual or outlying observations, EDA can help answer questions of interest, and it is a great springboard for suggesting hypotheses that can be explored with further data analysis techniques.

2.5 Quantifying Uncertainty: Common Probability Distributions

In Section 2.3 we discussed the need to consider the different aspects that make for a good study. These include the following:

- clearly defining the reasons for our study
- being detailed about our population of interest from which a sample will be selected
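The summary statistics the EDA step relies on can also be computed outside JMP. A minimal sketch using hypothetical resistance-style values (not the book's data set):

```python
import numpy as np

# Hypothetical resistance-style measurements (Ohms); not the book's data.
data = np.array([49.2, 50.1, 48.7, 51.3, 49.9, 50.5, 47.8, 50.0, 49.4, 51.0])

mean = data.mean()
median = np.median(data)
sd = data.std(ddof=1)                  # sample standard deviation
q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1                          # interquartile range

print(f"mean={mean:.2f}  median={median:.2f}  sd={sd:.2f}  IQR={iqr:.2f}")

# A simple outlier screen: points beyond 1.5*IQR of the quartiles.
outliers = data[(data < q1 - 1.5 * iqr) | (data > q3 + 1.5 * iqr)]
```

These numbers are the starting point for the graphs (histograms, box plots, individuals charts) the chapter describes.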

Chapter 2: Overview of Statistical Concepts and Ideas (p. 63)

Figure 2.19a Selecting the Formula Editor and the Normal Distribution

Once the Normal Distribution() function is in the Formula Editor, just click in the red box and type -2.48. Once you click OK, the answer appears as 0.00656912, or 0.66% (Figure 2.19b). Of course, you can just type the expression (45 - 49.86) / 1.961 inside the red box and JMP will calculate the Z-score before computing the area.

Figure 2.19b Normal Distribution for -2.48 and Probability Result
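The same area under the normal curve can be reproduced outside JMP with SciPy. Note that JMP's 0.00656912 corresponds to the rounded Z-score of -2.48; using the unrounded Z-score gives a slightly different value near 0.0066:

```python
from scipy.stats import norm

# Mean and standard deviation from the Resistance example in the text.
mu, sigma = 49.86, 1.961
x = 45.0

z = (x - mu) / sigma          # about -2.48
p = norm.cdf(z)               # area to the left of z

print(f"z = {z:.2f}, P(X < {x}) = {p:.6f}")   # roughly 0.0066
```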

Analyzing and Interpreting Continuous Data Using JMP: A Step-by-Step Guide (p. 68)

2.6.3 Tolerance Interval to Contain a Given Proportion p of the Sampled Population

A tolerance interval to contain a given proportion p of the sampled population, with a given confidence (1 - α), is given by this formula:

X̄ ± g(1-α/2, p, n) × s

This is similar to the confidence interval for the mean, but the t-quantile is replaced by the function g, which is a function of the degree of confidence (1 - α), the proportion p that the interval is going to contain, and the sample size n. In other words, the margin of error is M.O.E. = g(1-α/2, p, n) × s. The function g(1-α/2, p, n) is somewhat complicated for hand calculations (although tables are available), so it is better to let JMP evaluate it for us.

Prediction and tolerance intervals provide ways to characterize the output performance of a process by giving us a window where future measured performance values will fall, or a window where a large proportion (e.g., 95%, 99%) of the measured performance values will fall. Luckily for us, the calculations for these statistical intervals are readily available in JMP through the Distribution platform. In Chapter 3 we will show you how to calculate these intervals in the context of a problem. However, to give you a feel for them, Table 2.6a shows four statistical intervals for the Resistance data.

Table 2.6a Statistical Intervals for Resistance Data

Interval Type                                                       Confidence Level   Lower Limit (Ohms)   Upper Limit (Ohms)
Confidence interval for µ, center of the sampled population         95%                49.23                50.49
Confidence interval for σ, spread around center of the population   95%                1.61                 2.52
Prediction interval for one future observation                      95%                45.84                53.88
Tolerance interval containing 99% of population                     95%                43.55                56.16
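The first row of Table 2.6a can be reproduced from summary statistics alone. In this sketch the mean and standard deviation come from the Resistance example; the sample size n = 40 is an assumption made for illustration, not a value stated on this page:

```python
from math import sqrt
from scipy.stats import t

# Summary statistics from the Resistance example; n = 40 is an assumed,
# hypothetical sample size used only to make the sketch concrete.
mean, s, n = 49.86, 1.961, 40
alpha = 0.05

moe = t.ppf(1 - alpha / 2, n - 1) * s / sqrt(n)   # margin of error
lo, hi = mean - moe, mean + moe

print(f"95% CI for the mean: [{lo:.2f}, {hi:.2f}]")   # close to [49.23, 50.49]
```

The prediction and tolerance intervals in the other rows use wider multipliers (the prediction interval adds the variability of a single future observation; the tolerance interval uses the g function described above).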

Chapter 3: Characterizing Measured Performance (p. 89)

the one given by X̄ ± 0.6745s = [-0.9894 - 0.6745×1.4565401; -0.9894 + 0.6745×1.4565401] = [-1.972; -0.007], to see that it is close to the 5-number summary estimate.

Figure 3.1a Basic Descriptive Statistics

The process capability index, Cpk, is a measure of process capability that standardizes the distance from the mean of the process to the closest specification limit by three times an estimate of the standard deviation (variation). In other words, the Cpk is one-third the distance to the nearest specification (DNS). The farther the process average is from the specification limits in standard deviation units, the larger the DNS and the Cpk, and the more elbow room the process has to cope with upsets.

For a given set of specifications, think of the specification window (LSL to USL) as the width of a garage door and the process output (its distribution) as a car we are trying to park. A Cpk = 1 (DNS = 3) indicates that the width of the car is almost the width of the garage; any deviation from the center will scratch the sides of the car! It is desirable, then, to have large values of Cpk. A Cpk ≥ 1.5 is considered good, in the sense that the car has enough room on each side and it will not get scratched even if we deviate a little from the center of the garage. In other words, a Cpk ≥ 1.5 gives the process enough room to shift before making lots of out-of-specification material.
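The definition above (Cpk = DNS/3, with DNS measured in standard deviation units) translates directly into a few lines of code. A sketch with made-up process values:

```python
def cpk(mean, sd, lsl, usl):
    """Process capability index: one-third the distance (in sd units)
    from the process mean to the nearest specification limit."""
    dns = min(usl - mean, mean - lsl) / sd   # distance to nearest spec
    return dns / 3

# Hypothetical centered process: mean 50, sd 2, specs 44 to 56.
# DNS = 3 standard deviations, so Cpk = 1.0 (the "tight garage" case).
print(cpk(50, 2, 44, 56))   # 1.0
```

Halving the standard deviation doubles the elbow room: the same process with sd = 1 has Cpk = 2.0.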

Chapter 4: Comparing Measured Performance to a Standard (p. 157)

standard, or greater than the standard. In statistical jargon these types of comparisons (equal to, greater than, or less than) can be described using a null (H0) and an alternative (H1) statistical hypothesis that contain statements about the average performance (population mean, µ) as compared with the standard k.

Statistics Note 4.1: Tests of significance start with a null hypothesis (H0) that represents our current beliefs, which we assume to be true but cannot prove, and an alternative hypothesis (H1) that is a statement of what the statistical test of significance sets out to prove by rejecting the null hypothesis. In what follows we will label the null hypothesis "Assume" and the alternative hypothesis "Prove" to remind us of what the test of significance sets out to do.

The three ways that we can write the null (H0) and alternative (H1) hypotheses are as follows:

1. The average performance is different from the standard.

   H0: Average Performance (µ) = Standard (k)   (Assume)
   H1: Average Performance (µ) ≠ Standard (k)   (Prove)

   This is a two-sided alternative, since we do not care about the average performance being less than or greater than the standard, just that it is different from it. The null hypothesis (H0) assumes that the average performance (population mean, µ) is equal to our standard value, k. On the other hand, the alternative hypothesis (H1) states that our average performance is not equal to our standard value, k. If our test of significance favors the alternative, then we have proven with statistical rigor that our average performance is different from the standard, k, without making any statements regarding the direction of departure from k. Two-sided alternative hypotheses do not make any claims about the direction of departure of the mean from our standard.

2. The average performance is greater than the standard.

   H0: Average Performance (µ) ≤ Standard (k)   (Assume)
   H1: Average Performance (µ) > Standard (k)   (Prove)

   In this one-sided alternative hypothesis, H1, we are interested in knowing whether our average performance is greater than k. Here, a rejection of the null hypothesis H0 will show, with statistical rigor, that the average
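Hypothesis form 2 above (prove µ > k) corresponds to a one-sided, one-sample t-test. A minimal sketch with made-up measurements (the `alternative` argument requires SciPy 1.6 or later):

```python
from scipy.stats import ttest_1samp

# Made-up measurements; suppose the standard is k = 50.
data = [52, 53, 51, 54, 52, 53]
k = 50

# H0: mu <= k (Assume)   H1: mu > k (Prove)
res = ttest_1samp(data, popmean=k, alternative="greater")

# A small p-value rejects H0, proving H1 with statistical rigor.
print(f"t = {res.statistic:.3f}, one-sided p = {res.pvalue:.4f}")
```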

Chapter 4: Comparing Measured Performance to a Standard (p. 197)

Are the thickness averages homogeneous? As we discussed in Chapter 2, it is important to verify the homogeneity assumption because our statistical analysis of the data depends on this assumption (Wheeler 2005). Can we treat the thickness average data as coming from a homogeneous population? We can use an individuals and moving range chart (Graph > Control Chart > IR) to check the homogeneity of the thickness averages. The moving range for observation 35, Figure 4.22b, is outside the upper control limit (UCL), giving us a signal that the data may not be homogeneous. When we look at the individuals chart, we notice that the last four averages, rows 35 to 38, are below the lower control limit (LCL) of 88.31 Å, which indicates that these averages are different from the rest. These four thickness averages correspond to the suspect run 10. In other words, the thickness averages are not homogeneous, and the culprit seems to be run 10.

Figure 4.22b Individual Measurements and Moving Range Chart of Oxide Thickness Data
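The control limits JMP draws on an I-MR chart come from the average moving range. A sketch of the standard calculation, using hypothetical thickness-style averages rather than the book's oxide data:

```python
import numpy as np

def imr_limits(x):
    """Control limits for an individuals and moving range (I-MR) chart,
    using the standard constants 2.66 (= 3/d2, d2 = 1.128) and 3.267 (= D4)."""
    x = np.asarray(x, dtype=float)
    mr = np.abs(np.diff(x))          # moving ranges of consecutive points
    mr_bar = mr.mean()
    center = x.mean()
    lcl = center - 2.66 * mr_bar     # individuals chart limits
    ucl = center + 2.66 * mr_bar
    mr_ucl = 3.267 * mr_bar          # moving range chart upper limit
    return lcl, ucl, mr_ucl

# Hypothetical thickness-style averages (not the book's data).
x = [90.1, 90.4, 89.8, 90.2, 90.0, 89.9, 90.3, 90.1]
lcl, ucl, mr_ucl = imr_limits(x)
```

Points outside [lcl, ucl], or moving ranges above mr_ucl, signal non-homogeneity, which is the pattern the text describes for run 10.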

Chapter 5: Comparing Two Measured Performances (p. 225)

any statements regarding the direction of departure in the two means. This is the most popular choice for many applications. A two-sided alternative hypothesis does not make any claims about the direction of departure of the means.

Sometimes we want to prove with statistical rigor that the two population means are similar, and not just assume it to be true. For example, many performance claims are stated in a way to suggest that their performance is just as good as, or no worse than, a competitor's performance. This is equivalent to flipping the two hypotheses around and putting the equality statement in the alternative: H1: Average Performance (µ1) = Average Performance (µ2). This is referred to as an Equivalence Test and is discussed in Section 5.5. Otherwise, we assume that the average performances of the two populations are equal, since we can never really prove it.

2. The average performance of population 1 is greater than that of population 2.

   H0: Average Performance (µ1) ≤ Average Performance (µ2)   (Assume)
   H1: Average Performance (µ1) > Average Performance (µ2)   (Prove)

   In this one-sided alternative hypothesis, H1, we are interested in knowing whether the average performance of population 1 is greater than the average performance of population 2. Here, a rejection of the null hypothesis, H0, will show, with statistical rigor, that the average performance of population 1 is larger than the average performance of population 2. Proving the alternative hypothesis means that we have enough evidence (proof) to demonstrate, or establish, the validity of the claims in the alternative hypothesis. The decision to write the hypothesis in this manner is application specific.

For example, if we want to prove that a second entity (represented by µ1) is superior in performance to the current one (represented by µ2) before we can use it, then we would want to set up an appropriate one-sided alternative hypothesis. For example, we might want to change suppliers for a liquid adhesive that we use in our products, but only want to consider suppliers whose adhesives have a higher adhesion bond. This lends itself to the following one-sided alternative hypothesis: H1: Avg. Bond Strength New Supplier > Avg. Bond Strength Existing Supplier.
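The adhesive example corresponds to a one-sided two-sample t-test. A sketch with made-up bond strengths (the supplier data are illustrative; `alternative` requires SciPy 1.6+, and `equal_var=False` selects Welch's test, which does not assume equal variances):

```python
from scipy.stats import ttest_ind

# Made-up adhesion bond strengths; both samples are illustrative.
new_supplier = [31.2, 32.5, 30.9, 33.1, 32.0, 31.8]
existing     = [29.4, 30.1, 29.8, 30.5, 29.9, 30.2]

# H0: mu_new <= mu_existing (Assume)   H1: mu_new > mu_existing (Prove)
res = ttest_ind(new_supplier, existing, equal_var=False, alternative="greater")

print(f"t = {res.statistic:.3f}, one-sided p = {res.pvalue:.4f}")
```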

Analyzing and Interpreting Continuous Data Using JMP: A Step-by-Step Guide (p. 378)

Table 7.3 (continued)

Model Term: β1 (Slope)
Description: The slope is of primary interest because it represents the impact that the factor has on the response, the rate of change. For every unit increase in X, Y increases (or decreases) by this many units.
Least Squares Estimator: b1 = Σ(xi − x̄)·yi / Σ(xi − x̄)²

Model Term: Model errors
Description: Considered to be a random variable representing the degree of error between the predicted and actual values for a given value of X. Assumed to be independent and normally distributed with mean = 0 and constant variance, independent of the values of X.
Least Squares Estimator: ei = yi − ŷi

The last column in Table 7.3 provides the least squares equations for estimating the parameters in a simple linear regression model.

What Is Least Squares?

Least squares is an estimation method that seeks to minimize the sum of squares of the residuals (fitting errors), namely Σ(yi − ŷi)². The best-fitting least squares line is the one that produces fitted values (ŷi) that minimize this sum of squares. Figure 7.2 shows a scatter plot of Y versus X for seven data points. In this plot the residuals (yi − ŷi) are the lines projecting from each data point to the fitted line. The squares defined by the residuals represent the squared residuals (yi − ŷi)², or each of the terms in the residual sum of squares. The least squares line in the plot is the line that minimizes the sum of the areas of the seven squares.

Figure 7.2 Least Squares Line and Residuals Squares
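The closed-form estimators in Table 7.3 can be checked with a few lines of code. A sketch using seven made-up points that lie exactly on a line:

```python
import numpy as np

def least_squares_line(x, y):
    """Closed-form least squares estimates for y = b0 + b1*x,
    matching the estimator formulas in Table 7.3."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xbar, ybar = x.mean(), y.mean()
    b1 = np.sum((x - xbar) * y) / np.sum((x - xbar) ** 2)   # slope
    b0 = ybar - b1 * xbar                                   # intercept
    return b0, b1

# Seven made-up points that lie exactly on y = 1 + 2x,
# so the residuals (and their squares) are all zero.
x = [0, 1, 2, 3, 4, 5, 6]
y = [1, 3, 5, 7, 9, 11, 13]
b0, b1 = least_squares_line(x, y)
print(b0, b1)   # 1.0 2.0
```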

Analyzing and Interpreting Continuous Data Using JMP: A Step-by-Step Guide (p. 394)

Figure 7.9 Einstein's Data: Constant Variance Check

When the data exhibit variation that is not constant, we might see a funnel shape with the opening at either end of the x-axis. For illustrative purposes, a residuals plot showing heterogeneous variance is provided in Figure 7.10. In this plot we see more spread in the studentized residuals for smaller values of the predicted response Y than we do for larger values of the predicted response Y. In other words, the variation decreases as Y increases. This could happen if, for example, it is harder to measure the response at lower values. There are methods, such as weighted least squares estimation, that enable us to deal with non-constant variance, but they are beyond the scope of this book (for examples see Sections 9.3 and 9.4 of Draper and Smith [1998]).

Analyzing and Interpreting Continuous Data Using JMP: A Step-by-Step Guide (p. 428)

the load cell. The slope = 0.0007221 means that for each 1-pound increase in load, the deflection increases by 0.0007221 inches.

Figure 7.28 Enhanced Output for Load-Deflection Calibration Study

Chapter 7: Characterizing Linear Relationships between Two Variables (p. 443)

A window into the future: Prediction Intervals

In addition to performing a thorough check of the model assumptions, including a lack-of-fit test, it is helpful to examine the quality of the predictions from our estimated quadratic model:

Deflection = 0.0085144 + 0.0007221*Load - 3.161E-9*(Load - 1575)^2   (7.15)

For a linear regression model with one factor, the Fit Y by X platform can be used to visualize the prediction error. The JMP steps and output for the Einstein data were previously presented in Section 7.3.4. These are repeated here for the load-deflection calibration study.

1. Select Analyze > Fit Y by X, highlight Deflection (in), and click Y, Response. Then select Load (lb) and click X, Factor. Finally, click OK.
2. From within the output window, click the contextual pop-up menu at the top of the scatter plot and select Fit Polynomial > 2, quadratic. The quadratic line should appear in the scatter plot, and a triangle with the label Polynomial Fit Degree = 2 should appear underneath the plot (Figure 7.39).
3. Click the triangle located right underneath the plot and select Confid Shaded Fit and Confid Shaded Indiv. The plot should now have two sets of bands surrounding the prediction line (Figure 7.39).

Figure 7.39 Creating Shaded Confidence Bands in Fit Y by X
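Equation 7.15 can be evaluated directly outside JMP. A small sketch (the coefficients are the printed, rounded values from the equation above, so predictions carry that rounding; 1575 lb is the centering value JMP used for the quadratic term):

```python
def deflection(load_lb):
    """Estimated quadratic model from equation 7.15; the quadratic
    term is centered at 1575 lb, as in the JMP output."""
    return 0.0085144 + 0.0007221 * load_lb - 3.161e-9 * (load_lb - 1575) ** 2

# At the centering value the quadratic term vanishes.
print(f"{deflection(1575):.4f} in")
```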

Analyzing and Interpreting Continuous Data Using JMP: A Step-by-Step Guide (p. 444)

The final plot is shown in Figure 7.40, and it looks almost identical to the plot shown in Figure 7.39 without the shaded confidence intervals turned on. Why? For the calibration study, the fit of the quadratic model to our data has a very small RMSE = 0.000205177, which in turn results in a very small prediction error. That is why we cannot see the bands around the predicted values.

Figure 7.40 Load-Deflection Quadratic Model with Shaded Confidence Bands

Chapter 7: Characterizing Linear Relationships between Two Variables (p. 451)

Load-Deflection Calibration Curve

Figure 7.46 shows a plot of Load as a function of Deflection. This is the calibration curve that can be used to predict the load using the measured deflection of the load cell. This curve is given by equation 7.17a, which is one of the two solutions to the quadratic model given in equation 7.15.

Figure 7.46 Load-Deflection Calibration Curve

The calibration curve is (7.18)

For a vehicle placed in the load cell, equation 7.18 gives the weight corresponding to the measured deflection. Note that for a deflection of 0, equation 7.18 gives a weight of 12.0376 lb. In theory we expect a 0-weight object to give a 0 deflection. This is an indication that the load cell might have a downward bias represented, in this case, by the intercept. In addition to the plot, we can also present a table with key deflection values and their corresponding predicted weights. Table 7.14 shows the predicted weights for a sample of deflections.
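The inverse prediction that equation 7.18 performs, solving the quadratic model for Load at a measured deflection, can be sketched numerically. This round-trip demonstration uses the printed (rounded) coefficients of equation 7.15, so it illustrates the method rather than reproducing the book's equation 7.18 exactly:

```python
import numpy as np

# Forward model (equation 7.15): deflection as a function of load.
a, b, c, center = 0.0085144, 0.0007221, -3.161e-9, 1575

def deflection(load):
    return a + b * load + c * (load - center) ** 2

def load_from_deflection(d):
    """Inverse prediction: solve c*L^2 + (b - 2*c*center)*L
    + (a + c*center**2 - d) = 0 for the load L."""
    roots = np.roots([c, b - 2 * c * center, a + c * center**2 - d])
    real = roots.real[np.abs(roots.imag) < 1e-6]
    # Keep the physically meaningful root (within the calibrated range).
    return real[(real >= 0) & (real <= 3000)][0]

d = deflection(2000.0)
print(load_from_deflection(d))   # recovers 2000.0
```

Of the two solutions of the quadratic, only one falls in the 0 to 3,000 lb calibrated range, which is why the book keeps a single branch for equation 7.18.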

Analyzing and Interpreting Continuous Data Using JMP: A Step-by-Step Guide (p. 452)

Table 7.14 Predicted Weights for Given Deflections According to Equation 7.18

Deflection (in)   Weight (lb)
0                 12.04
0.5               682.43
1                 1,381.17
1.5               2,084.26
2                 2,791.79
2.4               3,361.06

Step 7: Interpret the results and make recommendations

As we always recommend, we must translate our statistical findings in terms of the context of the situation at hand, so we can make decisions based on the outcomes of the analysis. Therefore, we should always go back to the problem statement in Step 1 and our scientific hypothesis to make sure that we answered the questions that we set out to answer with our study.

Recall that you are preparing for certification by a nationally recognized standards bureau for a new high-capacity canister-style load cell entering this market. In order to come up to speed, you reproduced an existing calibration standard for equipment and vehicles up to 3,000 lb, based on a historical study circa 1975 by a National Institute of Standards and Technology (NIST) scientist named Paul Pontius. This standard is based on establishing the relationship between a specified load and the corresponding deflection measured by the load cell.

After running a well-designed study, you derived a calibration curve, which depicts deflection (in) as a function of the load (lb). You discovered that a simple linear regression model was not adequate for your calibration curve. While the impact on the predicted deflection was relatively small, a quadratic term was added to the model not only to get a better fit to the data, but also to improve the prediction ability. To see this, you calculated the average value of the absolute errors |Y − Ŷ| under both models. Under the linear model, Average(|Y − Ŷ|) = 0.00183, while under the quadratic model, Average(|Y − Ŷ|) = 0.00016.

This represents a one-decimal-place improvement in the absolute errors, and it is smaller than the precision of the load cell, the pure error √(4.61075E−8) = 0.00021 (Figure 7.35), which is practically significant. You also established an inverse prediction equation, equation 7.18, which can be hard-coded into the load cell software to obtain the weight for any vehicle or heavy equipment appropriate for this class of load cell.
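The model comparison described above, averaging |Y − Ŷ| under the linear and quadratic fits, is easy to replicate. A sketch on synthetic curved data (illustrative values, not the load-cell measurements):

```python
import numpy as np

# Synthetic, gently quadratic response sampled without noise
# (illustrative only; not the book's load-deflection data).
x = np.linspace(0, 10, 25)
y = 0.5 + 0.7 * x - 0.03 * x**2

def mean_abs_error(deg):
    """Fit a polynomial of the given degree and return Average(|Y - Yhat|)."""
    coef = np.polyfit(x, y, deg)
    yhat = np.polyval(coef, x)
    return np.mean(np.abs(y - yhat))

mae_linear = mean_abs_error(1)
mae_quad = mean_abs_error(2)
print(mae_linear, mae_quad)   # the quadratic fit has the smaller error
```

As in the calibration study, the curvature the straight line cannot follow shows up directly in the average absolute error.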