Consumer Behavioural Buying Patterns on the Demand for Detergents Using Hierarchically Multiple Polynomial Regression Model


DOI:.7/sciintl.4.64.7

Consumer Behavioural Buying Patterns on the Demand for Detergents Using Hierarchically Multiple Polynomial Regression Model

H.J. Zainodin, Noraini Abdullah and G. Khuneswari
School of Science and Technology, Universiti Malaysia Sabah, Kota Kinabalu, Sabah, 884, Malaysia
University of Glasgow, School of Mathematics and Statistics, G 8QQ Glasgow, United Kingdom

ABSTRACT

Background: Market studies of consumer preferences for product items have shown that consumers' behavioural patterns and intentions are sources of business profit. With the advent of global business, the behavioural buying patterns of consumers have to be studied and analysed. Hence, this research illustrates the procedure for obtaining the best polynomial regression model of consumer buying patterns on the demand for detergent, including interaction variables.

Methods: The hierarchically multiple polynomial regression models involved were up to the third order, and all possible models were considered. The possible models were reduced to several selected models by progressive removal of multicollinearity source variables and elimination of insignificant variables. To enhance understanding of the whole concept, multiple polynomial regressions with eight selection criteria (8SC) were explored and presented in the process of obtaining the best model from the set of selected models.

Results: A numerical illustration on the demand for detergent is included to give a clear picture of the process of finding the best polynomial-order model. There were two single independent variables: the "price difference" between the price offered by the enterprise and the average industry price of competitors' similar detergents (in US$), and advertising expenditure (in US$).

Conclusion: The best cubic model was obtained, with its parameters estimated by the ordinary least squares method.
Key words: Hierarchically multiple polynomial regressions, ordinary least squares, coefficient test, eight selection criteria, best cubic model

Science International (): 64-7, 4

INTRODUCTION

Regression analysis allows the researcher to estimate the relative importance of independent variables in influencing a dependent variable. It also identifies a mathematical equation that describes the relationship between the independent and dependent variables. According to Sandy, simple linear regression is a technique used to describe the effect of changes in an independent variable on a dependent variable, whereas multiple regression predicts the effect on the average level of the dependent variable of a unit change in any one independent variable while the other independent variables are held constant. According to Gujarati, a model with two or more independent variables is known as a multiple regression model; the term "multiple" indicates that several independent variables are involved in the analysis and that multiple influences affect the dependent variable. According to Pasha, using too few independent variables may give a biased prediction, while using too many can cause the predicted values to fluctuate widely.

Most cost and production functions are curves with changing slopes, not straight lines with constant slopes. The polynomial model enables the estimation of such curves. In a polynomial model, independent variables can be squared, cubed or raised to any power. Published studies provide examples of squared-polynomial (quadratic) models, while others fitted a third-order polynomial model with the best R² value. The coefficients still enter the regression linearly, allowing the regression to be estimated as usual.

Corresponding Author: H.J. Zainodin, School of Science and Technology, Universiti Malaysia Sabah, Kota Kinabalu, Sabah, 884, Malaysia
For example, the polynomial model is given as:

Y = β0 + β1X + β2X² + u

Since the same independent variable appears more than once in different forms, the regressors will appear to be highly correlated. Nevertheless, the damaging effect of multicollinearity may not arise, because the independent variables are not linearly related.
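Because the polynomial model above is linear in its coefficients, it can be fitted by ordinary least squares on a design matrix whose columns are powers of X. A minimal sketch with simulated data (the values and variable names are illustrative, not the paper's):

```python
import numpy as np

# Fit the quadratic polynomial Y = b0 + b1*X + b2*X^2 + u by OLS.
# The data below are simulated for illustration only.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 30)
y = 2.0 + 1.5 * x - 0.3 * x**2 + rng.normal(0.0, 0.1, size=x.size)

# The model is linear in b0, b1, b2, so plain OLS applies.
W = np.column_stack([np.ones_like(x), x, x**2])  # design matrix [1, X, X^2]
beta, *_ = np.linalg.lstsq(W, y, rcond=None)     # least-squares coefficients
```

The recovered `beta` should be close to the simulated values (2.0, 1.5, -0.3), illustrating why a polynomial in X poses no special estimation problem.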

The main objective of this study is to obtain the best model, of some polynomial order, that represents the whole structure of the collected data, so that further analysis can be carried out. There is no unique statistical procedure for doing this, and personal judgment will be a necessary part of any of the statistical methods discussed.

METHODOLOGY

A hierarchical multiple regression model relates a dependent variable Y to several independent variables W1, W2, ..., Wk. The model has the following form:

Y = β0 + β1W1 + β2W2 + ... + β(k-1)W(k-1) + βkWk + u    (1)

Basic assumptions about the error terms u_i (for i = 1, 2, ..., n) and the values of the independent variables W1, W2, ..., Wk are shown in Table 1. Here β0 denotes the constant term of the model, βj denotes the j-th coefficient of the independent variable Wj (for j = 1, 2, ..., k) and u denotes the random residual of the model; k is the number of independent variables, so (k+1) is the total number of parameters. Each Wj can be a single independent quantitative variable, an interaction variable (first-order, second-order, third-order, ...), a generated variable (dummy, polynomial or categorical) or a transformed variable (ladder or Box-Cox transformation).

Table 1: Details of multiple regression assumptions
- Linearity of model: the model is linear in the unknown parameters βj, for j = 1, 2, ..., k.
- Variation in the data: Var(Wj) > 0 for j = 1, 2, ..., k; at least one of the observations in each independent variable is different.
- Zero conditional mean: E(u_i | W1, W2, ..., Wk) = 0 and Cov(Wj, u_i) = 0; each independent variable is uncorrelated with the random error u_i and can be treated as non-random.
- Homoscedasticity: Var(u_i | W1, W2, ..., Wk) = σ², a constant.
- Serial independence: Cov(u_s, u_i | W1, W2, ..., Wk) = 0 for all s ≠ i; the error terms are independently distributed.
- Sample size: n > k; the number of observations is greater than the number of parameters estimated.
- Normality of errors: u_i is normally distributed, given the independent variables W1, W2, ..., Wk.

The multiple regression model defined in Eq. 1 is a hierarchically well-formulated model and can be written as a system of n equations:

For i = 1:  Y1 = β0 + β1W11 + β2W21 + ... + βkWk1 + u1
For i = 2:  Y2 = β0 + β1W12 + β2W22 + ... + βkWk2 + u2
...
For i = n:  Yn = β0 + β1W1n + β2W2n + ... + βkWkn + un    (2)

This system of n equations can be expressed in matrix terms as:

Y = Wβ + u    (3)

where Y, β and u are vectors (Y and u of length n, β of length k+1) and W is the n × (k+1) matrix whose i-th row is (1, W1i, W2i, ..., Wki).

For example, the regression model for two single independent variables X1 and X2, with their interaction and polynomial variables (where X1X2 denotes the product of X1 and X2), can be expressed as:

Y = β0 + β1X1 + β2X2 + β3X1X2 + β4X1² + β5X2² + β6X1³ + β7X2³ + β8X1²X2 + u    (4)

This Eq. 4 can be written in the general form of Eq. 1 as:
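The matrix form Y = Wβ + u leads directly to the ordinary least squares estimator obtained from the normal equations, (WᵀW)β̂ = WᵀY. A small sketch with simulated data (dimensions and parameter values are assumptions for illustration):

```python
import numpy as np

# Build Y = W*beta + u for n observations, a constant and k regressors,
# then recover beta by solving the normal equations (W'W) beta = W'Y.
n, k = 30, 2
rng = np.random.default_rng(1)
W = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
true_beta = np.array([1.0, 2.0, -1.0])          # (k+1) parameters
Y = W @ true_beta + rng.normal(0.0, 0.05, n)

beta_hat = np.linalg.solve(W.T @ W, W.T @ Y)    # OLS estimate
residuals = Y - W @ beta_hat                    # sum to ~0 with a constant term
```

Because the first column of W is all ones, the normal equations force the residuals to sum to (numerically) zero, a property used later when the residuals are analysed.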

Y = β0 + β1W1 + β2W2 + β3W3 + β4W4 + β5W5 + β6W6 + β7W7 + β8W8 + u

where:
- β0 is the constant term
- W1 = X1 (single independent variable)
- W2 = X2 (single independent variable)
- W3 = X3 = X1X2 (first-order interaction variable)
- W4 = X4 = X1X1 = X1² (quadratic variable)
- W5 = X5 = X2X2 = X2² (quadratic variable)
- W6 = X6 = X1³ (cubic variable)
- W7 = X7 = X2³ (cubic variable)
- W8 = X1²X2 (interaction variable)

with k = 8 and (k+1) = 9. The variable X1X2 denotes the product of variables X1 and X2. According to Zainodin and Khuneswari, the variable X1X2 can also be written as X3 and is defined as a first-order interaction variable. In general, the cross product of q single independent quantitative variables gives a (q-1)-order interaction variable.

In the development of the mathematical model, there are four phases: possible models, selected models, best model and goodness-of-fit. First, all the possible models are listed. The next step is to estimate the coefficients of every possible model and then carry out tests to obtain the selected models; every possible model must be run, one by one. In the process of getting the selected models from the possible models, a multicollinearity test and an elimination procedure (elimination of insignificant variables) are carried out, so that the selected models are free of multicollinearity sources and insignificant variables.

Multicollinearity arises when two independent variables are closely linearly related. This is equivalent to saying that a coefficient of determination greater than 0.95 between two regressors represents a strong linear relation; an absolute correlation coefficient greater than 0.95 (i.e., |r| > 0.95) defines strong multicollinearity. The algorithm for the multicollinearity test procedure has also been described by Zainodin et al.
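The multicollinearity screen just described (flag any pair of regressors with |r| > 0.95) can be sketched as follows; the data and variable names are invented for illustration:

```python
import numpy as np

# Flag pairs of independent variables whose absolute Pearson
# correlation exceeds the 0.95 cutoff used in the paper.
rng = np.random.default_rng(2)
x1 = rng.normal(size=50)
x2 = x1 + rng.normal(0.0, 0.01, size=50)   # near-duplicate of x1
x3 = rng.normal(size=50)                   # unrelated regressor
X = np.column_stack([x1, x2, x3])

corr = np.corrcoef(X, rowvar=False)        # 3x3 correlation matrix
flagged = [(i, j)
           for i in range(X.shape[1])
           for j in range(i + 1, X.shape[1])
           if abs(corr[i, j]) > 0.95]
```

Here only the (x1, x2) pair is flagged, mirroring how a squared or interaction term built from the same variable tends to trip the screen.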
The global test is then carried out, as shown by Zainodin and Khuneswari. If there are no more multicollinearity source variables, the next step is the elimination procedure for insignificant variables, done by performing the coefficient test on all the coefficients in the model. According to Abdullah et al., the coefficient test is carried out for each coefficient by testing the coefficient of the corresponding variable against the value of zero. The insignificant variables are eliminated one at a time, and removals continue until all the insignificant variables have been eliminated. The Wald test is then carried out on all the resultant selected models to justify the removal of the insignificant variables.

The best model is selected from the set of selected models obtained in the previous phases of the model-building procedure, with the help of the eight selection criteria (8SC). The best model is the one with the smallest value on the most criteria (preferably all eight criteria choose the same model).

Finally, residual analysis is carried out on the best model to verify that the residuals are random and normally distributed. Bin Mohd et al. stated that a randomness test should be carried out to investigate the randomness of the residuals. One of the multiple regression assumptions is that the residuals follow a normal distribution; the Shapiro-Wilk test is used (for n < 50) to check this normality assumption. A scatter plot, histogram and box-plot of the residuals give a clear picture of their distribution; these plots are used as supporting evidence in addition to the two quantitative tests.

RESULTS

This section describes the process of obtaining the best polynomial regression model through a numerical illustration.
In this study, the data were collected from sale regions of the detergent FRESH during 1989, as published in Bowerman and O'Connell. The variables used relate the demand (Y) for detergent bottles to the factors affecting it: the "price difference" (X1) between the price offered by the enterprise and the average industry price of competitors' similar detergents, and advertising expenditure (X2). The aim is to analyse the contribution of each specific attribute in determining the demand.

Table 2 shows the descriptive statistics for each variable used. As stated by Crawley, the distribution of a variable is said to be normal if the skewness falls within [-0.447, 0.447] and the kurtosis falls within [-0.8944, 0.8944]. There are two single quantitative independent variables involved in this data set for determining the demand for detergent bottles.

Figure 1 shows the scatter plots of all the variables involved and their possible trends. Based on the plots, it can be seen that Y and X2 have a polynomial-shaped relationship; the fitted curve shows a quadratic relationship. The line between Y and X1 shows a linear relationship, supported by the correlation matrix (Table 4), which shows that Y and X1 are highly correlated. This indicates that a non-linear variable possibly contributes to Y.
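The skewness/kurtosis screen quoted from Crawley can be sketched as below. The sample moments are computed directly; the data are simulated (and deliberately large, so a genuinely normal sample lands comfortably inside the bands), since the paper's raw data are not reproduced here:

```python
import numpy as np

# Check whether a variable's sample skewness and excess kurtosis fall
# inside the quoted normality bands. Simulated normal data.
rng = np.random.default_rng(5)
x = rng.normal(10.0, 2.0, size=3000)

m, s = x.mean(), x.std()
skewness = np.mean((x - m) ** 3) / s**3
excess_kurtosis = np.mean((x - m) ** 4) / s**4 - 3.0

looks_normal = (-0.447 <= skewness <= 0.447) and \
               (-0.8944 <= excess_kurtosis <= 0.8944)
```

For a normal distribution both moments are zero, so a well-behaved sample should fall well inside the bands.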

[Table 2: Descriptive statistics (mean, standard error, median, mode, standard deviation, sample variance, kurtosis, skewness, minimum and maximum) for the dependent variable Y and all the independent variables, including the polynomial and interaction terms.]

[Table 3: Details of the possible models and their derivatives. The four basic possible models M1-M4 are built from X1 and X2, with and without the interaction X1X2; quadratic variables (X1², X2²) added to the earlier models give M5-M8, and cubic variables (X1³, X2³) added to models M5 and M6 give M9-M12.]

[Fig. 1: Scatter plots of all the variables involved and their possible trends.]

All the quadratic and cubic terms introduced in the models are shown in Table 3, which lists all the possible models with interaction and polynomial variables. There are four basic possible models (M1-M4) for the two single independent variables (X1 and X2); the polynomial variables, and the interactions with polynomial variables, are then added, giving twelve possible models that include quadratic and cubic variables.

Table 4 shows the Pearson correlation matrix for, as an example, the possible model M8.0. It can be seen that multicollinearity exists among 9 pairs of independent variables (the shaded values in the original table). The multicollinearity source variables are removed first, according to the Zainodin-Noraini multicollinearity remedial techniques.

Since there is a tie, the variable whose correlation with Y is 0.7 is removed, because it has the smallest correlation coefficient with the dependent variable (a case B of the Zainodin-Noraini multicollinearity remedial technique). After this removal, the model becomes M8.1 and the correlation coefficients are recalculated, as shown in Table 5. There it can be seen that multicollinearity exists among 7 pairs of independent variables. There is again a tie, and the variable whose correlation with Y is 0.876 is removed, because it has the smallest correlation coefficient with the dependent variable (again a case B). After this removal, the new model is M8.2 and the correlation coefficients are recalculated, as shown in Table 6.

[Tables 4-8: Pearson correlation matrices for models M8.0, M8.1, M8.2, M8.3 and M8.6, recalculated after each round of removals; the multicollinearity source pairs (|r| > 0.95) are shaded in the original. All correlations are significant at the 0.01 level (2-tailed).]

From Table 6 it can be seen that multicollinearity exists among 5 pairs of independent variables, with a few variables appearing most commonly among the offending pairs. Since there is a tie, the interaction variable whose correlation with Y is 0.89 is removed, because it has the smallest correlation coefficient with the dependent variable (again a case B). The new model after this removal is M8.3, and the correlation coefficients are recalculated, as shown in Table 7.
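The progressive removal loop illustrated above can be sketched as follows. This is a hedged reading of the Zainodin-Noraini technique as summarized in this paper (while any regressor pair has |r| > 0.95, drop the offender least correlated with Y); the data are synthetic:

```python
import numpy as np

def prune_multicollinearity(X, y, names, cutoff=0.95):
    """Repeatedly drop the multicollinear regressor least correlated with y."""
    X, names = X.copy(), list(names)
    while X.shape[1] > 1:
        r = np.corrcoef(X, rowvar=False)
        offenders = sorted({i for i in range(X.shape[1])
                            for j in range(X.shape[1])
                            if i != j and abs(r[i, j]) > cutoff})
        if not offenders:
            break
        # among the offenders, remove the one least correlated with y
        drop = min(offenders,
                   key=lambda i: abs(np.corrcoef(X[:, i], y)[0, 1]))
        X = np.delete(X, drop, axis=1)
        del names[drop]
    return X, names

rng = np.random.default_rng(3)
x1 = rng.normal(size=60)
x2 = x1 + rng.normal(0.0, 0.01, size=60)   # near-duplicate of x1
x3 = rng.normal(size=60)
y = 2.0 * x1 + x3 + rng.normal(0.0, 0.1, size=60)
X_clean, kept = prune_multicollinearity(
    np.column_stack([x1, x2, x3]), y, ["X1", "X2", "X3"])
```

One of the near-duplicate pair is dropped and the unrelated regressor survives, leaving a correlation matrix with no pair above the cutoff.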
From Table 7, it can be seen that multicollinearity still exists among pairs of independent variables, but there are no variables common to the offending pairs. Therefore, the variable with the smaller correlation coefficient with the dependent variable in each pair is removed: three variables are removed from model M8.3 (a case C, where the frequency of every offending variable is one). The new model is M8.6, and the correlation coefficients are recalculated, as shown in Table 8. Based on the highlighted triangle in Table 8, no multicollinearity source remains in the model.

The next step is to carry out the coefficient test of the elimination procedure on the insignificant variables, followed by the Wald test, on model M8.6. The Wald test is carried out to justify the elimination process. Only one variable is omitted from model M8.6, as shown in Table 9, giving the selected model M8.6.1. Table 10 shows the Wald test for model M8.6.1: F_cal is 0.7, and F_table = F(1, 26, 0.05) = 4.23 from the F-distribution table.
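The Wald (partial F) comparison of the restricted and unrestricted models can be sketched as below. The SSE and degrees-of-freedom values are illustrative stand-ins, not the garbled figures from Table 10:

```python
# Partial F (Wald) statistic for dropping variables from a model:
# F = [(SSE_R - SSE_U) / (df_R - df_U)] / (SSE_U / df_U).
def partial_f(sse_r, sse_u, df_r, df_u):
    numerator = (sse_r - sse_u) / (df_r - df_u)
    denominator = sse_u / df_u
    return numerator / denominator

# Illustrative values: restricted SSE 1.04 on 27 df, unrestricted 0.96 on 26 df.
f_cal = partial_f(sse_r=1.04, sse_u=0.96, df_r=27, df_u=26)
accept_removal = f_cal < 4.23   # compare with F(1, 26, 0.05)
```

If F_cal falls below the critical value, the null hypothesis that the dropped coefficients are zero is accepted, justifying the removal.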

[Table 9: Illustration of the elimination procedure giving selected model M8.6.1: the coefficients of models M8.6 and M8.6.1, with their t_cal values in parentheses, and the SSE of each model; the single insignificant coefficient in M8.6 is the one eliminated.]

[Table 10: Wald test for model M8.6.1: sums of squares, degrees of freedom and mean sums of squares for the difference (R-U), the unrestricted model (U, df 26) and the restricted model (R, df 27), giving F_cal.]

Since F_cal is less than F_table, H0 is accepted; the removal of the insignificant variables in the coefficient test is therefore justified. The selected model M8.6.1 is:

Ŷ = 6.646 + 0.4X1 + 0.5X2³

Similar procedures and tests (global test, coefficient test and Wald test) are carried out on the remaining selected models. Eight selected models are obtained in this phase. For each selected model, the values of the eight selection criteria (8SC) are computed, as shown in Table 11.

[Table 11: Corresponding selection criteria values, AIC, FPE, GCV, HQ, RICE, SCHWARZ, SGMASQ and SHIBATA, together with (k+1), for the eight selected models.]

It can be seen from Table 11 that all eight criteria indicate model M.. as the best model. Thus, the best model M.. is given by:

Ŷ = 6.698 + 0.467X1 + 0.5X2³
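Each of the eight selection criteria can be computed from a selected model's SSE, the sample size n and the parameter count p = k+1. The closed forms below follow Ramanathan's textbook formulation of the 8SC, which this paper appears to use; treat them as an assumed sketch rather than the paper's exact definitions:

```python
import math

# Eight selection criteria (8SC), Ramanathan-style closed forms (assumed).
def eight_sc(sse, n, p):
    """p = number of estimated parameters (k + 1)."""
    s = sse / n
    return {
        "AIC":     s * math.exp(2 * p / n),
        "FPE":     s * (n + p) / (n - p),
        "GCV":     s / (1 - p / n) ** 2,
        "HQ":      s * math.log(n) ** (2 * p / n),
        "RICE":    s / (1 - 2 * p / n),
        "SCHWARZ": s * n ** (p / n),
        "SGMASQ":  s / (1 - p / n),          # = SSE / (n - p)
        "SHIBATA": s * (n + 2 * p) / n,
    }

# With equal parameter counts, the model with the lower SSE wins on all eight.
a = eight_sc(sse=10.0, n=30, p=3)
b = eight_sc(sse=12.0, n=30, p=3)
```

All eight criteria penalize extra parameters while rewarding a smaller SSE; the best model is the one that attains the minimum on the most criteria.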
Table 12: Interpretation of coefficients for best model M..
- X1 (coefficient 0.467): the "price difference" between the price offered by the enterprise and the average industry price of competitors' similar detergents; the main factor. A one-unit increase in X1 increases the number of detergent bottles demanded (Y) by 0.467.
- X2³ (coefficient 0.5): advertising expenditure; the cubic factor. A one-unit increase in X2³ increases the number of detergent bottles demanded (Y) by 0.5.

Referring to Table 4, the Pearson correlations show that the "price difference" (X1) between the price offered by the enterprise and the average industry price of competitors' similar detergents is highly correlated with the demand for detergent bottles (Y) (r = 0.89). The components of model M.. further indicate the presence of a polynomial-order (cubic) significant independent variable: X2³ is also highly correlated with the demand for detergent bottles (r = 0.8949). The interpretations of the coefficients of the best model M.. are given in Table 12.

[Fig. 2: Scatter plot of standardized residuals for best model M..]

Based on the best model, the residual analyses are carried out and several goodness-of-fit tests are performed. All the basic assumptions are found to be satisfied, and the residual plots are shown in Fig. 2-4. Using the residuals obtained, the randomness test and the normality test are carried out. Both the randomness test and the residuals scatter plot, as shown in Fig. 2, indicate

[Fig. 3: Histogram of standardized residuals for best model M..]

[Fig. 4: Box-plot of residuals for best model M..]

that the residuals are random, independent and normal. The total sum of the residuals of the best model M.. is 0.7, while the sum of squared errors is 0.4. The randomness test carried out on the residuals shows that the resulting error term of the best model M.. is random and independent. This strengthens the belief reflected in the residual plot of Fig. 2, which confirms that no obvious pattern exists and that the best model M.. is an appropriate model for determining the demand for detergent. Besides that, taking ±3 standard deviations (i.e., 99.7%) for the Upper Control Limit (UCL) and Lower Control Limit (LCL), as in Fig. 2, the residuals fall between the ±3 standard deviation lines, indicating that there are no outliers. The Shapiro-Wilk statistic of the normality plot in Fig. 3 shows that the residuals of model M.. are normally distributed (statistic = 0.9865, p-value = 0.96). Figure 4, with its median positioned at the centre of the box, further shows that model M.. is a well-fitting model for describing the demand for detergents. Thus, the model is ready to be used for further analysis: the demand model can now be used for forecasting or estimation to make logical decisions in determining the appropriate demand for detergent.

DISCUSSION

Other goodness-of-fit tests could be used, such as the root mean square error (RMSE), modelling efficiency (EF) and many others. In this research, however, the residual analysis is based on the randomness and normality tests. The numerical illustration of this study shows that the polynomial regression model M.. is the best model to describe the demand for detergent.
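The randomness and normality checks described above can be sketched as follows. The residuals here are simulated stand-ins, and the sign-based runs count is a simple illustrative randomness check, not necessarily the exact test the paper used:

```python
import numpy as np
from scipy import stats

# Simulated residuals standing in for the best model's residuals.
rng = np.random.default_rng(4)
residuals = rng.normal(0.0, 1.0, size=30)

# Shapiro-Wilk normality test (suited to small samples).
sw_stat, sw_p = stats.shapiro(residuals)

# Crude randomness check: count runs of same-signed residuals.
signs = residuals > 0
runs = 1 + int(np.sum(signs[1:] != signs[:-1]))
```

A Shapiro-Wilk statistic near 1 with a large p-value gives no evidence against normality, while a runs count far from its expected value would suggest non-random residuals.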
The "price difference" (X1) between the price offered by the enterprise and the average industry price of competitors' similar detergents is the main factor in determining the demand for detergent, with advertising expenditure (X2) as the cubic factor. The best model M.. (Ŷ = 6.698 + 0.467X1 + 0.5X2³) in Table 12 shows that the constant demand, without any contribution from other factors, is approximately 7 bottles (since the constant value is 6.698). Every one-unit increase in price difference directly increases the number of detergent bottles by 0.467, and every one-unit increase in advertising expenditure positively increases it by 0.5. This means that if the price difference and expenditure each increase by one unit, then the number of detergent bottles demanded will be approximately 8.

Similar to Chu, a cubic polynomial approach is used here to forecast the demand for detergent. Kenesei and Todd had also examined the use of price in the consumer purchase decision through price awareness, with three distinct factors, namely price knowledge, price search in store and between stores, and the Dickson-Sawyer method. The effect of advertising tools on the behaviour of consumers of detergents in developing countries has also been surveyed and compared by Salavati et al. Moreover, advanced research on promoting hygiene, such as hand hygiene, with criteria like speed and timeliness, ease and skin protection, is part of the campaigns involved in advertising expenditure.
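As a quick arithmetic check of the interpretation above, the fitted demand equation can be evaluated directly. The function below hard-codes the reported coefficients and reads the two regressors as X1 and the cube of X2, per Table 12 (that reading is an assumption, since subscripts were lost in the extraction):

```python
# Fitted demand (bottles) as reported for the best model:
# Y_hat = 6.698 + 0.467*X1 + 0.5*(X2^3). The cubic term is treated as a
# single regressor, so "one unit" means one unit of X2 cubed (assumption).
def demand(x1, x2_cubed):
    return 6.698 + 0.467 * x1 + 0.5 * x2_cubed

baseline = demand(0.0, 0.0)      # constant demand, about 7 bottles
increased = demand(1.0, 1.0)     # one-unit increase in each regressor
```

Evaluating at zero gives roughly 7 bottles, and a one-unit increase in each regressor gives roughly 8, matching the discussion's approximate figures.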

CONCLUSIONS

The consumer buying behaviour will affect the demand for detergents. In today's diverse global market, it is imperative that a firm can identify the consumer behavioural attributes, needs, lifestyles, purchase processes and the influencing factors that are responsible for consumer decisions. Hence, devising good marketing plans is necessary for a firm: while serving its targeted markets, it minimises dissatisfaction and still stays ahead of its competitors. It was seen from the best model that there is a conclusive increase in production due to price difference and expenditure. In other words, a detergent manufacturing company has to project its marketing and production plans to incorporate the impacts of pricing and expenditure on the consumers. Indirectly, the firm then succeeds in meeting the demands of the consumers in a win-win relationship.

REFERENCES

1. Sandy, R., 99. Statistics for Business and Economics. McGraw-Hill Inc., New York, USA.
2. Gujarati, D.N., 6. Essentials of Econometrics. 3rd Edn., McGraw-Hill Inc., New York, USA.
3. Pasha, G.R., . Selection of variables in multiple regression using stepwise regression. J. Res. Bahauddin Zakariya Univ. Multan Pak., : 9-7.
4. Smith, R.T. and R.B. Minton, 7. Calculus: Early Transcendental Functions. 3rd Edn., McGraw-Hill Inc., New York, USA. ISBN: 9787758.
5. Beach, E.F., 95. The use of polynomials to represent cost functions. Rev. Econ. Stud., 6: 58-69.
6. Wackerly, D.D., W. Mendenhall and R.L. Scheaffer, . Mathematical Statistics with Applications. 6th Edn., Thomson Learning, South Western Ohio, USA.
7. Al-Hazza, M.H.F., E.Y.T. Adesta, A.M. Ali, D. Agusman and M.Y. Suprianto, . Energy cost modeling for high speed hard turning. J. Applied Sci., : 578-584.
8. Fonseca, G.G., P.S. da Silva Porto and L.A. de Almeida Pinto, 6. Reduction of drying and ripening times during the Italian type salami production. Trends Applied Sci. Res., : 54-5.
9. Jaccard, J., . Interaction Effects in Logistic Regression. SAGE Publications, London, UK. ISBN-: 9787697.
10. Zainodin, H.J. and G. Khuneswari, 9. A case study on determination of house selling price model using multiple regression. Malaysian J. Math. Sci., : 7-44.
11. Zainodin, H.J., A. Noraini and S.J. Yap, . An alternative multicollinearity approach in solving multiple regression problem. Trends Applied Sci. Res., 6: 4-55.
12. Abdullah, N., Z.H.J. Jubok and J.B.N. Jonney, 8. Multiple regression models of the volumetric stem biomass. WSEAS Trans. Math., 7: 49-5.
13. Ramanathan, R., . Introductory Econometrics with Application. 5th Edn., Harcourt College Publishers, South Western. ISBN: 978486, Pages: 688.
14. Bin Mohd, I., S.C. Ningsih and Y. Dasril, 7. Unimodality test for global optimization of single variable functions using statistical methods. Malaysian J. Math. Sci., : 5-5.
15. Bowerman, B.L. and R.T. O'Connell, 99. Linear Statistical Models and Applied Approach. 2nd Edn., PWS-Kent Publishing Company, Boston.
16. Crawley, M.J., 5. Statistics: An Introduction Using R. John Wiley and Sons Ltd., England, pp: 55-85.
17. Mohammadi, A., S. Rafiee, A. Keyhani and Z. Emam-Domeh, 9. Moisture content modeling of sliced kiwifruit (cv. Hayward) during drying. Pak. J. Nutr., 8: 78-8.
18. Meisami-Asl, E., S. Rafiee, A. Keyhani and A. Tabatabaeefar, 9. Mathematical modeling of moisture content of apple slices (Var. Golab) during drying. Pak. J. Nutr., 8: 84-89.
19. Chu, F.L., 4. Forecasting tourism demand: A cubic polynomial approach. Tourism Manage., 5: 9-8.
20. Kenesei, Z. and S. Todd, . The use of price in the purchase decision. J. Empirical Generalisations Market. Sci., 8: -.
21. Dickson, P.R. and A.G. Sawyer, 99. The price knowledge and search of supermarket shoppers. J. Market., 54: 4-5.
22. Salavati, A., R. Shafei and M.Y. Mazhari, . Surveying and comparing the effect of advertising tools on the behavior of consumers of detergents (case study in developing countries). Eur. J. Soc. Sci., 7: 96-8.
23. Pittet, D., . Improving adherence to hand hygiene practice: A multidisciplinary approach. Emerging Inf. Dis., 7: 4-4.