How do I run a nonlinear multiple regression with XLSTAT?
An Excel sheet with both the data and the results can be downloaded by clicking here. The data come from an experiment in which two components are added to a yogurt. The purpose is to study the effect of the concentrations of the two components, C1 and C2, on the viscosity of the yogurt. The model that we want to fit is written:
F(C1, C2) = pr5 / (1+Exp(-pr1-pr2*C1-pr3*C2-pr4*C1*C2))
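pr1, ..., pr5 are the parameters of the model that we want to estimate. This logistic-like model takes into account both the concentrations of the two components (through pr2 and pr3) and the interaction between them (through pr4). pr1 is a constant term, and pr5 is the value that F approaches when the exponential term vanishes.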
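To make the formula concrete, here is a minimal Python sketch of the same function. The name viscosity_model and the parameter values are illustrative assumptions and are not taken from the tutorial data.

```python
import math

def viscosity_model(c1, c2, pr1, pr2, pr3, pr4, pr5):
    """Logistic-like surface: pr5 / (1 + exp(-pr1 - pr2*c1 - pr3*c2 - pr4*c1*c2))."""
    z = pr1 + pr2 * c1 + pr3 * c2 + pr4 * c1 * c2
    return pr5 / (1.0 + math.exp(-z))

# Hypothetical parameter values, chosen only to illustrate the shape of the model.
params = (0.0, 1.0, 1.0, 0.5, 10.0)

# With both concentrations at 0 and pr1 = 0, the exponent is 0 and F = pr5 / 2.
half_height = viscosity_model(0.0, 0.0, *params)

# For a large positive exponent, F approaches pr5.
near_plateau = viscosity_model(100.0, 0.0, *params)

print(half_height, near_plateau)
```

The pr4 term multiplies C1*C2, so it only contributes when both components are present, which is what lets the model capture an interaction.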
After opening XLSTAT, select the XLSTAT/Modeling data/Nonlinear regression command, or click on the corresponding button of the "Modeling Data" toolbar (see below).
Once you've clicked on the button, the nonlinear regression dialog box appears. Select the data on the Excel sheet. The "Dependent variable" (or response variable) is in our case the "Viscosity". The quantitative explanatory variables are the concentrations of the two components, C1 and C2. As we selected the column headers, we left the "Variable labels" option activated. We left the "Residuals" option activated as well, because we want to analyze the predictions and the residuals.
In the "Options" tab we set the initial values of the five parameters.
In the "Functions" tab, the various functions are displayed. The function we want to use is not listed in the "Preprogrammed functions" (although the univariate version of the function is in the list), so we needed to enter the model ourselves: we first clicked on "Add", entered the function, checked "Derivatives", and then selected the derivatives on the Excel sheet. To add this function to the user functions library, we clicked on "Save". The function is then automatically added and selected.
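As an illustration of what the derivatives look like, the sketch below assumes that they are the partial derivatives of F with respect to each of the five parameters, and checks the analytic expressions against central finite differences. The function names, the parameter values and the test point are hypothetical.

```python
import math

def model(c1, c2, p):
    """F(C1, C2) with p = [pr1, pr2, pr3, pr4, pr5]."""
    z = p[0] + p[1] * c1 + p[2] * c2 + p[3] * c1 * c2
    return p[4] / (1.0 + math.exp(-z))

def gradient(c1, c2, p):
    """Analytic partial derivatives of F with respect to pr1..pr5."""
    z = p[0] + p[1] * c1 + p[2] * c2 + p[3] * c1 * c2
    s = 1.0 / (1.0 + math.exp(-z))      # logistic part, so F = pr5 * s
    common = p[4] * s * (1.0 - s)       # dF/dz, shared by pr1..pr4
    return [common, common * c1, common * c2, common * c1 * c2, s]

def numeric_gradient(c1, c2, p, h=1e-6):
    """Central finite differences, used only as a sanity check."""
    grads = []
    for i in range(len(p)):
        up, down = list(p), list(p)
        up[i] += h
        down[i] -= h
        grads.append((model(c1, c2, up) - model(c1, c2, down)) / (2.0 * h))
    return grads

# Hypothetical parameter values and concentrations, for illustration only.
p = [0.1, 0.5, -0.3, 0.2, 10.0]
analytic = gradient(1.5, 2.0, p)
numeric = numeric_gradient(1.5, 2.0, p)
max_gap = max(abs(a - n) for a, n in zip(analytic, numeric))
print(max_gap)
```

A check of this kind is a cheap way to catch a sign or term error before typing the derivative formulas into the sheet.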
The computations begin once you have clicked on the "OK" button. The results will then be displayed.
Interpreting the results of a nonlinear multiple regression
The first table gives the basic statistics of the selected variables.
The second table (see below) displays the goodness of fit coefficients, including the R² (coefficient of determination) and the SSE (sum of squares of errors), the latter being the criterion used for the model optimization. The R² corresponds to the % of the variability of the dependent variable (the viscosity) that is explained by the explanatory variables (the concentrations C1 and C2). The closer to 1 the R² is, the better the fit.
In our case, 99% of the variability is explained by the two variables and their interaction, which is an excellent result and indicates that the selected model fits the data well.
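For readers who want to reproduce the two statistics by hand, the sketch below assumes the usual definitions, SSE as the sum of squared residuals and R² = 1 - SSE/SST. The observed and predicted values are made up for illustration and are not the tutorial data.

```python
def sse(observed, predicted):
    """Sum of squared differences between observed and predicted values."""
    return sum((y - f) ** 2 for y, f in zip(observed, predicted))

def r_squared(observed, predicted):
    """1 - SSE/SST, where SST is the total sum of squares around the mean."""
    mean_y = sum(observed) / len(observed)
    sst = sum((y - mean_y) ** 2 for y in observed)
    return 1.0 - sse(observed, predicted) / sst

# Made-up values, for illustration only.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

fit_sse = sse(observed, predicted)
fit_r2 = r_squared(observed, predicted)
print(fit_sse, fit_r2)
```

Here each residual is 0.5 in absolute value, so the SSE is 1, the total sum of squares is 20, and the R² is about 0.95.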
The next table shows the results for the model parameters. As we can see, the ratios (parameter)/(std deviation) are larger for pr5 and pr4. As the ratio for pr4, the coefficient of the C1*C2 interaction term, is larger than the ratios for pr2 and pr3, we deduce that the interaction between the two components has a greater effect on the viscosity than the concentrations themselves.
The following chart lets you visualize the quality of the fit by comparing the predicted values to the observed values.