How do I run a multinomial goodness of fit test with XLSTAT?
The XLSTAT multinomial goodness of fit test compares the observed frequencies of the categories of a qualitative variable (or a discretized quantitative variable) with the expected frequencies. It is called a multinomial goodness of fit test because it is based on the multinomial distribution.
When the theoretical frequencies of the categories of a qualitative variable are known, we can use observed data to test the following null hypothesis: the distribution is not different from what is expected.
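To make the comparison concrete, here is a minimal Python sketch of the statistic behind such a test, assuming the usual Pearson Chi-square form (the sum over categories of (observed - expected)² / expected). The counts and proportions are made up for illustration and are not the tutorial's data; the function name is hypothetical and is not part of XLSTAT.

```python
# Hypothetical data for a 4-category variable (NOT the tutorial's survey data).
observed = [18, 55, 90, 37]                 # observed frequencies, n = 200
expected_props = [0.10, 0.25, 0.45, 0.20]   # theoretical proportions, sum to 1


def chi_square_statistic(observed, expected_props):
    """Pearson Chi-square statistic for a goodness of fit test."""
    if abs(sum(expected_props) - 1.0) > 1e-9:
        raise ValueError("expected proportions must sum to 1")
    n = sum(observed)
    # Expected frequency of each category is n * p.
    return sum((o - n * p) ** 2 / (n * p)
               for o, p in zip(observed, expected_props))


statistic = chi_square_statistic(observed, expected_props)
degrees_of_freedom = len(observed) - 1  # number of categories minus one
print(f"chi-square = {statistic:.3f}, df = {degrees_of_freedom}")
```

The larger the statistic, the further the observed frequencies are from the expected ones.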
We use a simple example based on the distribution of occupation status in France. Occupation status is divided into 8 categories (Farmers / Self-employed professionals / Professionals, managers, and intellectual professions / Office worker / Clerks / Workers / Inactive having worked / Other not working). We want to compare the frequencies obtained in a survey of 560 observations with the national-level statistics. An Excel sheet with both the data and the results can be downloaded by clicking here.
Once XLSTAT-Pro is activated, select the "XLSTAT/Parametric tests/Multinomial Goodness of Fit test" command, or click on the corresponding button of the "Parametric test" menu (see below).
Once you have clicked the button, the dialog box appears. Select the data on the Excel sheet: select the column containing the frequencies observed in the survey in the “Frequencies” box, and then select the proportions in the French population in the “Expected proportions” box. As the variable names are included in the first row of the selection, leave the "Column labels" option checked. Then activate the "Chi-square test" and the “Monte Carlo method” options. For more details on the statistical methods, please refer to the XLSTAT help.
After you have clicked on the OK button, the results are displayed on a new Excel sheet (because the Sheet option has been selected for outputs).
Interpreting the results of a multinomial goodness of fit test
The first table displays the Chi-square statistic, the critical Chi-square value, the number of degrees of freedom, and the corresponding p-value. The p-value is lower than 0.0001: if the null hypothesis were true, a Chi-square statistic at least as large as the one observed would occur with a probability below 0.0001. The risk of rejecting the null hypothesis while it is true is therefore very low, and we can safely reject the hypothesis that there is no difference between the observed and theoretical frequencies. We conclude that our sample does not match the population proportions. A method such as raking can be applied to the sample to bring it closer to the population.
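As an illustration of how a p-value of this kind relates to the statistic and the degrees of freedom, the following Python sketch evaluates the upper tail of the Chi-square distribution with closed-form sums from the standard library. The statistic used here is a hypothetical value, not the one reported by XLSTAT, and the helper name chi2_sf is our own.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) of a Chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j < df/2} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite sum of (df - 1) / 2 terms.
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for j in range((df - 1) // 2):
        total += term
        term *= x / (2 * j + 3)
    return total


statistic = 52.3   # hypothetical Chi-square value, not the tutorial's result
df = 8 - 1         # 8 occupation categories minus one
alpha = 0.05       # significance level chosen for this sketch

p_value = chi2_sf(statistic, df)
reject_null = p_value < alpha
print(f"p-value = {p_value:.2e}, reject null hypothesis: {reject_null}")
```

Rejecting when the p-value is below the significance level is equivalent to rejecting when the statistic exceeds the critical Chi-square value for the same level and degrees of freedom.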
The next table shows the results of the Monte Carlo method. They lead to the same conclusion as the standard Chi-square test.
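The idea behind a Monte Carlo version of the test can be sketched in Python as follows. This assumes a simple scheme: draw many samples of the same size from the multinomial distribution defined by the expected proportions, and count how often the simulated statistic is at least as large as the observed one. XLSTAT's exact procedure may differ; the data, the function names, and the number of simulations are illustrative assumptions.

```python
import random


def chi_square_statistic(observed, expected_props):
    """Pearson Chi-square statistic for a goodness of fit test."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p)
               for o, p in zip(observed, expected_props))


def monte_carlo_p_value(observed, expected_props, n_simulations=2000, seed=0):
    """Estimate the p-value by simulating samples under the null hypothesis."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = sum(observed)
    categories = range(len(expected_props))
    observed_stat = chi_square_statistic(observed, expected_props)
    at_least_as_extreme = 0
    for _ in range(n_simulations):
        # One multinomial sample: n draws with the expected proportions.
        draws = rng.choices(categories, weights=expected_props, k=n)
        simulated = [0] * len(expected_props)
        for category in draws:
            simulated[category] += 1
        if chi_square_statistic(simulated, expected_props) >= observed_stat:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate strictly positive.
    return (at_least_as_extreme + 1) / (n_simulations + 1)


# Hypothetical data (NOT the tutorial's survey data).
observed = [40, 60, 70, 30]
expected_props = [0.10, 0.25, 0.45, 0.20]
p_value = monte_carlo_p_value(observed, expected_props)
print(f"Monte Carlo p-value = {p_value:.4f}")
```

Because the simulated p-value does not rely on the Chi-square approximation, it is a useful check when some expected frequencies are small.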
This tutorial has shown that XLSTAT makes it easy to compare the expected and observed frequencies of the categories of a qualitative variable.