How do I run a penalty analysis with XLSTAT-MX?

Penalty analysis is a method used in sensory data analysis to identify potential directions for the improvement of products, on the basis of surveys performed on consumers or experts. Two types of data are used:

  • Preference data (or liking scores) that correspond to a global satisfaction index for a product (for example, liking scores on a 9-point scale for a chocolate bar), or for a characteristic of a product (for example, the comfort of a car rated from 1 to 10).
  • Data collected on a 5-point JAR (Just About Right) scale. These correspond to ratings ranging from 1 to 5 for one or more characteristics of the product of interest: 1 corresponds to « Not enough at all », 2 to « Not enough », 3 to « JAR » (Just About Right), the ideal for the consumer, 4 to « Too much » and 5 to « Far too much ». For example, for a chocolate bar one can rate the bitterness, and for the comfort of a car, the sound volume of the engine.

The method, based on multiple comparisons such as those used in ANOVA, consists in identifying, for each characteristic studied on the JAR scale, whether the ratings on the JAR scale are associated with significantly different liking scores.

An Excel sheet containing both the data and the results for use in this tutorial can be downloaded by clicking here. The data used in this example come from a survey in which a given brand/type of potato chips was evaluated by 150 consumers. Each consumer rated four attributes (Saltiness, Sweetness, Acidity, Crunchiness) on a JAR scale (1 to 5), and then gave an overall liking score on a 1-10 Likert scale. Our goal is to identify some possible directions for the development of a new product.

After opening XLSTAT, select the XLSTAT/XLSTAT-MX/Penalty analysis command, or click on the corresponding button of the "XLSTAT-MX" toolbar (see below).


Once you've clicked on the button, the dialog box appears. We select the liking scores, and then the JAR data. The labels for the 3-level JAR scale are also selected, because they make the results easier to interpret.


In the "Options" tab, we define the threshold of the sample size below which the comparison tests won't be performed.


The following output options have been selected. The Spearman correlation was chosen because the data are ordinal.


The computations begin once you have clicked on "OK". The results will then be displayed.

Interpreting the results of a penalty analysis

The first results are the descriptive statistics for the liking data and the various JAR variables. The correlation matrix is then displayed.


The correlations between the liking and JAR variables should not be interpreted in the usual way, because the JAR ratings are not true ordinal data: on the JAR scale a 5 is worse than a 3, whereas on the liking scale a 5 is better than a 3. However, if the correlation between a JAR variable and a liking variable is significantly different from 0, that could mean that the JAR variable has a low impact on the liking: if it had a strong impact, with departures from JAR in either direction lowering the liking, the correlation should ideally be 0. If the "too much" cases have a lower impact than the "too little" cases, the correlation might be positive; in the opposite situation it might be negative.
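To illustrate why a strong, symmetric JAR effect tends to give a correlation close to 0, here is a minimal Python sketch that computes a Spearman correlation by hand. The function names and the ratings are made up for illustration; they are not XLSTAT output, and XLSTAT's own computation may differ in details.

```python
from math import sqrt
from statistics import mean

def average_ranks(values):
    """Rank the values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical data: one consumer per JAR rating.
jar = [1, 2, 3, 4, 5]
liking_symmetric = [3, 6, 9, 6, 3]   # both extremes hurt liking equally
liking_asymmetric = [2, 4, 9, 8, 7]  # "too little" hurts more than "too much"

rho_symmetric = spearman(jar, liking_symmetric)
rho_asymmetric = spearman(jar, liking_asymmetric)
print(round(rho_symmetric, 3), round(rho_asymmetric, 3))
```

In the symmetric case the two halves of the JAR scale cancel out and the correlation vanishes, even though the attribute clearly matters. In the asymmetric case, where "too much" is penalized less than "too little", the correlation comes out positive.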

The next table is a summary of the JAR data. The chart that follows is based on that table and makes it easy to see how the JAR scores are distributed for each dimension.



The data are then aggregated into a 3-level scale. The corresponding frequency table and chart are displayed below.
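The aggregation step can be sketched in a few lines of Python. The mapping used here (1 and 2 become "Too little", 3 stays "JAR", 4 and 5 become "Too much") is an assumption based on the scale described earlier, and the ratings are made up for illustration.

```python
from collections import Counter

LEVELS = ("Too little", "JAR", "Too much")

def collapse_jar(score):
    """Map a 5-point JAR rating to one of the 3 collapsed levels (assumed mapping)."""
    if score not in (1, 2, 3, 4, 5):
        raise ValueError(f"JAR rating must be an integer from 1 to 5, got {score!r}")
    if score < 3:
        return "Too little"
    if score == 3:
        return "JAR"
    return "Too much"

def collapsed_frequencies(scores):
    """Return {level: (count, percentage)} for the 3 collapsed levels."""
    counts = Counter(collapse_jar(s) for s in scores)
    n = len(scores)
    return {level: (counts[level], 100.0 * counts[level] / n) for level in LEVELS}

# Hypothetical saltiness ratings from 10 consumers.
saltiness = [1, 2, 3, 3, 3, 4, 5, 3, 2, 3]
table = collapsed_frequencies(saltiness)
for level, (count, pct) in table.items():
    print(f"{level:<10} {count:>3} {pct:5.1f}%")
```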



The next table corresponds to the penalty analysis.


The following information is displayed for each JAR dimension:
- The name of the JAR dimension.
- The 3 collapsed levels of the JAR data.
- The frequencies corresponding to each level.
- The % corresponding to each level.
- The sum of the liking scores corresponding to each level.
- The average liking for each level.
- The mean drops for the "too much" and "too little" levels. Each mean drop is the difference between the mean liking for the JAR level and the mean liking for the "too much" or "too little" level. This information is interesting because it shows how many points of liking you lose when a consumer finds the product "too much" or "too little".
- The standardized differences, which are intermediate statistics that are then used for the comparison tests.
- The p-values correspond to the comparison test of the mean for the JAR level and the means for the two other levels (this is a multiple comparison with 3 groups).
- An interpretation is then automatically provided, and depends on the selected significance level (here 5%).
- The penalty is then computed. It is a weighted difference between the means (mean liking for the JAR level minus mean liking for the two other levels taken together). This statistic has given its name to the method. It shows how many points of liking you lose when the product is not as the consumer expects.
- The standardized difference is an intermediate statistic that is then used for the comparison test.
- The p-value corresponds to the comparison test of the mean for the JAR level with the mean of the other levels. This is equivalent to testing if the penalty is significantly different from 0 or not.
- An interpretation is then automatically provided, and depends on the selected significance level (here 5%).
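The mean drops and the penalty listed above can be sketched as follows. This is a minimal illustration of the definitions only: the function name, the level labels and the data are made up, and the standardized differences and p-values that XLSTAT reports are not reproduced here.

```python
from statistics import mean

def penalty_summary(liking, levels):
    """Compute the mean drops and the penalty for one JAR dimension.

    `levels` holds the collapsed JAR level of each consumer. This sketch
    assumes all three levels contain at least one consumer.
    """
    groups = {"Too little": [], "JAR": [], "Too much": []}
    for score, level in zip(liking, levels):
        groups[level].append(score)

    jar_mean = mean(groups["JAR"])
    # The two non-JAR levels pooled together, so larger groups weigh more.
    non_jar = groups["Too little"] + groups["Too much"]
    return {
        "mean_drop_too_little": jar_mean - mean(groups["Too little"]),
        "mean_drop_too_much": jar_mean - mean(groups["Too much"]),
        "penalty": jar_mean - mean(non_jar),
    }

# Hypothetical liking scores and collapsed JAR levels for 8 consumers.
liking = [4, 5, 3, 8, 9, 7, 8, 6]
levels = ["Too little", "Too little", "Too little",
          "JAR", "JAR", "JAR", "JAR", "Too much"]

summary = penalty_summary(liking, levels)
print(summary)
```

With these made-up numbers the penalty (3.5) is the average of the two mean drops (4 and 2) weighted by the group sizes (3 and 1), which is what "weighted difference" means in the description of the penalty.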

For the saltiness dimension, we see that the customers strongly penalize the product when they consider it not salty enough. Both mean drops are significantly different from 0, and so is the overall penalty.

For the sweetness and acidity dimensions, none of the tests is significant. The results for sweetness are the furthest from significance.

For the crunchiness, the mean drops test could not be computed for the "too much" level because the % of cases in this level is lower than the 20% threshold set earlier. When the product is not crunchy enough, it is heavily penalized.

The next two charts summarize the results described above. A red bar means the difference is significant, a green bar means the difference is not significant, and a grey bar means the test was not computed because there were not enough cases.
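The colour rule for the bars, combined with the sample size threshold, can be written as a small decision function. This is only a sketch: the function name is hypothetical, the defaults (5% significance level, 20% threshold) are the values used in this tutorial, and treating "significant" as a p-value strictly below the significance level is an assumption.

```python
def bar_color(p_value, pct_cases, alpha=0.05, min_pct=20.0):
    """Return the bar colour for one level of a JAR dimension.

    p_value   -- p-value of the comparison test, or None if it was not computed
    pct_cases -- percentage of consumers falling in this level
    """
    if p_value is None or pct_cases < min_pct:
        return "grey"   # not enough cases: the test is skipped
    if p_value < alpha:
        return "red"    # significant difference
    return "green"      # difference not significant

# Hypothetical p-values and percentages, for illustration only.
print(bar_color(0.001, 35.0))  # enough cases, small p-value
print(bar_color(0.40, 30.0))   # enough cases, large p-value
print(bar_color(None, 8.0))    # too few cases, test not computed
```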



