# How can I generate many distributions in a simulation model efficiently by copying?

An Excel sheet containing both the data and the results for use in this tutorial can be downloaded by clicking here.

Simulation models

Simulation models allow to obtain information, such as mean or median or confidence intervals, on variables that do not have an exact value, but for which we either know or assume a distribution. If some "result" variables depend on these "distributed" variables by the way of known or assumed formulae, then the "result" variables will also have a distribution. XLSTAT-Sim allows you to define the distributions, and then to obtain, through simulations, an empirical distribution of the input and output variables as well as the corresponding statistics.

Simulation models are used in many areas such as finance and insurance, medicine, oil and gas prospecting, accounting, or sales prediction.

Four elements are involved in the construction of a simulation model:

- Distributions are associated to random variables. XLSTAT gives a choice of more than 30 distributions to describe the uncertainty on the values that a variable can take. For example, you can choose a triangular distribution if you have a quantity for which you know it can vary between two bounds, but with a value that is more likely (a mode). At each iteration of the computation of the simulation model, a random draw is performed for each distribution that has been defined.

- Scenario variables allow to include in the simulation model a quantity that is fixed in the model, except during the tornado analysis where it can vary between two bounds.

- Result variables correspond to outputs of the model. They depend either directly or indirectly, through one or more Excel formulae, on the random variables to which distributions have been associated and, if available, on the scenario variables. The goal of computing the simulation model is to obtain the distribution of the result variables.

- Statistics allow to track a given statistic for a result variable. For example, we might want to monitor the standard deviation of a result variable.

A correct model should comprise at least one distribution and one result variable. Models can contain any number of these four elements. A model can be limited to a single Excel sheet or can use a whole Excel folder.

In this tutorial a simulation model is created that simulates the payment of interest of a in fine loan over 5 years and uses the interest rate of the first year to calculate the net profit value. During this tutorial the copy of distribution cells using the normal Excel copy and paste function is introduced.

Our simulation model deals with the payments of interest of a in fine loan. The interest are calculated during 5 years. At the end the net profit value at the initial time is calculated using the Excel function NPV and the interest rate of the first year and the payments of interest during the 5 years. The interest rate is support to be equally distributed between 3.5% and 5.5%. The capital of the in fine loan is 10000 Euro.

Starting with a static model using a mean interest rate of 4.5%. The net profit value is calculated to 1975 Euro in that case.

This model can be found in the sheet Model.

In the following we use relative references to copy the distributions correctly. Please verify in the XLSTAT-Sim Options, that the corresponding option has been selected.

Choose the first distribution variable B6 that is interest rate for 2008 as active cell.

Once XLSTAT is activated, select the XLSTAT/ XLSTAT-SIM /Define distribution command, or click on the corresponding button of the "XLSTAT-SIM" toolbar (see below).

The define distribution dialog box will appear. Then select the data on the Excel sheet. Choose the Excel cell with the name "2008" as name. This will be integrated as a relative reference in the format A1 in the formula. Choose an uniform distribution with a = 0.035 and b = 0.055.

Once you have clicked on "OK", the corresponding function call of XLSTAT_SimDist is inserted into the active cell.

It is possible to enter the other four distributions using copy and paste of the cell that we have generated into the four cells at the right of this first cell. You have as well the possibility like for other Excel formulas to select the cell B6 that you have just generated, go with the mouse over the lower left corner where the cursor is displayed as a black cross. Press and hold the left mouse button now and mouse the mouse up to the cell F6. In this way you have defined the 5 cells as well. The names of the distributions will be "2008, …, 2012".

Choose the result cell B9 that contains the formula = NPV(B6,B7,C7,D7,E7,F7) as active cell. Now the result variable will be defined. Select the XLSTAT/ XLSTAT-SIM /Define result command, or click on the corresponding button of the "XLSTAT-SIM" toolbar enthält.

The define result dialog box will appear. Then select the data on the Excel sheet. Choose the Excel cell with the name "NPV" as name in it.

Once you have clicked on "OK", the corresponding function call of XLSTAT_SimRes is inserted into the active cell.

This can be found in the Excel sheet Model.

## Running the simulation

To start the simulation run, select the XLSTAT/ XLSTAT-SIM /Simulation - Run command, or click on the corresponding button of the "XLSTAT-SIM".

The simulation - run dialog box will appear. Then select the data on the Excel sheet. Set the number of simulations to 1000.

In the "Options" tab, enter the parameters of the tornado and spider analysis. Select the standard cell value as default value. Choose 10 data points in the interval from -10% up to +10% of the value deviation:

The computations begin once you have clicked on "OK".

## Interpreting the results of the simulation

You are asked to confirm the number of rows and columns (this message can be bypassed by un-selecting the "Ask for selections confirmation" in the XLSTAT options panel).

The first result is a summary of the constructed model. There are details of the distribution variables and the result variable displayed.

The following tables show details for the distribution variables. Descriptive statistics, a histogram and statistics about the intervals are displayed.

The following tables show details for the result variables. Descriptive statistics, a histogram and statistics about the intervals are displayed. Then the results of the sensitivity analysis are shown. These results depend on the iterations of the simulations.

The next section contains the tornado analysis.

Tornado and spider analyses are not based on the iterations of the simulation but on a point by point analysis of all the input variables (random variables with distributions and scenario variables).

During the tornado analysis, for each result variable, each input random variable and each scenario variable are studied one by one. We make their value vary between two bounds and record the value of the result variable in order to know how each random and scenario variable impacts the result variables. For a random variable, the values explored can either be around the median or around the default cell value, with bounds defined by percentiles or deviation. For a scenario variable, the analysis is performed between two bounds specified when defining the variables. The number of points is an option that can be modified by the user before running the simulation model.

The spider analysis does not only display the maximum and minimum change of the result variable, but also the value of the result variable for each data point of the random and scenario variables. This is useful to check if the dependence between distribution variables and result variables is monotonous or not.

In the first table the minimal and maximal change and the corresponding range are displayed for each distribution variable. In this case all interest rates are more or less the same. In the spider analysis in the next section we see that the interest rate of the first year has less impact than the other interest rates, because in the formula of NPV the interest rate of the first year is used as well and therefore variations of this interest rate to not have much influence on the NPV.

Kovach Computing Services (KCS) was founded in 1993 by Dr. Warren Kovach. The company specializes in the development and marketing of inexpensive and easy-to-use statistical software for scientists, as well as in data analysis consulting.