Propagating Uncertainty in Inputs to Uncertainty in Outputs (Excel Computer Exercise)
Introduction The following model can be used to estimate risk of infection from ingesting cryptosporidium oocysts:
Risk = 1-exp(-dose x k)
Dose is modeled as:
dose= concentration x ingestion
Suppose ingestion is considered a known 0.13 L/swim. Uncertainty in the dose response parameter k and the concentration are to be modeled using probabilistic techniques, specifically Monte Carlo techniques. A literature survey is undertaken to develop a range of reasonable values for each input. The dose response parameter k is estimated to be 0.0572 with an uncertainty factor of 2; it is 95% likely that the true value is less than 2*0.0572 and greater than 0.0572/2. Concentration is estimated to be 10 oocysts/liter with an uncertainty factor of 1.5; it is 95% likely that the true value is less than 1.5*10 and greater than 10/1.5. The assumption of multiplicative uncertainty is common and is conveniently represented by a lognormal distribution.
1. Take the natural log of the most likely value of k. This is the mean of the lognormal uncertainty distribution.
2. Take the natural log of the upper bound of k. This is the 97.5 percentile value of the lognormal uncertainty distribution for k.
3. Calculate the log standard deviation as (log97.5percentile-logmean)/1.96
4.You now have the two parameters you need to describe a lognormal distribution for k, the logmean and logstandard deviation. Repeat this process to find the logmean and logstandard deviation for concentration.
5. Label four separate columns in an Excel sheet as "ln Mean Concentration", "Mean Concentration", "ln k", and "k". Then make sure the data analysis add-in is installed by going to “Tools” selecting “Add-ins” and checking the “Analysis Toolpak”. Then select “Data Analysis” from the “Tools” menu and pick “Random Number Generation”. (For Office 2007 go to the Office button, select “add ins”, “manage Excel add ins”, and check “Analysis Toolpak”. Then selected “Data Analysis” from the “Data” tab and pick “Random Number Generation”.)
6.This will bring up a dialogue box and you can enter the appropriate distribution type (normal) and parameter values (determined above).
7.By definition the log of the values generated from a lognormal distribution are normally distributed. Therefore, if values simulated from a normal distribution are exponentiated, the transformed values will be lognormally distributed. Rather than directly generating lognormally distributed values, it is usually best to generate normal values and exponentiate them. Thus you should simulate normal values from a distribution with mean equal the logmean you determined above and standard deviation equal to the log standard deviation determined above. Simulate at least 1000 values for each input. Align the inputs in the same rows in adjacent columns. Set the output range to put values for the log concentration in the column labeled "ln Mean Concentration" and put values for natural log of k in the column labeled "ln k".
8. Exponentiate the values you simulated to obtain the quantities of interest (e.g., exponentiate the contents of column labeled "ln Mean concentration" and put it in the column labeled "Mean concentration". Then label a column "Risk" and put in this column the formula to compute the infection risk from the exponential dose response model using the k value and concentration values from the appropriate columns.
Complete the following:
A. Determine the mean, median, 5th percentile, 95th percentile, and standard deviation of the infection risk from the Monte Carlo simulation.
B. Plot the sampled values for each input to the Monte Carlo analysis (on the x-axis) vs. risk (on the y-axis) and a comparison of the correlation coefficients. You will produce 2 plots: risk vs. k and risk vs. concentration. Which input contributes the most uncertainty to the output?
Solutions are not yet posted on the Wiki.