From charlesreid1

Overview

Monte Carlo sampling is essentially a brute-force technique in which random samples are taken until confidence that the entire space has been sampled is satisfactory.

Random numbers are used to create sampling points in each direction.

Think of Monte Carlo ray-tracing: you send out a whole bunch of rays, each in random directions, and from the result you determine the radiative flux. Mathematically, you're performing an integration by randomly sampling the function you want to integrate, then adding up all of the random samples:

Explanation

Transforming Variables

For a distribution that is a function of variables :

Each variable has its own range,

This range must be converted to

so that

Log Scale

If you're using a log scale, i.e. sampling logarithmically more at than

HistogramNormalSpace.png HistogramLogSpace.png
Histogram of samples in normal, non-transformed space. Samples are preferentially grouped toward lower end of range. When the samples are plotted in the logarithm-transformed space, the preferential grouping disappears, as each bin now represents an order of magnitude, which drastically stretches the higher-end bins (so they include more samples) and squishes the lower-end bins (so they include fewer samples).

Selecting Samples

An -element vector of random numbers is generated to correspond with a single sample.

A number of sample points are selected, and the and corresponding are stored

the function is evaluated at each input variable value

all random input vectors and corresponding output vectors are stored/saved

Running in Parallel

Because the code suited itself to being run in parallel, I was able to use the Matlab Parallel Computing toolbox by changing the for loop over all samples that evaluates the function at each sample.

help parfor

Analysis of Results

Example

Problem Information

For details about the problem, including the input uncertainty map, see Example Problem for Experimental Design

What I am trying to learn:

  1. What does "true" function look like via MC sampling?
  2. How many samples are required for MC?
  3. How high a degree does the polynomial go to, and is that a function of the number of samples?
  4. How does the polynomial compare to a Fourier analysis?

Plan: save the state after 10, 100, 1k, 10k samples and evaluate after each.

Code

Computing Response Surface

See Response Surface Methodology for general information on response surface methodology.

A Note on Visualization

See Composite Experimental Design#A Note on Visualization

A Note on Coefficient and Variable Order

See Composite Experimental Design#A Note on Coefficient and Variable Order

Polynomial Powers Matrix

See Composite Experimental Design#Polynomial Powers Matrix

Quadratic Surface, 6 Dimensions

Download the response surface here: http://files.charlesmartinreid.com/ExperimentalDesign/MCResponseSurface_6dim_2deg.mat

A quadratic response surface was computed using all of the information from the Monte Carlo samples. There were 10,000 samples in total.

The resulting polynomial coefficient vector is:

b(1) = 175.9 
b(2) = -55.47 
b(3) = -59.65 
b(4) = -75.2 
b(5) = 1.12 
b(6) = 0.1184 
b(7) = -44.47 
b(8) = 4.065 
b(9) = -0.8079 
b(10) = 5.412 
b(11) = -10.95 
b(12) = 25.88 
b(13) = 21.01 
b(14) = -0.1112 
b(15) = 0.2904 
b(16) = -0.3872 
b(17) = -0.003535 
b(18) = -0.03342 
b(19) = -0.009973 
b(20) = -0.01873 
b(21) = 0.0003347 
b(22) = -0.0003496 
b(23) = 21.21 
b(24) = 16.26 
b(25) = 21.64 
b(26) = -0.5508 
b(27) = 0.0131 
b(28) = -10.72

for the polynomial powers matrix:

     0     0     0     0     0     0
     0     0     0     0     0     1
     0     0     0     0     1     0
     0     0     0     1     0     0
     0     0     1     0     0     0
     0     1     0     0     0     0
     1     0     0     0     0     0
     0     0     0     0     0     2
     0     0     0     0     1     1
     0     0     0     0     2     0
     0     0     0     1     0     1
     0     0     0     1     1     0
     0     0     0     2     0     0
     0     0     1     0     0     1
     0     0     1     0     1     0
     0     0     1     1     0     0
     0     0     2     0     0     0
     0     1     0     0     0     1
     0     1     0     0     1     0
     0     1     0     1     0     0
     0     1     1     0     0     0
     0     2     0     0     0     0
     1     0     0     0     0     1
     1     0     0     0     1     0
     1     0     0     1     0     0
     1     0     1     0     0     0
     1     1     0     0     0     0
     2     0     0     0     0     0

The response surface, plotted with the non-visualized dimensions set to a constant (their means), is:

MCResponseSurface 6dim 2deg.png

and some important statistics are:

---------------------------------------------------
Response surface summary of information:
Number of variables in response surface is 6. 
Number of terms in polynomial is 28. 
Degree of response surface is 2.

MSE =			 0.04265621 
MSE DoF = 		 5007 

L-inf norm resid = 	 0.53414457 

R^2 =			 0.68956066 
adjusted R^2 =		 0.68788663 
---------------------------------------------------


Quadratic Surface, 2 Dimensions

Download the response surface here: http://files.charlesmartinreid.com/ExperimentalDesign/MCResponseSurface_2dim_2deg.mat

The same set of Monte Carlo samples was fit to a quadratic surface, but with 2 variables instead of 6.

The resulting polynomial coefficient vector is:

b(1) = 0.2606 
b(2) = -0.01793 
b(3) = 0.0367 
b(4) = -0.003453 
b(5) = 0.0003092 
b(6) = -0.0003485 

for the polynomial powers matrix:

     0     0
     0     1
     1     0
     0     2
     1     1
     2     0

This results in a response surface that looks similar to the 6-dimensional quadratic response surface:

MCResponseSurface 2dim 2deg.png

The statistics show that the fit is better for the 2-dimensional surface than for the 6-dimensional surface. This, combined with the fact that he response surfaces look similar, means we can conclude that the additional dimensions are probably independent of the two visualized dimensions, or that they ave a minimal impact on the response.

---------------------------------------------------
Response surface summary of information:
Number of variables in response surface is 2. 
Number of terms in polynomial is 6. 
Degree of response surface is 2.

MSE =			 0.04267264 
MSE DoF = 		 5029 

L-inf norm resid = 	 0.50344056 

R^2 =			 0.68807653 
adjusted R^2 =		 0.68776641 
---------------------------------------------------

Cubic Surface, 6 Dimensions

Download the response surface here: http://files.charlesmartinreid.com/ExperimentalDesign/MCResponseSurface_6dim_3deg.mat

The polynomial coefficient vector is:

b(1) = 9.4335e+04 
b(2) = -7.1360e+04 
b(3) = -1.3930e+04 
b(4) = 2.4439e+04 
b(5) = 6.3177e+01 
b(6) = -7.3399e-01 
b(7) = -4.7962e+04 
b(8) = 2.5084e+04 
b(9) = 1.2428e+04 
b(10) = 9.7792e+02 
b(11) = -9.2480e+03 
b(12) = -5.1276e+03 
b(13) = -5.7739e+03 
b(14) = -1.3153e+02 
b(15) = 1.5852e+01 
b(16) = -1.7894e+01 
b(17) = 7.7344e-01 
b(18) = -1.1606e+00 
b(19) = -1.8691e+00 
b(20) = 5.6333e+00 
b(21) = 5.4130e-02 
b(22) = -3.5192e-03 
b(23) = 1.7109e+03 
b(24) = -1.8055e+03 
b(25) = -6.1622e+03 
b(26) = 9.2918e+01 
b(27) = 2.2992e+00 
b(28) = 2.4321e+04 
b(29) = -3.2712e+03 
b(30) = -2.0094e+03 
b(31) = -8.4970e+02 
b(32) = 6.7127e+02 
b(33) = 5.4902e+02 
b(34) = 1.6649e+03 
b(35) = 3.9418e+01 
b(36) = -8.6106e+01 
b(37) = 2.8060e+03 
b(38) = 8.4698e+02 
b(39) = 4.0922e+01 
b(40) = -4.1602e+01 
b(41) = 3.6996e+01 
b(42) = 2.7639e+01 
b(43) = -3.6318e+01 
b(44) = 1.9221e+01 
b(45) = -1.3503e-01 
b(46) = 4.1899e-01 
b(47) = 3.5445e-02 
b(48) = 3.4742e-03 
b(49) = 1.0020e-01 
b(50) = 9.6863e-01 
b(51) = 1.0226e-01 
b(52) = -1.0614e+00 
b(53) = -7.4942e-01 
b(54) = -6.7971e-01 
b(55) = -2.5682e-02 
b(56) = 1.6590e-03 
b(57) = 6.0159e-04 
b(58) = -3.3300e-04 
b(59) = 6.4587e-04 
b(60) = 4.1078e-04 
b(61) = 8.1751e-04 
b(62) = -4.3508e-06 
b(63) = 1.0649e-05 
b(64) = 1.0716e+03 
b(65) = -3.1054e+02 
b(66) = -9.7221e+02 
b(67) = 2.0289e+03 
b(68) = -9.5553e+02 
b(69) = 2.5068e+02 
b(70) = -1.2136e+01 
b(71) = -2.9474e+00 
b(72) = -8.2761e+00 
b(73) = -5.5199e-01 
b(74) = -1.3527e-01 
b(75) = -2.5765e-01 
b(76) = -6.1860e-01 
b(77) = 4.1224e-03 
b(78) = -3.8697e-04 
b(79) = -1.8981e+03 
b(80) = 1.5007e+03 
b(81) = 5.7488e+02 
b(82) = -1.3082e+01 
b(83) = -3.1649e-01 
b(84) = -3.6841e+03 

for the polynomial powers matrix:

     0     0     0     0     0     0
     0     0     0     0     0     1
     0     0     0     0     1     0
     0     0     0     1     0     0
     0     0     1     0     0     0
     0     1     0     0     0     0
     1     0     0     0     0     0
     0     0     0     0     0     2
     0     0     0     0     1     1
     0     0     0     0     2     0
     0     0     0     1     0     1
     0     0     0     1     1     0
     0     0     0     2     0     0
     0     0     1     0     0     1
     0     0     1     0     1     0
     0     0     1     1     0     0
     0     0     2     0     0     0
     0     1     0     0     0     1
     0     1     0     0     1     0
     0     1     0     1     0     0
     0     1     1     0     0     0
     0     2     0     0     0     0
     1     0     0     0     0     1
     1     0     0     0     1     0
     1     0     0     1     0     0
     1     0     1     0     0     0
     1     1     0     0     0     0
     2     0     0     0     0     0
     0     0     0     0     0     3
     0     0     0     0     1     2
     0     0     0     0     2     1
     0     0     0     0     3     0
     0     0     0     1     0     2
     0     0     0     1     1     1
     0     0     0     1     2     0
     0     0     0     2     0     1
     0     0     0     2     1     0
     0     0     0     3     0     0
     0     0     1     0     0     2
     0     0     1     0     1     1
     0     0     1     0     2     0
     0     0     1     1     0     1
     0     0     1     1     1     0
     0     0     1     2     0     0
     0     0     2     0     0     1
     0     0     2     0     1     0
     0     0     2     1     0     0
     0     0     3     0     0     0
     0     1     0     0     0     2
     0     1     0     0     1     1
     0     1     0     0     2     0
     0     1     0     1     0     1
     0     1     0     1     1     0
     0     1     0     2     0     0
     0     1     1     0     0     1
     0     1     1     0     1     0
     0     1     1     1     0     0
     0     1     2     0     0     0
     0     2     0     0     0     1
     0     2     0     0     1     0
     0     2     0     1     0     0
     0     2     1     0     0     0
     0     3     0     0     0     0
     1     0     0     0     0     2
     1     0     0     0     1     1
     1     0     0     0     2     0
     1     0     0     1     0     1
     1     0     0     1     1     0
     1     0     0     2     0     0
     1     0     1     0     0     1
     1     0     1     0     1     0
     1     0     1     1     0     0
     1     0     2     0     0     0
     1     1     0     0     0     1
     1     1     0     0     1     0
     1     1     0     1     0     0
     1     1     1     0     0     0
     1     2     0     0     0     0
     2     0     0     0     0     1
     2     0     0     0     1     0
     2     0     0     1     0     0
     2     0     1     0     0     0
     2     1     0     0     0     0
     3     0     0     0     0     0

and the response surface looks like:

MCResponseSurface 6dim 3deg.png

Some important statistics are:

---------------------------------------------------
Response surface summary of information:
Number of variables in response surface is 6. 
Number of terms in polynomial is 84. 
Degree of response surface is 3.

MSE =			 0.02514706 
MSE DoF = 		 4951 

L-inf norm resid = 	 0.51330364 

R^2 =			 0.81903400 
adjusted R^2 =		 0.81600023 
---------------------------------------------------


Quartic Response Surface

This response surface is available for download here: http://files.charlesmartinreid.com/ExperimentalDesign/MCResponseSurface_6dim_4deg.mat

For the sake of brevity, the full coefficients and powers matrix won't be printed here (they are included in the response surface file above).

The plot and relevant statistics are given here:

500px

---------------------------------------------------
Response surface summary of information:
Number of variables in response surface is 6. 
Number of terms in polynomial is 210. 
Degree of response surface is 4.

MSE =			 0.02069806 
MSE DoF = 		 9790 

L-inf norm resid = 	 0.37829408 

R^2 =			 0.85452284 
adjusted R^2 =		 0.85141715 
---------------------------------------------------

It is clear that despite having a high-degree polynomial with a large number (210) of coefficients, the polynomial fit is still quite poor, and increasing the degree of the polynomial does not greatly increase the polynomial's fit to the data.

With the composite design response surface, the (reduced) third degree polynomial fit all of the data points exactly, and yielded 0 mean square error and an r-squared value of 1.0. However, this is because there were only 45 sample points, and almost as many polynomial coefficients - 37.

References

Matlab