Monte Carlo Experimental Design
From charlesreid1
Contents
Overview
Monte Carlo sampling is essentially a brute-force technique in which random samples are taken until confidence that the entire space has been sampled is satisfactory.
Random numbers are used to create sampling points in each direction.
Think of Monte Carlo ray-tracing: you send out a whole bunch of rays, each in random directions, and from the result you determine the radiative flux. Mathematically, you're performing an integration by randomly sampling the function you want to integrate, then adding up all of the random samples:
Explanation
Transforming Variables
For a distribution that is a function of variables :
Each variable has its own range,
This range must be converted to
so that
Log Scale
If you're using a log scale, i.e. sampling logarithmically more at than
Selecting Samples
An -element vector of random numbers is generated to correspond with a single sample.
A number of sample points are selected, and the and corresponding are stored
the function is evaluated at each input variable value
all random input vectors and corresponding output vectors are stored/saved
Running in Parallel
Because the code suited itself to being run in parallel, I was able to use the Matlab Parallel Computing toolbox by changing the for loop over all samples that evaluates the function at each sample.
help parfor
Analysis of Results
Example
Problem Information
For details about the problem, including the input uncertainty map, see Example Problem for Experimental Design
What I am trying to learn:
- What does "true" function look like via MC sampling?
- How many samples are required for MC?
- How high a degree does the polynomial go to, and is that a function of the number of samples?
- How does the polynomial compare to a Fourier analysis?
Plan: save the state after 10, 100, 1k, 10k samples and evaluate after each.
Code
Computing Response Surface
See Response Surface Methodology for general information on response surface methodology.
A Note on Visualization
See Composite Experimental Design#A Note on Visualization
A Note on Coefficient and Variable Order
See Composite Experimental Design#A Note on Coefficient and Variable Order
Polynomial Powers Matrix
See Composite Experimental Design#Polynomial Powers Matrix
Quadratic Surface, 6 Dimensions
Download the response surface here: http://files.charlesmartinreid.com/ExperimentalDesign/MCResponseSurface_6dim_2deg.mat
- contains 2 variables:
response_surface
- structure containing information about the response surface (coefficient vector is in response_surface.beta)model
- this is a matrix containing the polynomial powers of each variable (variable order given in section Composite Experimental Design#A Note on Coefficient and Variable Order; description of polynomial powers matrix given in section Composite Experimental Design#Polynomial Powers Matrix)
A quadratic response surface was computed using all of the information from the Monte Carlo samples. There were 10,000 samples in total.
The resulting polynomial coefficient vector is:
b(1) = 175.9 b(2) = -55.47 b(3) = -59.65 b(4) = -75.2 b(5) = 1.12 b(6) = 0.1184 b(7) = -44.47 b(8) = 4.065 b(9) = -0.8079 b(10) = 5.412 b(11) = -10.95 b(12) = 25.88 b(13) = 21.01 b(14) = -0.1112 b(15) = 0.2904 b(16) = -0.3872 b(17) = -0.003535 b(18) = -0.03342 b(19) = -0.009973 b(20) = -0.01873 b(21) = 0.0003347 b(22) = -0.0003496 b(23) = 21.21 b(24) = 16.26 b(25) = 21.64 b(26) = -0.5508 b(27) = 0.0131 b(28) = -10.72
for the polynomial powers matrix:
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 1 1 0 0 0 0 2 0 0 0 0 1 0 1 0 0 0 1 1 0 0 0 0 2 0 0 0 0 1 0 0 1 0 0 1 0 1 0 0 0 1 1 0 0 0 0 2 0 0 0 0 1 0 0 0 1 0 1 0 0 1 0 0 1 0 1 0 0 0 1 1 0 0 0 0 2 0 0 0 0 1 0 0 0 0 1 1 0 0 0 1 0 1 0 0 1 0 0 1 0 1 0 0 0 1 1 0 0 0 0 2 0 0 0 0 0
The response surface, plotted with the non-visualized dimensions set to a constant (their means), is:
and some important statistics are:
--------------------------------------------------- Response surface summary of information: Number of variables in response surface is 6. Number of terms in polynomial is 28. Degree of response surface is 2. MSE = 0.04265621 MSE DoF = 5007 L-inf norm resid = 0.53414457 R^2 = 0.68956066 adjusted R^2 = 0.68788663 ---------------------------------------------------
Quadratic Surface, 2 Dimensions
Download the response surface here: http://files.charlesmartinreid.com/ExperimentalDesign/MCResponseSurface_2dim_2deg.mat
- contains 2 variables:
response_surface
- structure containing information about the response surface (coefficient vector is in response_surface.beta)model
- this is a matrix containing the polynomial powers of each variable (variable order given in section Composite Experimental Design#A Note on Coefficient and Variable Order; description of polynomial powers matrix given in section Composite Experimental Design#Polynomial Powers Matrix)
The same set of Monte Carlo samples was fit to a quadratic surface, but with 2 variables instead of 6.
The resulting polynomial coefficient vector is:
b(1) = 0.2606 b(2) = -0.01793 b(3) = 0.0367 b(4) = -0.003453 b(5) = 0.0003092 b(6) = -0.0003485
for the polynomial powers matrix:
0 0 0 1 1 0 0 2 1 1 2 0
This results in a response surface that looks similar to the 6-dimensional quadratic response surface:
The statistics show that the fit is better for the 2-dimensional surface than for the 6-dimensional surface. This, combined with the fact that he response surfaces look similar, means we can conclude that the additional dimensions are probably independent of the two visualized dimensions, or that they ave a minimal impact on the response.
--------------------------------------------------- Response surface summary of information: Number of variables in response surface is 2. Number of terms in polynomial is 6. Degree of response surface is 2. MSE = 0.04267264 MSE DoF = 5029 L-inf norm resid = 0.50344056 R^2 = 0.68807653 adjusted R^2 = 0.68776641 ---------------------------------------------------
Cubic Surface, 6 Dimensions
Download the response surface here: http://files.charlesmartinreid.com/ExperimentalDesign/MCResponseSurface_6dim_3deg.mat
- contains 2 variables:
response_surface
- structure containing information about the response surface (coefficient vector is in response_surface.beta)model
- this is a matrix containing the polynomial powers of each variable (variable order given in section Composite Experimental Design#A Note on Coefficient and Variable Order; description of polynomial powers matrix given in section Composite Experimental Design#Polynomial Powers Matrix)
The polynomial coefficient vector is:
b(1) = 9.4335e+04 b(2) = -7.1360e+04 b(3) = -1.3930e+04 b(4) = 2.4439e+04 b(5) = 6.3177e+01 b(6) = -7.3399e-01 b(7) = -4.7962e+04 b(8) = 2.5084e+04 b(9) = 1.2428e+04 b(10) = 9.7792e+02 b(11) = -9.2480e+03 b(12) = -5.1276e+03 b(13) = -5.7739e+03 b(14) = -1.3153e+02 b(15) = 1.5852e+01 b(16) = -1.7894e+01 b(17) = 7.7344e-01 b(18) = -1.1606e+00 b(19) = -1.8691e+00 b(20) = 5.6333e+00 b(21) = 5.4130e-02 b(22) = -3.5192e-03 b(23) = 1.7109e+03 b(24) = -1.8055e+03 b(25) = -6.1622e+03 b(26) = 9.2918e+01 b(27) = 2.2992e+00 b(28) = 2.4321e+04 b(29) = -3.2712e+03 b(30) = -2.0094e+03 b(31) = -8.4970e+02 b(32) = 6.7127e+02 b(33) = 5.4902e+02 b(34) = 1.6649e+03 b(35) = 3.9418e+01 b(36) = -8.6106e+01 b(37) = 2.8060e+03 b(38) = 8.4698e+02 b(39) = 4.0922e+01 b(40) = -4.1602e+01 b(41) = 3.6996e+01 b(42) = 2.7639e+01 b(43) = -3.6318e+01 b(44) = 1.9221e+01 b(45) = -1.3503e-01 b(46) = 4.1899e-01 b(47) = 3.5445e-02 b(48) = 3.4742e-03 b(49) = 1.0020e-01 b(50) = 9.6863e-01 b(51) = 1.0226e-01 b(52) = -1.0614e+00 b(53) = -7.4942e-01 b(54) = -6.7971e-01 b(55) = -2.5682e-02 b(56) = 1.6590e-03 b(57) = 6.0159e-04 b(58) = -3.3300e-04 b(59) = 6.4587e-04 b(60) = 4.1078e-04 b(61) = 8.1751e-04 b(62) = -4.3508e-06 b(63) = 1.0649e-05 b(64) = 1.0716e+03 b(65) = -3.1054e+02 b(66) = -9.7221e+02 b(67) = 2.0289e+03 b(68) = -9.5553e+02 b(69) = 2.5068e+02 b(70) = -1.2136e+01 b(71) = -2.9474e+00 b(72) = -8.2761e+00 b(73) = -5.5199e-01 b(74) = -1.3527e-01 b(75) = -2.5765e-01 b(76) = -6.1860e-01 b(77) = 4.1224e-03 b(78) = -3.8697e-04 b(79) = -1.8981e+03 b(80) = 1.5007e+03 b(81) = 5.7488e+02 b(82) = -1.3082e+01 b(83) = -3.1649e-01 b(84) = -3.6841e+03
for the polynomial powers matrix:
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 1 1 0 0 0 0 2 0 0 0 0 1 0 1 0 0 0 1 1 0 0 0 0 2 0 0 0 0 1 0 0 1 0 0 1 0 1 0 0 0 1 1 0 0 0 0 2 0 0 0 0 1 0 0 0 1 0 1 0 0 1 0 0 1 0 1 0 0 0 1 1 0 0 0 0 2 0 0 0 0 1 0 0 0 0 1 1 0 0 0 1 0 1 0 0 1 0 0 1 0 1 0 0 0 1 1 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 1 2 0 0 0 0 2 1 0 0 0 0 3 0 0 0 0 1 0 2 0 0 0 1 1 1 0 0 0 1 2 0 0 0 0 2 0 1 0 0 0 2 1 0 0 0 0 3 0 0 0 0 1 0 0 2 0 0 1 0 1 1 0 0 1 0 2 0 0 0 1 1 0 1 0 0 1 1 1 0 0 0 1 2 0 0 0 0 2 0 0 1 0 0 2 0 1 0 0 0 2 1 0 0 0 0 3 0 0 0 0 1 0 0 0 2 0 1 0 0 1 1 0 1 0 0 2 0 0 1 0 1 0 1 0 1 0 1 1 0 0 1 0 2 0 0 0 1 1 0 0 1 0 1 1 0 1 0 0 1 1 1 0 0 0 1 2 0 0 0 0 2 0 0 0 1 0 2 0 0 1 0 0 2 0 1 0 0 0 2 1 0 0 0 0 3 0 0 0 0 1 0 0 0 0 2 1 0 0 0 1 1 1 0 0 0 2 0 1 0 0 1 0 1 1 0 0 1 1 0 1 0 0 2 0 0 1 0 1 0 0 1 1 0 1 0 1 0 1 0 1 1 0 0 1 0 2 0 0 0 1 1 0 0 0 1 1 1 0 0 1 0 1 1 0 1 0 0 1 1 1 0 0 0 1 2 0 0 0 0 2 0 0 0 0 1 2 0 0 0 1 0 2 0 0 1 0 0 2 0 1 0 0 0 2 1 0 0 0 0 3 0 0 0 0 0
and the response surface looks like:
Some important statistics are:
--------------------------------------------------- Response surface summary of information: Number of variables in response surface is 6. Number of terms in polynomial is 84. Degree of response surface is 3. MSE = 0.02514706 MSE DoF = 4951 L-inf norm resid = 0.51330364 R^2 = 0.81903400 adjusted R^2 = 0.81600023 ---------------------------------------------------
Quartic Response Surface
This response surface is available for download here: http://files.charlesmartinreid.com/ExperimentalDesign/MCResponseSurface_6dim_4deg.mat
For the sake of brevity, the full coefficients and powers matrix won't be printed here (they are included in the response surface file above).
The plot and relevant statistics are given here:
--------------------------------------------------- Response surface summary of information: Number of variables in response surface is 6. Number of terms in polynomial is 210. Degree of response surface is 4. MSE = 0.02069806 MSE DoF = 9790 L-inf norm resid = 0.37829408 R^2 = 0.85452284 adjusted R^2 = 0.85141715 ---------------------------------------------------
It is clear that despite having a high-degree polynomial with a large number (210) of coefficients, the polynomial fit is still quite poor, and increasing the degree of the polynomial does not greatly increase the polynomial's fit to the data.
With the composite design response surface, the (reduced) third degree polynomial fit all of the data points exactly, and yielded 0 mean square error and an r-squared value of 1.0. However, this is because there were only 45 sample points, and almost as many polynomial coefficients - 37.
References
Matlab
- Statistics toolbox documentation: http://www.mathworks.com/help/toolbox/stats/bq_676m-2.html
- Very helpful page on linear regression in general: goes through linear reg., multiple linear reg., multivariate linear reg., polynomial reg., multivariate polynomial reg. (response surfaces), etc.
- x2fx function documentation: http://www.mathworks.com/help/toolbox/stats/x2fx.html
- regstats function documentation: http://www.mathworks.com/help/toolbox/stats/regstats.html
- scatter3 (3D scatter plot) function documentation: http://www.mathworks.com/help/techdoc/ref/scatter3.html
- Modifying axes properties: http://www.mathworks.com/help/techdoc/ref/axes_props.html
- Creating axes graphics objects: http://www.mathworks.com/help/techdoc/ref/axes.html
- surf function documentation: http://www.mathworks.com/help/techdoc/ref/surf.html
|