Monte Carlo Experimental Design
From charlesreid1
Revision as of 07:05, 28 June 2011
Overview
Monte Carlo sampling is essentially a brute-force technique: random samples are drawn until you are satisfied that the entire space has been adequately sampled.
Random numbers are used to create sampling points in each direction.
Think of Monte Carlo ray-tracing: you send out a large number of rays, each in a random direction, and from the result you determine the radiative flux. Mathematically, you're performing an integration by randomly sampling the function you want to integrate, then averaging the samples:
$ \int_0^1 f(x) \, dx \approx \frac{1}{N} \sum_{i=1}^{N} f( x_i ) $
where the $ x_i $ are $ N $ uniform random samples on $ [0,1] $; on a general domain the average is multiplied by the volume of the domain.
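A minimal sketch of this estimate, written in Python rather than Matlab so that it is self-contained. The helper name `mc_integrate`, the test integrand, and the sample count are illustrative assumptions, not part of the original analysis. On a general interval $ [a, b] $ the sample average is scaled by the interval length $ b - a $; on $ [0,1] $ that factor is 1.

```python
import math
import random

def mc_integrate(f, a, b, n, seed=0):
    """Estimate the integral of f over [a, b] from n uniform random samples."""
    rng = random.Random(seed)            # seeded so the run is repeatable
    total = 0.0
    for _ in range(n):
        x = a + (b - a) * rng.random()   # uniform sample in [a, b)
        total += f(x)
    # (interval length) * (mean of the sampled function values)
    return (b - a) * total / n

# Hypothetical test integrand: the integral of sin(x) over [0, pi] is exactly 2.
estimate = mc_integrate(math.sin, 0.0, math.pi, 100_000)
print(estimate)
```

The error of such an estimate shrinks roughly like $ 1/\sqrt{N} $, which is why the sample counts needed can become large.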
Explanation
Transforming Variables
For a distribution that is a function of $ m $ variables $ x_1, \dots, x_m $:
Each variable has its own range, $ \alpha_i \leq x_i \leq \beta_i $
This range must be converted to $ [0,1] $
$ \hat{x}_i = \frac{ x_i - \alpha_i }{ \beta_i - \alpha_i } $
so that $ \hat{x}_i \in \left[ 0, 1 \right] \; \forall \, i = 1 \dots m $
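The mapping and its inverse can be sketched as follows (Python used for illustration). The function names `to_unit` and `from_unit` and the three variable ranges are made-up assumptions; only the formula comes from the text above.

```python
def to_unit(x, alpha, beta):
    """Map x in [alpha, beta] to x_hat in [0, 1]."""
    return (x - alpha) / (beta - alpha)

def from_unit(x_hat, alpha, beta):
    """Inverse map: recover x in [alpha, beta] from x_hat in [0, 1]."""
    return alpha + x_hat * (beta - alpha)

# Hypothetical ranges (alpha_i, beta_i) for m = 3 input variables.
ranges = [(300.0, 1200.0), (0.1, 0.9), (-5.0, 5.0)]
x = [750.0, 0.1, 5.0]

x_hat = [to_unit(xi, a, b) for xi, (a, b) in zip(x, ranges)]
x_back = [from_unit(xh, a, b) for xh, (a, b) in zip(x_hat, ranges)]
print(x_hat)
print(x_back)
```

The inverse map is the one used in practice during sampling: random numbers are drawn in $ [0,1] $ and then mapped back to each variable's physical range.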
Log Scale
If you're using a log scale, i.e. sampling more densely near $ \alpha_i $ than near $ \beta_i $ (which requires $ \alpha_i > 0 $), use:
$ \hat{x}_i = \frac{ \log{(x_i)} - \log{(\alpha_i)} }{ \log{(\beta_i)} - \log{(\alpha_i)} } $
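A short sketch of the log-scale mapping and its inverse, $ x_i = \alpha_i (\beta_i / \alpha_i)^{\hat{x}_i} $, again in Python for illustration. The names `to_unit_log` and `from_unit_log` and the range $ [10^{-3}, 10^{3}] $ are assumptions chosen for the example.

```python
import math
import random

def to_unit_log(x, alpha, beta):
    """Map x in [alpha, beta] (alpha > 0) to x_hat in [0, 1] on a log scale."""
    return (math.log(x) - math.log(alpha)) / (math.log(beta) - math.log(alpha))

def from_unit_log(x_hat, alpha, beta):
    """Inverse map: x = alpha * (beta / alpha) ** x_hat."""
    return alpha * (beta / alpha) ** x_hat

alpha, beta = 1e-3, 1e3      # hypothetical range; alpha must be positive
rng = random.Random(0)
samples = [from_unit_log(rng.random(), alpha, beta) for _ in range(10_000)]

# Fraction of samples below the geometric midpoint of the range (x = 1).
frac_below_one = sum(s < 1.0 for s in samples) / len(samples)
print(frac_below_one)
```

About half of the samples land below 1, even though that sub-interval is only about 0.1% of the range on a linear scale; this is the sense in which the log scale samples more heavily near the lower bound.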
Selecting Samples
An $ m $-element vector of random numbers is generated for each sample.
A number of sample points are selected, and each $ \hat{x}_i $ and its corresponding $ x_i $ are stored.
The function is evaluated at the input variable values $ x_i $ of each sample.
All random input vectors and their corresponding output vectors are saved.
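The steps above can be sketched as a single loop. This is an illustrative Python version, not the original Matlab code; `draw_samples`, the stand-in `model` function, and the variable ranges are all hypothetical.

```python
import random

def draw_samples(f, ranges, n_samples, seed=0):
    """Draw n_samples random points, evaluate f at each, and keep everything."""
    rng = random.Random(seed)
    records = []
    for _ in range(n_samples):
        # One m-element vector of random numbers in [0, 1) per sample.
        x_hat = [rng.random() for _ in ranges]
        # Map each normalized value back to its physical range (linear scale).
        x = [a + xh * (b - a) for xh, (a, b) in zip(x_hat, ranges)]
        y = f(x)
        # Store the normalized inputs, physical inputs, and output together.
        records.append((x_hat, x, y))
    return records

def model(x):
    """Hypothetical stand-in for the function being studied."""
    return x[0] ** 2 + 3.0 * x[1]

ranges = [(0.0, 2.0), (-1.0, 1.0)]   # made-up (alpha_i, beta_i) pairs
records = draw_samples(model, ranges, 100)
print(records[0])
```

Keeping the normalized and physical inputs alongside each output means the later regression can be run without re-evaluating the function.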
Running in Parallel
Because each sample is evaluated independently of the others, the code lent itself to being run in parallel. I was able to use the Matlab Parallel Computing Toolbox by changing the for loop over all samples, which evaluates the function at each sample, into a parfor loop.
help parfor
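For readers without Matlab, a rough Python analogue of the same idea is sketched below. It is not the original code: the `model` function and the sample list are hypothetical, and a thread pool is used only to keep the example self-contained. For CPU-bound pure-Python functions, `ProcessPoolExecutor` from the same module is the usual substitute.

```python
from concurrent.futures import ThreadPoolExecutor

def model(x):
    """Hypothetical stand-in for the function evaluated at each sample."""
    return sum(xi ** 2 for xi in x)

# Made-up sample points; in practice these come from the sampling step.
samples = [[0.1 * i, 0.2 * i] for i in range(50)]

# Serial loop over all samples.
serial = [model(x) for x in samples]

# Same loop with the evaluations distributed over a pool of workers.
# pool.map returns results in the same order as the inputs.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(model, samples))

print(parallel == serial)
```

This only works because no sample depends on the result of another, which is the same condition a parfor loop requires.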
Analysis of Results
"Multiple linear regression" is a model for one response variable ("y"), and multiple predictor variables ("X").
Linear regression information:
- http://en.wikipedia.org/wiki/Linear_regression
- http://www.mathworks.com/help/toolbox/stats/mvregress.html
"Multivariate linear regression" broadens multiple linear regression to account for more than one response variable ("Y").
Multivariate regression/analysis information:
- http://en.wikipedia.org/wiki/Multivariate_analysis
The idea is that the response variables ("Y") are correlated with each other within each (independent) observation.
- from http://www.mathworks.de/matlabcentral/newsreader/view_thread/154512
Ways to carry out the regression:
- Matlab way
- Matrix algebra way
- http://www.mathworks.co.uk/matlabcentral/newsreader/view_thread/304252
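One reading of the "matrix algebra way" is to solve the normal equations $ (X^T X) B = X^T Y $ directly; that interpretation is an assumption, since the page does not spell it out. The sketch below does this in plain Python for several response variables at once. All function names and the small data set are made up for illustration.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(p * q for p, q in zip(row, col)) for col in bt] for row in a]

def solve(a, b):
    """Solve a @ z = b by Gauss-Jordan elimination with partial pivoting.
    b may have several columns (one per response variable)."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [[v / aug[i][i] for v in aug[i][n:]] for i in range(n)]

def fit_linear(X, Y):
    """Least-squares coefficients B (intercept row first) for Y ~ [1, X] @ B."""
    X1 = [[1.0] + list(row) for row in X]   # prepend an intercept column
    Xt = transpose(X1)
    return solve(matmul(Xt, X1), matmul(Xt, Y))

# Made-up data: two predictors, two responses, exactly linear.
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
Y = [[1.0 + 2.0 * x1 - x2, 0.5 * x1 + 3.0 * x2] for x1, x2 in X]

B = fit_linear(X, Y)
for row in B:
    print(row)
```

Each column of `B` holds the intercept and slopes for one response. The normal equations can be poorly conditioned for high-degree polynomial fits, so a library least-squares routine is usually the safer choice on real data.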
Constructing a Statistical Model
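Classical statistical models assume that the data are Gaussian and that the model is linear in structure.
The aim here is to use more flexible statistical-model-based tools for data analysis.
Generalized linear models (Nelder and Wedderburn, 1972) are one such tool: they allow non-Gaussian responses and relate the mean response to a linear predictor through a link function.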
Example
For details about the problem, see Example Problem for Experimental Design
For the input uncertainty map, see Example Problem for Experimental Design
What I am trying to learn:
- What does the "true" function look like via MC sampling?
- How many samples are required for MC?
- How high a polynomial degree is needed, and does that depend on the number of samples?
- How does the MC-polynomial compare to the MC-Fourier analysis?
Plan: save the state after 10, 100, 1k, 10k samples and evaluate after each.
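A minimal sketch of that plan, in Python for illustration: the running estimate is recorded at each checkpoint so that convergence can be compared across sample counts. The function `running_estimates` and the test integrand are hypothetical stand-ins; the actual example problem is described on the linked page.

```python
import math
import random

def running_estimates(f, checkpoints, seed=0):
    """Accumulate a Monte Carlo mean of f on [0, 1), saving it at each checkpoint."""
    rng = random.Random(seed)
    targets = set(checkpoints)
    total = 0.0
    results = {}
    for n in range(1, max(checkpoints) + 1):
        total += f(rng.random())
        if n in targets:
            results[n] = total / n   # state saved after n samples
    return results

def f(x):
    """Hypothetical integrand; its integral over [0, 1] is e - 1."""
    return math.exp(x)

checkpoints = [10, 100, 1_000, 10_000]
estimates = running_estimates(f, checkpoints)
exact = math.e - 1.0

for n in checkpoints:
    print(n, estimates[n], abs(estimates[n] - exact))
```

Because all checkpoints come from one running sum, the estimate after 100 samples reuses the first 10, so the comparison across sample counts costs no extra function evaluations.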