Experimental Design Lecture: Difference between revisions
From charlesreid1
Revision as of 18:07, 20 June 2011
Overview of Experimental Design and Surrogate Models
The Problem Statement
Purpose: create a cheap representation of an expensive computer model
We're picking some input parameters, and some output variables
Normally there is a map from one to the other: the real function $ f $,
$ \boldsymbol{y} = f(\boldsymbol{x}) $
And we're creating a surrogate model $ g $ that approximates it,
$ \boldsymbol{y} \approx g(\boldsymbol{x}) $
This is sometimes called a "metamodel", because it's a model of a model
Classes of Surrogate Models
There are several classes of sampling designs (the first three below) and forms for $ g $ (the last three)
- Latin hypercube
- Space-filling
- Uniform
- Neural networks
- Gaussian
- Polynomials (response surface methodology)
I won't cover all of them; I will only cover Latin hypercube, space-filling, and response surface methodologies
Surrogate Modeling
When constructing surrogate models, it is important to distinguish between computer surrogate modeling (metamodeling) and experimental surrogate modeling
Big difference: experiments have random errors
Basic Concepts for Experiments
Analysis of Variance tables
Basic Concepts for Metamodeling
Metamodeling: regression on data without random errors
Trying to predict true value $ f(\boldsymbol{x}) $ using surrogate model $ g(\boldsymbol{x}) $
Mean square error:
$ MSE(g) = \int_R \left( f(\boldsymbol{x}) - g(\boldsymbol{x}) \right)^2 d\boldsymbol{x} $
where $ R $ is the region in parameter space where the metamodel applies
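As a minimal sketch of how this integral might be approximated numerically in one dimension, the trapezoidal rule can be applied on a grid over the region. The functions `f` and `g` and the interval below are hypothetical stand-ins chosen for illustration; they are not the lecture's example functions.

```matlab
% Hypothetical stand-ins: a "real" function and a cheap cubic surrogate
f = @(x) sin(x);
g = @(x) x - x.^3/6;      % third-order Taylor polynomial of sin(x)

% Assumed region R = [0, 1], discretized on a uniform grid
x = linspace(0, 1, 201);

% Trapezoidal-rule approximation of the integral of (f - g)^2 over R
mse = trapz(x, (f(x) - g(x)).^2);
fprintf('Approximate MSE: %g\n', mse);
```

A finer grid gives a better approximation of the integral; in higher dimensions a grid quickly becomes expensive, which is one motivation for the sampling designs discussed later.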
Example
Example function:
Real function f:
function y = real_function()
% Evaluate the real function f(x) = 2*x*cos(4*pi*x) on a fixed grid
% Define the domain of the real function
x = 0:(pi/32):2*pi;
y = 2*x.*cos(4*pi*x);
end
Surrogate function g:
function y = surrogate_function()
% Evaluate the surrogate g(x), a 9th-order polynomial in (x-0.5), on the same grid
% Define the region in which the function is valid
x = 0:(pi/32):2*pi;
y = 0.9931 + 1.96*(x-0.5) - 76.8838*(x-0.5).^2 - 152.0006*(x-0.5).^3 ...
+ 943.8565*(x-0.5).^4 + 1857.1427*(x-0.5).^5 - 3983.9332*(x-0.5).^6 ...
- 7780.7937*(x-0.5).^7 + 5756.3561*(x-0.5).^8 + 11147.1698*(x-0.5).^9;
end
Comparing the two functions:
And comparing their error:
Mean square error:
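% Evaluate both functions on the same grid
r = real_function;
s = surrogate_function;
% Sum of squared differences over the grid points (a discrete stand-in for the integral)
MSE = sum( (r-s).^2 );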
MSE = 1.6354
Monte Carlo Sampling
Monte Carlo sampling is essentially a brute-force technique: random samples are taken until there is satisfactory confidence that the entire space has been sampled.
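A minimal sketch of plain Monte Carlo sampling on the unit hypercube follows. The sample count `n`, the number of variables `s`, and the cell-counting check with `m` bins per dimension are all assumptions made for illustration; the cell count is only a crude way to judge how well the space has been covered.

```matlab
n = 1000;   % number of random samples (assumed)
s = 2;      % number of input variables (assumed)

% Each row is one sample point drawn uniformly from the unit hypercube [0,1]^s
X = rand(n, s);

% Rough coverage check: count how many of the m^s equal-sized cells contain a sample
m = 5;                                   % bins per dimension (assumed)
idx = min(floor(X * m) + 1, m);          % bin index of each sample in each dimension
cells = unique(idx, 'rows');             % distinct occupied cells
fprintf('%d of %d cells contain at least one sample\n', size(cells, 1), m^s);
```

Nothing in this scheme forces the samples to spread out, so some cells can stay empty while others collect many points.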
Latin Hypercube
Latin hypercube sampling is a way of sampling a space randomly, but in such a way that each dimension of the space is sampled evenly: each dimension is divided into bins, and every bin receives exactly one sample.
For example, in the following figure, one sample falls into each bin of each of the x and y dimensions:
[Figure: ExpDesignLatinHypercube.png]
For a domain divided into $ n $ bins, each bin has an equal marginal probability of $ 1/n $
Algorithm
Purpose: create an experimental design with $ n $ runs (number of samples to be taken), and $ s $ input variables
The result should be a Latin hypercube design that is an $ n \times s $ matrix denoting the variable combinations at which to sample
Step 1: take $ s $ independent random permutations of the integers $ 1 \dots n $, written $ \pi_{j}(1) \dots \pi_{j}(n) $
(note that $ j $ indexes the dimension of the Latin hypercube, $ j=1 \dots s $, and $ n $ is the number of runs or experiments)
Step 2: take $ ns $ random numbers $ U_{k}^{j} $, uniformly distributed on $ [0,1] $, and compute the locations of the Latin hypercube samples as:
$ x_{k}^{j} = \frac{ \pi_{j}(k) - U_{k}^{j} }{ n } $
where $ k = 1 \dots n $ and $ j = 1 \dots s $
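The two steps can be sketched in a few lines of MATLAB. This is an illustrative sketch rather than a reference implementation: the values of `n` and `s` are assumed, and the variable names `P`, `U`, `X`, and `bins` are introduced here for convenience.

```matlab
n = 10;   % number of runs (assumed value)
s = 2;    % number of input variables (assumed value)

% Step 1: s independent random permutations of the integers 1..n, one per column
P = zeros(n, s);
for j = 1:s
    P(:, j) = randperm(n)';
end

% Step 2: n*s uniform random numbers, then the sample locations
U = rand(n, s);
X = (P - U) / n;    % n-by-s design; row k is the k-th run

% Check: in every dimension, each of the n bins should hold exactly one sample
bins = ceil(X * n);
disp(all(all(sort(bins) == repmat((1:n)', 1, s))));
```

Each column of `X` lies in the unit interval; to sample a variable over another range, the column can be rescaled linearly afterwards.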
Variation
One variation is centered Latin hypercube sampling
Each sample location is given by:
$ x_{k}^{j} = \frac{ \pi_{j}(k) - 0.5 }{ n } $
where $ k = 1 \dots n $ indexes which experiment (or run) and $ j = 1 \dots s $ indexes the dimension
(this technique does not require the random numbers $ U_{k}^{j} $; the only randomness is in the permutations)
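A short sketch of the centered variant, under the same assumptions as before (the values of `n` and `s` and the names `P` and `X` are illustrative): every sample sits at the midpoint of its bin.

```matlab
n = 10;   % number of runs (assumed value)
s = 2;    % number of input variables (assumed value)

% s independent random permutations of 1..n, one per column
P = zeros(n, s);
for j = 1:s
    P(:, j) = randperm(n)';
end

% Centered Latin hypercube: place each sample at the midpoint of its bin
X = (P - 0.5) / n;

% Every column contains the same n midpoints, in a different random order
disp(sort(X));
```

Because the locations within each bin are fixed, two centered designs with the same `n` and `s` differ only in how the bins are paired across dimensions.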
Space-Filling
Response surface
More in detail on this