Response Surface Methodology: Difference between revisions
From charlesreid1
Revision as of 19:45, 30 June 2011
Overview
A brief overview of response surface methodology (RSM) is given in the Experimental Design Lecture.
RSM basically consists of fitting a polynomial surface to a multi-input, multi-output function,
$ \boldsymbol{y} = f(\boldsymbol{x}) $
Linear Model Classification
It is useful to describe various classifications of linear models to better understand how RSM fits into the "big picture".
A very helpful guide, given by Matlab, that describes and illustrates various regression analysis techniques: http://www.mathworks.com/help/toolbox/stats/bq_676m-2.html
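Generalized Linear Models
"Generalized linear models" extend ordinary linear regression to responses whose errors need not be Gaussian: a link function relates the mean of the response to a linear combination of the inputs, and the response distribution is taken from the exponential family (e.g. normal, binomial, Poisson). The similarly named "general linear model" is the Gaussian-error case, which can account for arbitrary numbers of inputs and outputs. Both are statistical models used for data analysis.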
General linear model information:
- http://en.wikipedia.org/wiki/Multivariate_regression_model
- Nelder and Wedderburn (1972)
Matlab functions:
- glmfit
- http://www.mathworks.com/help/toolbox/stats/glmfit.html
- Carries out the regression
- glmval
- http://www.mathworks.com/help/toolbox/stats/glmval.html
- Evaluates the value of a generalized linear model
Multiple Linear Regression
"Multiple linear regression" is a model for one response variable ("y"), and multiple predictor variables ("X").
Linear regression information:
- http://en.wikipedia.org/wiki/Linear_regression
- http://www.mathworks.com/help/toolbox/stats/mvregress.html
- Mason Ch. 15
Generic multiple linear regression model looks like:
$ y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_{k} x_{ik} + e_i \quad i=1 \dots n $
for $ n $ observations (each response treated independently). Here, $ y_i $ is the $ i^{th} $ response from the "real" (i.e. unknown) process, which is assumed to be represented exactly by the polynomial above, and $ e_i $ is the error in that response.
The goal is to build a surrogate model that approximates the above polynomial as closely as possible.
This "real", unknown polynomial can also be written (similar to ANOVA model):
$ y_i = \mu_i + e_i $
where
$ \mu_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_{k} x_{ik} $
If each of the coefficients $ \beta_j $ is estimated by $ \hat{\beta}_j $ (using linear algebra, e.g. least squares), the mean can be approximated:
$ \hat{\mu}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{i1} + \hat{\beta}_2 x_{i2} + \dots + \hat{\beta}_{k} x_{ik} $
and using that, the "real" polynomial responses $ y_i $ can be approximated with surrogate (or predicted) polynomial responses $ \hat{y}_i $:
$ \hat{y}_i = \hat{\mu}_i $
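The same model can be written compactly in matrix form, which is how the least squares estimate is usually stated. The bold symbols are notation introduced here, not taken from the sources above: $ \boldsymbol{X} $ is the $ n \times (k+1) $ matrix whose $ i^{th} $ row is $ (1, x_{i1}, \dots, x_{ik}) $, and it is assumed to have full column rank.
$ \boldsymbol{y} = \boldsymbol{X} \boldsymbol{\beta} + \boldsymbol{e} $
$ \hat{\boldsymbol{\beta}} = (\boldsymbol{X}^T \boldsymbol{X})^{-1} \boldsymbol{X}^T \boldsymbol{y} $
$ \hat{\boldsymbol{y}} = \boldsymbol{X} \hat{\boldsymbol{\beta}} $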
Multivariate Linear Regression
"Multivariate linear regression" broadens multiple linear regression to account for more than one response variable ("Y").
Multivariate regression/analysis information:
Polynomial Models (Univariate)
Polynomial models can be used to fit a univariate function of a single input parameter, e.g. $ y(x) $.
This can be done using the following Matlab functions:
- polyfit
- http://www.mathworks.com/help/techdoc/ref/polyfit.html
- Fits a polynomial of a given degree to a set of inputs x and outputs y
- polyval
- http://www.mathworks.com/help/techdoc/ref/polyval.html
- Evaluates the value of a given polynomial model at given input variable value or values
- polyconf
- http://www.mathworks.com/help/toolbox/stats/polyconf.html
- Can be used to construct confidence intervals for polynomial models
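For readers without Matlab, the fit-then-evaluate workflow can be sketched in plain Python. This is only an illustration under stated assumptions: `poly_fit`, `poly_val`, and `solve_linear_system` are hypothetical helper names, the fit solves the normal equations directly (chosen for brevity, and not necessarily what polyfit does internally), and the data are made up. Coefficients are ordered highest power first, matching the polyfit convention.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes a is square and non-singular.
    """
    n = len(b)
    # Augmented matrix [a | b] so row operations act on both sides.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        # Pivot on the largest remaining entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def poly_fit(xs, ys, degree):
    """Least-squares polynomial coefficients, highest power first."""
    # Design matrix with columns x^degree, ..., x^1, x^0.
    rows = [[x ** p for p in range(degree, -1, -1)] for x in xs]
    ncol = degree + 1
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(ncol)]
           for i in range(ncol)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(ncol)]
    return solve_linear_system(xtx, xty)


def poly_val(coeffs, x):
    """Evaluate the polynomial at x using Horner's rule."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result


# Made-up data sampled from y = 3 x^2 + 2 x + 1 with no noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x + 3.0 * x ** 2 for x in xs]
coeffs = poly_fit(xs, ys, 2)
```

Because the sample data are noise-free, the fitted coefficients should recover the generating polynomial to within round-off; with noisy data the same calls return the least-squares compromise instead.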
Response Surface Models (Multivariate)
Response surfaces may be created in a number of different ways:
- regress
- regstats
- SurfaceFit (Statistics toolbox)
- others???
Selecting a Model
In order to select the model that is best for the intended use, two things must be done:
First, figure out what is wanted out of the model (the selection criteria).
Second, figure out how to select the model that is best for those criteria (the comparison metrics).
Part of the difficulty in defining goals and selection criteria is that multivariate surfaces are very difficult to visualize in more than 2 dimensions. Various selection criteria, i.e. numerical quantities related to error, curvature, best fit, etc., should be used to determine which surface is the best for the intended use.
Pre-Selection Step: Experimental Design
Before selecting the form of the surrogate model, you must first select your experimental design. Typically the experimental design is selected to regress some particular functional form (e.g. a polynomial).
A form of the model output(s) as a function of the model input(s) is assumed so that the function can be sampled as few times as possible.
If a Monte Carlo simulation is being run, the cost is very high, but the method is very flexible - any linear model from above may be selected and fit to the data (in this case, it is useful to explore different models of different forms and degrees).
For more information on the experimental design step, see the Experimental Design Lecture.
Selection Criteria
The most obvious criterion is minimization of the error $ y - \hat{y} $, where
$ y $ = real response
$ \hat{y} $ = surrogate response
What experimental design is trying to accomplish for simulations and experiments is similar:
- Simulations are trying to make a complex function evaluation very cheap, without losing too much information
- Experiments are trying to create a model for a complex physical process
However, the end use is often different:
- Simulations are trying to determine the values of input parameters that make the model match experimental data
- Experiments are trying to optimize a process and find minima/maxima of the physical process
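The error between the real response and the surrogate response has to be reduced to a few numbers before candidate models can be compared. The following Python sketch shows one way to do that; the particular summaries (sum of squared errors, root-mean-square error, largest absolute error, and the coefficient of determination) are common choices assumed here, not a list prescribed by this page, and `fit_metrics` and the sample values are hypothetical.

```python
import math


def fit_metrics(y, y_hat):
    """Summarize the error y - y_hat between real and surrogate responses."""
    n = len(y)
    residuals = [yi - yh for yi, yh in zip(y, y_hat)]
    sse = sum(r * r for r in residuals)            # sum of squared errors
    y_mean = sum(y) / n
    sst = sum((yi - y_mean) ** 2 for yi in y)      # total sum of squares
    return {
        "sse": sse,
        "rmse": math.sqrt(sse / n),                # root-mean-square error
        "max_abs": max(abs(r) for r in residuals), # worst-case error
        "r2": 1.0 - sse / sst,                     # coefficient of determination
    }


# Made-up real and surrogate responses at four sample points.
y = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.1, 1.9, 3.2, 3.8]
metrics = fit_metrics(y, y_hat)
```

Different summaries reward different things: the root-mean-square error reflects typical accuracy, while the largest absolute error flags a surrogate that is poor in one region even if it is good on average.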
Comparison Metrics
Analysis of Variance (ANOVA) Table
Contour Plots
Contour plots can be used to determine sensitivities: if the response $ y $ changes significantly in one parameter direction, it is sensitive to that parameter. If the contour shows a structure that is uniform in one parameter direction, the response is not sensitive to that parameter.
For multiple responses, a contour plot for each response can be made, infeasible regions shaded gray, and the plots overlaid to yield the feasible region.
Other Things to Look At
- Correlation between input variables
- e.g. for time and temperature: $ \{t, T, tT, t^2, T^2\} $
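Correlation between the model terms can be checked directly from the design points. The Python sketch below computes the Pearson correlation between each pair of terms for time and temperature; the 3x3 factorial design, its levels, and the helper name `pearson` are assumptions made up for illustration.

```python
import math


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


# Hypothetical 3x3 factorial design in time t and temperature T.
t_levels = [1.0, 2.0, 3.0]
T_levels = [300.0, 350.0, 400.0]
runs = [(t, T) for t in t_levels for T in T_levels]

# Columns of the design matrix, one per model term.
terms = {
    "t": [t for t, T in runs],
    "T": [T for t, T in runs],
    "tT": [t * T for t, T in runs],
    "t^2": [t * t for t, T in runs],
    "T^2": [T * T for t, T in runs],
}

names = list(terms)
corr = {(a, b): pearson(terms[a], terms[b]) for a in names for b in names}
```

In this made-up design the two inputs are uncorrelated by construction, but a term such as t^2 is strongly correlated with t, which is the kind of relationship worth checking before interpreting individual coefficients.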
Issues
Dealing with Multimodal Variables
Sometimes, when constructing response surfaces, modal variables appear. Modal variables are variables that have multiple modes, or distinct sets of values. There are two variations of modal variables:
1 uncertainty range (sampled with N parameter values)
These types of modal variables have a single range of uncertainty assigned to them, but the values within that range of uncertainty are discrete. In order to sample the parameter within the range of uncertainty, the parameter must be sampled at distinct, discrete values.
For example, if I am using the discrete ordinates model (DOM) for radiation calculations, the DOM requires a number of ordinate directions. This number is a discrete quantity that can only take certain distinct values - e.g. 3, 6, 8, 24, etc.
Taken together, the discrete values in this case span a single range of uncertainty. Using the DOM example, that range of uncertainty would be $ [3, 24] $.
N uncertainty ranges
The other type of modal variable has several ranges of uncertainty assigned to it, with no requirement that the values within each range be discrete. Essentially this can be thought of as a multimodal (in the simplest case, bimodal) uncertainty distribution in which the modes are distinct. Each mode can be sampled as usual; the only sticking point is that there is more than one mode, and that the modes are distinct.
This case provides an excellent example. The variable $ \dot{m} $ is a modal variable - the two modes are 1.0 and 2.0 - but each mode also has a range of uncertainty, namely $ 5\% $ each.
How to Deal
Multimodal variables can be dealt with in two ways:
Method 1: Separate Response Surfaces for Each Mode
The first way is to create a separate response surface for each distinct mode. This method works for both types of modal variables (1 uncertainty range represented by N distinct values, and N uncertainty ranges). This method is illustrated in the figures below. Each distinct mode (gray region) has its own computed response surface (blue dotted line), distinct from the response surface of the other modes.
Of course, if the variable type is 1 uncertainty range represented by N distinct values, then there is no uncertainty range for each mode, and each gray region is, in fact, a delta function. As mentioned above, this means that the input variable is eliminated as a response surface parameter.
If the variable type is N uncertainty ranges, then each uncertainty range is sampled as usual, and each response surface is constructed as usual.
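Method 1 for the N uncertainty ranges case can be sketched in Python as follows. Several things here are assumptions for illustration only: the two modes 1.0 and 2.0 are borrowed from the mass flow example above, the 5% uncertainty is interpreted as plus or minus 5% about each mode, `toy_response` is a hypothetical stand-in for the real simulation or experiment, and a straight line is used as the simplest possible response surface.

```python
def fit_line(xs, ys):
    """Least-squares straight line y = b0 + b1 * x for a single input."""
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b1 = sxy / sxx
    return y_mean - b1 * x_mean, b1  # (intercept, slope)


def sample_range(center, rel_width, count):
    """Evenly spaced samples over center * (1 +/- rel_width)."""
    lo, hi = center * (1 - rel_width), center * (1 + rel_width)
    step = (hi - lo) / (count - 1)
    return [lo + i * step for i in range(count)]


def toy_response(m_dot):
    # Hypothetical stand-in for the expensive function being sampled.
    return m_dot ** 2


modes = [1.0, 2.0]
surfaces = {}
for mode in modes:
    # Sample only inside this mode's uncertainty range ...
    xs = sample_range(mode, 0.05, 5)
    ys = [toy_response(x) for x in xs]
    # ... and fit a response surface that is valid only for this mode.
    surfaces[mode] = fit_line(xs, ys)
```

Each entry of `surfaces` is valid only inside its own mode's range; evaluating it between the modes would be extrapolation, which is the trade-off against the single-surface approach.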
Method 2: Single Response Surface (Ignore Modes)
A second way is to create a single response surface. This is typically only possible for problems of the N uncertainty ranges type, because there the parameter value is continuous and only certain regions of it are of interest. This approach is illustrated below.
Essentially, this approach does away with any special treatment of modes.
Analysis of Results and Construction of Response Surface
The ultimate reason for sampling the function is to construct a response surface, and in order to construct a response surface, some kind of generalized linear model will have to be used.
NOTE: there is a more general discussion of experiment design, of which the response surface methodology is only one of several, at the following page: