From charlesreid1

Overview

A brief overview of response surface methodology (RSM) is given in the Experimental Design Lecture.

RSM consists of fitting a polynomial surface to a multi-input, multi-output function.

The fitted surfaces take the form of multivariate polynomial models.

Why Polynomials?

Maclaurin and Taylor series expansions of functions: any suitably well-behaved function of a mathematical variable x can be written as an infinite sum of terms involving increasing powers of x

Often, complex functions are treated by writing the polynomial Taylor expansion and truncating all but the first two or three terms
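As a small sketch of this idea (the function and tolerances here are illustrative, not from the text), the truncated Maclaurin series for exp(x) shows how two or three terms already approximate the function well near x = 0:

```python
import math

def exp_taylor(x, n_terms):
    """Truncated Maclaurin series for exp(x): sum of x^k / k! for k < n_terms."""
    return sum(x**k / math.factorial(k) for k in range(n_terms))

x = 0.1
approx2 = exp_taylor(x, 2)   # 1 + x
approx3 = exp_taylor(x, 3)   # 1 + x + x^2/2
exact = math.exp(x)
print(approx2, approx3, exact)
```

The two-term truncation is already within about half a percent of exp(0.1); the three-term truncation is an order of magnitude closer still.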

Why Not Polynomials?

Some functional forms require very high-order polynomials when a much simpler nonlinear function or a transformed parameter space would fit much better.

Another risk is in extrapolating outside of the experimental region: the fitted polynomial only describes the response in a smooth manner over the region where data were obtained. Elsewhere, the behavior of the polynomial may be spurious/non-physical
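This risk is easy to demonstrate (hypothetical data; numpy assumed): a high-order polynomial fit to a smooth function on [0, 1] tracks it well inside that region, but the extrapolated value even modestly outside the data region can be wildly non-physical:

```python
import numpy as np

rng = np.random.default_rng(0)
# Sample a smooth "true" function only on [0, 1]
x = np.linspace(0.0, 1.0, 20)
y = np.sin(2 * np.pi * x) + 0.05 * rng.standard_normal(x.size)

# A high-order polynomial fits well inside the data region...
coeffs = np.polyfit(x, y, 7)
inside_err = np.max(np.abs(np.polyval(coeffs, x) - np.sin(2 * np.pi * x)))

# ...but extrapolating outside [0, 1] produces spurious values
x_out = 1.5
extrap = np.polyval(coeffs, x_out)
true_val = np.sin(2 * np.pi * x_out)
print(inside_err, extrap, true_val)
```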

Other Types of Linear Models

An overview of different linear models is given at the Linear Models page. This describes how RSM fits into the big picture of linear models.

Strategy for Comprehensive Regression Analysis

From Mason Ch. 14:

Plan the data collection effort

Investigate the data, calculate relevant statistics, plot the data

Specify a functional form for each variable and formulate an initial model

Estimate the model parameters, and calculate statistics that quantify the goodness-of-fit

Assess the model, assess the model assumptions, look for things like collinearities or influential observations

Select statistically significant predictor variables

Some thoughts on this comprehensive strategy:

  • It's difficult to plan data collection if you don't already have a model in mind (experimental design), unless you have a very cheap function
  • This strategy seems best suited for univariate models, or cheap experiments/function evaluations
  • When you have an expensive experiment/function, everything really hinges on the form of the assumed model, so the fact that Mason doesn't include model specification until step 3 indicates that this strategy probably doesn't apply well to those situations

An alternative proposal should follow more closely the validation strategy in the NISS paper

Steps 1-3 provide information going into the surrogate model construction process

  • Determine important variables (prior steps)
  • Specify a model form
  • Design experiments to determine function samples
  • Define selection criteria and comparison metrics
  • Calculate comparison metrics, plot data
  • Assess if the model is good enough, assess model assumptions, select if satisfied (otherwise proceed to next step)
  • Specify a new model form (that can incorporate already-gathered information!) and repeat until criteria met

Selecting a Model

In order to select the model that is best for the intended use, two things must be done:

First, figure out what is wanted out of the model (the selection criteria).

Second, figure out how to select the model that is the best for that criteria (the comparison metrics).

Part of the difficulty in defining goals and selection criteria is that multivariate surfaces are very difficult to visualize in higher than 2 dimensions. Various selection criteria, i.e. numerical quantities related to error, curvature, best fit, etc., should be used to determine which surface is the best for the intended use.

Pre-Selection Step: Experimental Design

Before selecting the form of the surrogate model, you must first select your experimental design. Typically the experimental design is selected to regress some particular functional form (e.g. a polynomial).

A form of the model output(s) as a function of the model input(s) is assumed in order to sample the function as few times as possible.

If a Monte Carlo simulation is being run, the cost is very high, but the method is very flexible - any linear model from above may be selected and fit to the data (in this case, it is useful to explore different models of different forms and degrees).

For more information on the experimental design step, see the Experimental Design Lecture.

Polynomial Coefficient Standardization

Standardizing the polynomial coefficients:

Transform the variables x_j to the standardized variables z_j = (x_j - x̄_j) / s_j, where x̄_j is the sample mean and s_j the sample standard deviation of the jth variable

This is desirable because standardized coefficients are easier to interpret and compare
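A minimal sketch of the standardization transform (assuming the usual center-and-scale convention; the data here are hypothetical):

```python
import numpy as np

def standardize(X):
    """Center each column to zero mean and scale to unit sample standard deviation."""
    xbar = X.mean(axis=0)
    s = X.std(axis=0, ddof=1)   # sample standard deviation (n - 1 denominator)
    return (X - xbar) / s

rng = np.random.default_rng(1)
# Two predictors on very different scales (e.g. a temperature and a mass fraction)
X = rng.normal(loc=[100.0, 0.5], scale=[20.0, 0.01], size=(50, 2))
Z = standardize(X)
print(Z.mean(axis=0), Z.std(axis=0, ddof=1))
```

After the transform, coefficients fitted against Z are on a common scale, which is what makes them easier to compare.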

Selection Criteria

The most obvious criterion is minimization of the error e = y - ŷ, where:

y = real function's response

ŷ = surrogate model response

What experimental design is trying to accomplish for simulations and experiments is similar:

  • Simulations are trying to make a complex function evaluation very cheap, without losing too much information
  • Experiments are trying to create a model for a complex physical process

However, the end use is often different:

  • Simulations are trying to determine the values of input parameters that make the model match experimental data
  • Experiments are trying to optimize a process and find minima/maxima of the physical process

Least Squares for Linear Regression

Using least squares for a linear regression model approximates the coefficients of the linear model by minimizing the sum of the squared error residuals (SSE),

SSE = Σ_i (y_i - ŷ_i)²

The estimated coefficients are then called least squares coefficient estimates

To do this, the SSE equation (above) is differentiated with respect to each of the model parameters

All these derivatives are set equal to 0, and these equations are solved simultaneously.
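Setting those derivatives to zero yields the normal equations, which can be solved directly. A sketch (hypothetical data; numpy assumed):

```python
import numpy as np

def least_squares_fit(X, y):
    """Estimate linear-model coefficients by minimizing SSE.

    Setting the derivative of SSE with respect to each coefficient to zero
    gives the normal equations (X^T X) beta = X^T y, solved here directly.
    """
    return np.linalg.solve(X.T @ X, X.T @ y)

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, size=40)
y = 2.0 + 3.0 * x + 0.1 * rng.standard_normal(x.size)

X = np.column_stack([np.ones_like(x), x])   # constant term + predictor
beta = least_squares_fit(X, y)
print(beta)   # approximately [2, 3] for this synthetic data
```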

Interpretation

Cannot necessarily interpret approximated coefficients as "amount of change in the response y for a unit change in the predictor x_j"... This assumes that the coefficients are completely independent

In order to determine how good this assumption is, regress the input/predictor variable x_j on all other variables (i.e. find x_j as a function of all other input/predictor variables + constant)

The residuals r_j from this fit are the part of x_j not explained by the other predictor variables

The coefficient estimate β̂_j measures change in response due to unit change in r_j, not in x_j

If x_j can't be predicted by the other variables, then r_j = x_j - x̄_j (where overline = average value)

In this case, β̂_j can be interpreted in the way specified: i.e. measure of change in y for unit change in x_j

Significance

Coefficients cannot be used by themselves to determine relative significance of various terms in the linear model

To actually do this, you need to use normalized/weighted coefficients

Defined as:

β̂_j* = β̂_j (s_j / s_y)

where

s_j = sqrt( Σ_i (x_ij - x̄_j)² / (n - 1) ) and s_y = sqrt( Σ_i (y_i - ȳ)² / (n - 1) )

The overline indicates a sample mean

Comparison Metrics

Analysis of Variance (ANOVA) Table

Mason ch. 6, 8

Derived in 6.1

For ANOVA of linear regression models, need to define a few quantities

Total sum of squares: TSS = Σ_i (y_i - ȳ)²

Error sum of squares: SSE = Σ_i (y_i - ŷ_i)²

Model sum of squares (regression sum of squares): SSR = TSS - SSE = Σ_i (ŷ_i - ȳ)²
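These three quantities can be sketched directly from their definitions (hypothetical data; numpy assumed). Note that the decomposition TSS = SSE + SSR holds for a least-squares fit that includes an intercept:

```python
import numpy as np

def anova_sums(y, y_hat):
    """Return (TSS, SSE, SSR) for observed responses y and fitted values y_hat."""
    tss = np.sum((y - y.mean()) ** 2)        # total sum of squares
    sse = np.sum((y - y_hat) ** 2)           # error sum of squares
    ssr = np.sum((y_hat - y.mean()) ** 2)    # regression (model) sum of squares
    return tss, sse, ssr

rng = np.random.default_rng(3)
x = rng.uniform(0, 1, 30)
y = 1.0 + 4.0 * x + 0.5 * rng.standard_normal(30)
b1, b0 = np.polyfit(x, y, 1)

tss, sse, ssr = anova_sums(y, b0 + b1 * x)
print(tss, sse, ssr)   # TSS = SSE + SSR for a least-squares fit with intercept
```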

Univariate Linear Model

A sample ANOVA table constructed for a linear univariate function is:

Source of variation   df     Sum of squares   Mean squares        F-value       p-value
Regression            1      SSR              MSR = SSR/1         F = MSR/MSE   obtained from the F-statistic
Error                 n-2    SSE              MSE = SSE/(n-2)
Total                 n-1    TSS
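The table entries can be computed directly for a univariate fit (hypothetical data; numpy and scipy assumed; the p-value is the upper-tail probability of the F distribution with 1 and n-2 degrees of freedom):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
x = rng.uniform(0, 1, 30)
y = 1.0 + 4.0 * x + 0.5 * rng.standard_normal(30)

b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x

n = len(y)
sse = np.sum((y - y_hat) ** 2)
ssr = np.sum((y_hat - y.mean()) ** 2)
msr = ssr / 1                # regression mean square (1 df)
mse = sse / (n - 2)          # error mean square (n - 2 df)
f_value = msr / mse
p_value = stats.f.sf(f_value, 1, n - 2)   # upper-tail F probability
print(f_value, p_value)
```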

Multiple Linear Model

Important not just to assess overall fit of prediction equation

Also important to assess contribution of individual predictor variables to the fit

Many commonly-reported measures of model fitness are part of ANOVA table

Multivariate models: p degrees of freedom for sum of squares due to regression (because p coefficients must be estimated to obtain the regression sum of squares)

A sample ANOVA table constructed for multiple linear function is:

Source of variation   df       Sum of squares   Mean squares          F-value (F-statistic)
Regression            p        SSR              MSR = SSR/p           F = MSR/MSE
Error                 n-p-1    SSE              MSE = SSE/(n-p-1)
Total                 n-1      TSS

p = number of predictor variables

Measure of adequacy of fitted model: error standard deviation s_e

s_e = sqrt( MSE ) = sqrt( SSE / (n - p - 1) )

small s_e: predicted responses closely approximate observed responses

large s_e: large random error, or poor selection of model form



F-Statistic

I found this short YouTube video very helpful for illustrating what the F-statistic means physically: http://www.youtube.com/watch?v=TMwSS8DAVYk

The F-statistic can be thought of as a frequentist metric for hypothesis-testing. Once an F-statistic and corresponding p-value is calculated from the ANOVA table quantities, you can determine how confident you can be in a given hypothesis test (where the hypothesis is the model onto which you've chosen to regress your data).

Mason: Different from tests for significance of factor effects in analysis of designed experiments

Mason: example of acid-content data...


For multiple linear regression/model, can use F-statistic to simultaneously test the hypothesis that all coefficients are zero versus the alternative, that at least one is not zero

i.e. H_0: β_1 = β_2 = ... = β_p = 0 versus H_a: β_j ≠ 0 for at least one j

(while this seems silly, it's much more useful if you're doing this for a subset of coefficients - i.e. testing the hypothesis of whether any of a subset of coefficients should be non-zero)

Lack of Fit F-Statistic

For (deterministic) computer simulations (rather than experiments, which have random error), error is entirely due to lack of fit - not due to random error in measurements/samples

In this case, an F-statistic specifically for lack-of-fit is not possible to calculate (defined as the ratio of the lack-of-fit mean square to the pure-error mean square, F = MS_LOF / MS_PE, and MS_PE = 0 for deterministic functions)

Partial F-Test (Determination of Term Significance)

Consider full regression model with p predictor variables

Now consider a reduced regression model with p - k predictor variables

Full model = ŷ = β̂_0 + β̂_1 x_1 + ... + β̂_p x_p

Reduced model = ŷ = β̂_0 + β̂_1 x_1 + ... + β̂_{p-k} x_{p-k}

Reduction in error sum of squares resulting from fit of additional terms in full model: SSE_reduced - SSE_full ≥ 0;

more predictor variables can only decrease (or leave unchanged) the error sum of squares

F-statistic for determining statistical significance of this subset is:

F = [ (SSE_reduced - SSE_full) / k ] / MSE_full

where

MSE_full = SSE_full / (n - p - 1)

If k = 1 then the F-statistic is the square of the t-statistic from the full model corresponding to the term left out of the reduced model

To determine if F-statistic is highly significant, use p value (95% likelihood of being significant if p < 0.05, 99% likelihood if p < 0.01)

Example: use this test to determine if an interaction effect is important to the surrogate model

Using this procedure and using the t-statistic to test the significance of a given term are equivalent!
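The partial F-test above can be sketched as follows (hypothetical data in which the x1*x2 interaction genuinely matters; numpy and scipy assumed):

```python
import numpy as np
from scipy import stats

def partial_f_test(X_full, X_reduced, y):
    """Partial F-test: F = [(SSE_reduced - SSE_full) / k] / MSE_full,
    where k is the number of terms dropped from the full model."""
    def sse(X):
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        return np.sum((y - X @ beta) ** 2)

    n = len(y)
    p_full = X_full.shape[1] - 1              # predictors in full model (minus constant)
    k = X_full.shape[1] - X_reduced.shape[1]  # terms left out of the reduced model
    sse_f, sse_r = sse(X_full), sse(X_reduced)
    mse_f = sse_f / (n - p_full - 1)
    f = ((sse_r - sse_f) / k) / mse_f
    return f, stats.f.sf(f, k, n - p_full - 1)

rng = np.random.default_rng(5)
x1, x2 = rng.uniform(0, 1, (2, 60))
# True model includes an interaction term, so the test should flag it
y = 1.0 + 2.0 * x1 + 3.0 * x2 + 4.0 * x1 * x2 + 0.1 * rng.standard_normal(60)

ones = np.ones_like(x1)
X_reduced = np.column_stack([ones, x1, x2])
X_full = np.column_stack([ones, x1, x2, x1 * x2])
f, p = partial_f_test(X_full, X_reduced, y)
print(f, p)
```

Since k = 1 here, this F value is the square of the t-statistic on the interaction coefficient in the full model.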

T-statistic

Linear Univariate Model

A t-statistic can be constructed to test H_0: β_1 = c vs. H_a: β_1 ≠ c

The following statistic has a Student t-distribution with n-2 degrees of freedom:

t = (β̂_1 - c) / ( s / sqrt(S_xx) ), where S_xx = Σ_i (x_i - x̄)² and s is the model standard deviation

Insert c = 0 to test whether the slope is zero

If you insert c = 0 and square the result you get the F-statistic from the ANOVA table

This t-variate can be used to form confidence intervals on the slope parameter

Following Chapter 2.4: 100(1-α)% limits for β_1 are

β̂_1 ± t_{α/2} s / sqrt(S_xx)

where

t_{α/2} is a 100(1-α/2)% upper-tail t critical value with n-2 degrees of freedom

Small model standard deviations will lead to small confidence intervals

For the intercept parameter β_0, use the following t-variate:

t = (β̂_0 - c) / ( s sqrt( 1/n + x̄²/S_xx ) )
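A sketch of the slope t-statistic and confidence interval (hypothetical data; numpy and scipy assumed):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
x = rng.uniform(0, 1, 25)
y = 0.5 + 2.0 * x + 0.2 * rng.standard_normal(25)

n = len(x)
b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x
s = np.sqrt(np.sum((y - y_hat) ** 2) / (n - 2))   # model standard deviation
sxx = np.sum((x - x.mean()) ** 2)

# t-statistic for H0: beta_1 = c (here c = 0)
c = 0.0
t_stat = (b1 - c) / (s / np.sqrt(sxx))

# 95% confidence interval for the slope
t_crit = stats.t.ppf(0.975, n - 2)
ci = (b1 - t_crit * s / np.sqrt(sxx), b1 + t_crit * s / np.sqrt(sxx))
print(t_stat, ci)
```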

Multiple Linear Model

Testing hypotheses on individual regression coefficients is of primary interest to someone performing regression analysis

t-statistic can be constructed to test H_0: β_j = c versus H_a: β_j ≠ c

Test statistic used for this purpose is:

t = (β̂_j - c) / SE(β̂_j), with SE(β̂_j) = s_e / sqrt( (n-1) s_j² (1 - R_j²) )

where

s_e = estimated error standard deviation

s_j² = sample variance of the n values of the jth predictor variable

R_j² = coefficient of determination for regression of x_j on the constant term and the other predictor variables

Using t-statistics with c = 0 can be used to test statistical significance of individual model parameters (usefulness of x_j as predictor of response variable)

NOTE: this test is only conditional, since β̂_j is a partial regression coefficient, and the t-statistics are functions of the other predictor variable values

Only determines significance of jth predictor variable conditional on the presence of the other predictor variables

e.g. "Each individual predictor variable contributes significantly to the given fits, given that the other two predictor variables are also included in the model"
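A sketch of this conditional t-test, computing SE(β̂_j) through the (n-1) s_j² (1 - R_j²) form (hypothetical data with correlated predictors; numpy and scipy assumed):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n = 50
x1 = rng.uniform(0, 1, n)
x2 = 0.5 * x1 + 0.5 * rng.uniform(0, 1, n)   # x2 is correlated with x1
y = 1.0 + 2.0 * x1 + 0.3 * rng.standard_normal(n)

X = np.column_stack([np.ones(n), x1, x2])
p = 2
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
s_e = np.sqrt(np.sum(resid ** 2) / (n - p - 1))   # error standard deviation

def r_squared_on_others(xj, others):
    """R^2 from regressing predictor xj on a constant plus the other predictors."""
    Z = np.column_stack([np.ones(len(xj))] + others)
    g, *_ = np.linalg.lstsq(Z, xj, rcond=None)
    r = xj - Z @ g
    return 1 - np.sum(r ** 2) / np.sum((xj - xj.mean()) ** 2)

s1_sq = np.var(x1, ddof=1)
r1_sq = r_squared_on_others(x1, [x2])
se_b1 = s_e / np.sqrt((n - 1) * s1_sq * (1 - r1_sq))

t_stat = (beta[1] - 0.0) / se_b1   # test H0: beta_1 = 0
p_val = 2 * stats.t.sf(abs(t_stat), n - p - 1)
print(t_stat, p_val)
```

The (1 - R_j²) factor is what makes the test conditional: the more x_j can be predicted by the other variables, the larger its standard error.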

Response Confidence Intervals

Want confidence intervals for the response model

Linear Univariate Models

Confidence interval constructed for the response model ŷ = β̂_0 + β̂_1 x:

Mason: for fixed values of x...???

The predicted response ŷ has a normal distribution (given certain assumptions)

Thus mean and deviation given by

E[ŷ] = β_0 + β_1 x and sd(ŷ) = σ sqrt( 1/n + (x - x̄)²/S_xx )

where S_xx = Σ_i (x_i - x̄)²

And the following t-variate can be used to construct confidence intervals for the mean response:

t = (ŷ - E[ŷ]) / ( s sqrt( 1/n + (x - x̄)²/S_xx ) )

To form prediction interval for actual future response, not expected value of a response:

Use this equation again, but replace ŷ with y_f, and sd(ŷ) with sd(y_f - ŷ_f)

y_f = future response

ŷ_f = predicted value of future response

The extra term appears because the future response has an added variability σ² on top of the variability of the prediction

Standard deviation of y_f - ŷ_f is:

σ sqrt( 1 + 1/n + (x - x̄)²/S_xx )
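The two intervals can be sketched side by side (hypothetical data; numpy and scipy assumed); the prediction interval carries the extra "1 +" term and is always wider:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
x = rng.uniform(0, 10, 30)
y = 1.0 + 0.5 * x + 0.4 * rng.standard_normal(30)

n = len(x)
b1, b0 = np.polyfit(x, y, 1)
s = np.sqrt(np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2))
sxx = np.sum((x - x.mean()) ** 2)
t_crit = stats.t.ppf(0.975, n - 2)

x0 = 5.0
y0_hat = b0 + b1 * x0

# Confidence interval half-width for the mean response at x0
half_ci = t_crit * s * np.sqrt(1 / n + (x0 - x.mean()) ** 2 / sxx)
# Prediction interval half-width for a single future response at x0 (extra "1 +" term)
half_pi = t_crit * s * np.sqrt(1 + 1 / n + (x0 - x.mean()) ** 2 / sxx)

print((y0_hat - half_ci, y0_hat + half_ci))
print((y0_hat - half_pi, y0_hat + half_pi))
```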

Multiple Linear Model

Confidence interval for regression coefficients of multiple linear model:

a 100(1-α)% confidence interval for β_j is given by

β̂_j ± t_{α/2} SE(β̂_j)

where t_{α/2} is a two-tailed t critical value having n - p - 1 degrees of freedom

Simultaneous confidence intervals for all coefficients in multiple linear regression model cannot be computed using individual coefficient intervals

They ignore systematic variation of predictor variables and consequent correlation among coefficient estimators


Correlation Coefficient (R-Squared)

Measure of correlation between observed and predicted responses

Univariate Linear Model

For the univariate linear model, R² is the square of the sample correlation coefficient between x and y

Multiple Linear Model

R-squared can be calculated as:

R² = SSR / TSS = 1 - SSE / TSS

It should be acknowledged that as the number of predictor variables approaches the number of observations, this can become arbitrarily close to 1

if p = n - 1 then R² = 1

Adjusted R², denoted R²_adj:

R²_adj = 1 - (1 - R²) (n - 1) / (n - p - 1)

Differences between R² and R²_adj are minor except for when n and p are close

Caution should be used in relying on a single measure of fit (e.g. R² alone)
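Both quantities follow directly from the sums of squares (hypothetical data; numpy assumed):

```python
import numpy as np

def r_squared(y, y_hat, p):
    """Return (R^2, adjusted R^2) for a fit with p predictor variables."""
    n = len(y)
    sse = np.sum((y - y_hat) ** 2)
    tss = np.sum((y - np.mean(y)) ** 2)
    r2 = 1 - sse / tss
    r2_adj = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    return r2, r2_adj

rng = np.random.default_rng(9)
x = rng.uniform(0, 1, 40)
y = 3.0 * x + 0.3 * rng.standard_normal(40)
b1, b0 = np.polyfit(x, y, 1)
r2, r2_adj = r_squared(y, b0 + b1 * x, p=1)
print(r2, r2_adj)   # adjusted R^2 is never larger than R^2
```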


Contour Plots

Contour plots can be used to determine sensitivities: if the response changes significantly in one parameter direction, it is sensitive to that parameter. If the contour shows a structure that is uniform in one parameter direction, the response is not sensitive to that parameter.

For multiple responses, a contour plot for each response can be made, infeasible regions shaded gray, and the plots overlaid to yield the feasible region.


Determination of Outlier Data Points

Mason Ch. 18

Various tests for outliers (p. 629, 631, etc.)

Tests for response variable outliers

Tests for predictor variable outliers

Other Things to Look At

  • Correlation between input variables
    • e.g. for time and temperature:

Issues

Importance of Interaction Terms

A question arises as to whether interaction terms are significant

Often, in fractional factorial designs, these interaction terms are ignored

From Mason:

"Ordinarily, one should not routinely insert products of all the predictors in a regression model. To do so might create unnecessary complications in the analysis and interpretation of the fitted models due to collinear predictor variables. The purpose of including interaction terms in regression models is to improve the fit either because theoretical considerations require a modeling of the joint effects or because an analysis of the regression data indicates that joint effects are needed in addition to the linear terms of the individual variables."


Dealing with Multimodal Variables

Sometimes, when constructing response surfaces, modal variables appear. Modal variables are variables that have multiple modes, or distinct sets of values. There are two variations of modal variables:

1 uncertainty range (sampled with N parameter values)

These types of modal variables have a single range of uncertainty assigned to them, but the values within that range of uncertainty are discrete. In order to sample the parameter within the range of uncertainty, the parameter must be sampled at distinct, discrete values.

For example, if I am using the discrete ordinates model (DOM) for radiation calculations, the DOM requires a number of ordinate directions. This is a discrete value with distinct sets of values - e.g. 3, 6, 8, 24, etc.

Each discrete value in this case composes a single range of uncertainty. Using the DOM example, that range of uncertainty would be the set of allowable ordinate counts, e.g. {3, 6, 8, 24, ...}.

N uncertainty ranges

The other type of modal variables have several ranges of uncertainty assigned to them, with no restriction on values within that range of uncertainty being discrete or distinct. Essentially this can be thought of as a bimodal uncertainty distribution, where the two modes are distinct. Each mode can be sampled as usual, the only sticking point is that there is more than 1, and that they are distinct.

This case provides an excellent example. The variable is a modal variable - the two modes are 1.0 and 2.0 - but each mode also has its own range of uncertainty.

How to Deal

Multimodal variables can be dealt with in two ways:

Method 1: Separate Response Surfaces for Each Mode

The first way is to create a separate response surface for each distinct mode. This method works for both types of modal variables (1 uncertainty range represented by N distinct values, and N uncertainty ranges). This method is illustrated in the figures below. Each distinct mode (gray region) has its own computed response surface (blue dotted line), distinct from the response surface of the other modes.

Of course, if the variable type is 1 uncertainty range represented by N distinct values, then there is no uncertainty range for each mode, and each gray region is, in fact, a delta function. As mentioned above, this means that the input variable is eliminated as a response surface parameter.

If the variable type is N uncertainty ranges, then each uncertainty range is sampled as usual, and each response surface is constructed as usual.

ModalResponses1 true.png - An example of a "true" response, which is unknown to the modeler.
ModalResponses2 modes.png - The modeler is only interested in distinct regions of the input parameter (shown in gray). The remaining regions are left out of the response surface.
ModalResponses3 modalresponses.png - The response surfaces actually obtained by the user (blue dotted line). There is a separate response surface obtained by the user (2 distinct blue lines) for each mode (gray region).

Method 2: Single Response Surface (Ignore Modes)

A second way is to create a single response surface. This is typically only possible with the "N uncertainty ranges" type of problem, because the parameter value is continuous, but only certain regions of it are of interest. This approach is illustrated below.

Essentially, this approach does away with any special treatment of modes.

ModalResponses1 true.png - (see above, image repeated for clarity)
ModalResponses2 modes.png - (see above, image repeated for clarity)
ModalResponses4 fullresponse.png - An example of the second approach, in which the modeler constructs a single response surface, essentially ignoring the modes of the input parameter.

Analysis of Results and Construction of Response Surface

The ultimate reason for sampling the function is to construct a response surface, and in order to construct a response surface, some kind of generalized linear model will have to be used.

NOTE: there is a more general discussion of experiment design, of which the response surface methodology is only one of several, at the following page:

Experimental Design Lecture

See Also