Overview

Composite experimental design refers to sampling parameter space in successive stages, chosen so that a first- or second-order polynomial function can be constructed from the samples.

Explanation

Setting Up the Whole Design

1. Select 5 (or 3) levels for each variable. Code each level with a numerical value, typically in the range $[-1,1]$ (but the range can also be, e.g., $[-2,2]$ ; see Box and Draper 1987).

2. Create variable transforms to translate between the coded levels and the actual input parameter values (see Variable Transforms below).

3. Create the full composite design matrix.

4. Extract the full factorial matrix from the full composite design matrix.

5. Extract the fractional factorial matrix from the full composite design matrix.

6. Extract the one-factor-at-a-time matrix from the full composite design matrix.

7. Sample the function in the following order:

One factor at a time
Fractional factorial
Full factorial
Full composite
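The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the Matlab code used for this article: the function names are hypothetical, the half-fraction generator (last factor equals the product of the others) is an assumption, and so is the choice to treat the one-factor-at-a-time points as the center point plus the axial points at $\pm 2$.

```python
from itertools import product

def full_factorial(k):
    """All 2**k corner points, coded levels -1 and +1."""
    return [list(p) for p in product((-1, 1), repeat=k)]

def half_fraction(k):
    """2**(k-1) corner points; the last factor is the product of the others.

    The generator x_k = x_1 * ... * x_(k-1) is an assumption made here.
    """
    rows = []
    for p in product((-1, 1), repeat=k - 1):
        last = 1
        for v in p:
            last *= v
        rows.append(list(p) + [last])
    return rows

def one_factor_at_a_time(k, alpha=2):
    """Center point, then each factor moved alone to -alpha and +alpha."""
    rows = [[0] * k]
    for i in range(k):
        for a in (-alpha, alpha):
            row = [0] * k
            row[i] = a
            rows.append(row)
    return rows

def composite(k, alpha=2, fractional=True):
    """Axial and center points followed by the (fractional) factorial corners."""
    corners = half_fraction(k) if fractional else full_factorial(k)
    return one_factor_at_a_time(k, alpha) + corners

design = composite(6)
print(len(design))  # 45 = 1 center + 12 axial + 32 corners
print(sorted({v for row in design for v in row}))  # [-2, -1, 0, 1, 2]
```

With 6 variables and a half fraction this gives 45 runs and 5 coded levels per variable. That count is consistent with the statistics summaries later in this article, where the number of polynomial terms plus the MSE degrees of freedom is 45 in every case, but the exact design used there is not confirmed by this sketch.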

How Many Levels?

The question of whether to choose 3 or 5 levels depends entirely on the case.

Typically, 3-level designs are chosen for experiments where multiple levels create difficulty in experimental setup. In this case, the minimum number of levels is desirable.

However, in simulations, 5-level designs are best, because there is no significant effort on the part of the user when running with a large number of levels.

Variable Transforms

For a variable $x_{i}$ with range $\alpha _{i}\leq x_{i}\leq \beta _{i}$ ,

the transformed variable ${\hat {x}}_{i}$ has the range $-1\leq {\hat {x}}_{i}\leq +1$ for factorial design

the transformed variable ${\hat {x}}_{i}$ has the range $-2\leq {\hat {x}}_{i}\leq +2$ for composite design

Linear Variables

To transform a linear variable $x_{i}$ to the variable ${\hat {x}}_{i}\in [-1,+1]$ :

${\hat {x}}_{i}={\frac {x_{i}-\left({\frac {\beta _{i}-\alpha _{i}}{2}}+\alpha _{i}\right)}{\frac {\beta _{i}-\alpha _{i}}{2}}}$

To transform a linear variable $x_{i}$ to the variable ${\hat {x}}_{i}\in [-2,+2]$ :

${\hat {x}}_{i}={\frac {x_{i}-\left({\frac {\beta _{i}-\alpha _{i}}{2}}+\alpha _{i}\right)}{\frac {\beta _{i}-\alpha _{i}}{4}}}$

Log Variables

To transform a log variable $x_{i}$ to the variable ${\hat {x}}_{i}\in [-1,+1]$ :

${\hat {x}}_{i}={\frac {\log {(x_{i})}-\left({\frac {\log {(\beta _{i})}-\log {(\alpha _{i})}}{2}}+\log {(\alpha _{i})}\right)}{\frac {\log {(\beta _{i})}-\log {(\alpha _{i})}}{2}}}$

To transform a log variable $x_{i}$ to the variable ${\hat {x}}_{i}\in [-2,+2]$ :

${\hat {x}}_{i}={\frac {\log {(x_{i})}-\left({\frac {\log {(\beta _{i})}-\log {(\alpha _{i})}}{2}}+\log {(\alpha _{i})}\right)}{\frac {\log {(\beta _{i})}-\log {(\alpha _{i})}}{4}}}$
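The four transforms above differ only in whether a logarithm is taken first and in the half-width of the coded range. A minimal Python sketch follows; the function names and the `half_width` and `log` parameters are hypothetical, introduced only for illustration.

```python
import math

def to_coded(x, lo, hi, half_width=1.0, log=False):
    """Map x in [lo, hi] to a coded value in [-half_width, +half_width].

    half_width=1 matches the factorial range, half_width=2 the composite range.
    """
    if log:
        x, lo, hi = math.log(x), math.log(lo), math.log(hi)
    center = (hi - lo) / 2 + lo
    scale = (hi - lo) / (2 * half_width)
    return (x - center) / scale

def from_coded(x_hat, lo, hi, half_width=1.0, log=False):
    """Inverse of to_coded: recover the actual input parameter value."""
    a, b = (math.log(lo), math.log(hi)) if log else (lo, hi)
    center = (b - a) / 2 + a
    scale = (b - a) / (2 * half_width)
    value = center + x_hat * scale
    return math.exp(value) if log else value

# Linear variable with range 10 <= x <= 30 (illustrative numbers).
print(to_coded(10.0, 10.0, 30.0))                  # -1.0
print(to_coded(30.0, 10.0, 30.0, half_width=2.0))  # 2.0

# Log variable with range 1e-4 <= x <= 1e-2: the geometric midpoint maps to ~0.
print(to_coded(1e-3, 1e-4, 1e-2, log=True))
print(from_coded(-2.0, 1e-4, 1e-2, half_width=2.0, log=True))
```

Keeping the inverse next to the forward transform makes it easy to turn a row of the coded design matrix back into actual input parameter values before running the simulation.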

Full Composite Design Matrix

Full Factorial

Fractional Factorial

One Parameter At A Time

Example

Problem Information

For details about the problem, including the input uncertainty map, see Example Problem for Experimental Design

Code

Main article: Composite Experimental Design Matlab Code

Computing Response Surface

See Response Surface Methodology for general information on response surface methodology.

See Composite Experimental Design Matlab Code for the actual Matlab code used to generate the results below.

A Note on Visualization

Response surfaces are difficult to visualize if they have more than 2 dimensions, because plotting them means discarding dimensions. For example, imagine reducing a 1-D function (e.g. $y=\log {(x)}$ ) by one dimension, to a single point.

Even worse is reducing by more than one dimension: for example, reducing a surface described by a 2-D polynomial to a 0-D point.

For this reason, it is important to use more reliable metrics than visual inspection in order to judge how well a response surface represents the actual response.

A Note on Coefficient and Variable Order

The coefficient vector for each response surface is given below. The order of variables for the polynomials is:

${\dot {m}}$ = mass flowrate
$k(T)$ = reaction rate
$L_{mix}$ = mixing length for mixing model
$z_{1}$ = measurement location 1
$z_{2}$ = measurement location 2
$z_{3}$ = measurement location 3

Polynomial Powers Matrix

The polynomial powers matrix for an N-variable polynomial of degree M is a $T\times N$ matrix, where T is the number of distinct polynomial terms that can exist for a polynomial with N variables and total degree at most M, i.e. $T={\binom {N+M}{M}}$ . Each row holds the power of every variable in one term.

The function allVL1 (~~available for download here: http://files.charlesmartinreid.com/ExperimentalDesign/allVL1.m~~) creates this matrix of permutations. Alternatively, the regstats and x2fx Matlab functions will automatically generate their own versions of this matrix (albeit in a different order than from the allVL1 function).

My Polynomial Term Ordering

When I run the regstats function in Matlab, I always specify the form of the polynomial powers matrix using the allVL1 function by running allVL1(number_of_vars, degree_of_polynomial, '<=').

This function, in turn, sorts the matrix of polynomial powers the same way that Matlab's sortrows function would sort the rows. More info here: http://www.mathworks.com/help/techdoc/ref/sortrows.html
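For readers without access to allVL1, the following Python sketch generates a polynomial powers matrix. It is not a port of allVL1: the function name is hypothetical, and the ordering (total degree first, then row-wise ascending order within each degree) is an assumption chosen because it reproduces the 28-row matrix listed for the 6-variable quadratic surface in this article.

```python
from itertools import product
from math import comb

def powers_matrix(n_vars, degree):
    """Rows of non-negative integer powers whose sum is at most `degree`.

    Sorted by total degree, then by ascending row order within a degree
    (an assumed ordering, see the note above).
    """
    rows = [p for p in product(range(degree + 1), repeat=n_vars)
            if sum(p) <= degree]
    rows.sort(key=lambda p: (sum(p), p))
    return rows

quad = powers_matrix(6, 2)
print(len(quad), comb(6 + 2, 2))                   # 28 28
print(len(powers_matrix(6, 3)), comb(6 + 3, 3))    # 84 84
print(quad[7])                                     # (0, 0, 0, 0, 0, 2)
```

The term counts agree with the binomial count of monomials of total degree at most M in N variables: 28 terms for a 6-variable quadratic and 84 for a 6-variable cubic.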

Each response surface is available to download below, and the model form is specified in each .mat file. Alternatively, you can download the allVL1 function and run it for yourself.

Matlab's Polynomial Term Ordering

If no polynomial powers matrix is specified when running the regstats function in Matlab, then the polynomial powers for an n-variable polynomial are ordered in the same way that Matlab's x2fx function orders them.

Documentation page for x2fx describing ordering: http://www.mathworks.com/help/toolbox/stats/x2fx.html

$x_{1}$	First order non-interaction terms
$x_{2}$
$\dots$
$x_{n}$
$x_{1}x_{2}$	Second order interaction terms
$x_{1}x_{3}$
$\dots$
$x_{1}x_{n}$
$x_{2}x_{3}$
$\dots$
$x_{n-1}x_{n}$
$x_{1}^{2}$	Second order non-interaction terms
$x_{2}^{2}$
$\dots$
$x_{n}^{2}$
$x_{1}x_{2}x_{2}$	Third order interaction terms
$x_{1}x_{2}x_{3}$
$\dots$
$x_{n-1}x_{n}x_{n}$
$x_{1}^{3}$	Third order non-interaction terms
$x_{2}^{3}$
$\dots$
$x_{n}^{3}$

Quadratic Surface, 6 Dimensions

~~Download the response surface here: http://files.charlesmartinreid.com/ExperimentalDesign/ResponseSurface_6dim_2deg.mat~~

~~contains 2 variables:~~
- model - this is a matrix containing the polynomial powers of each variable (variable order given in section above, #A Note on Coefficient and Variable Order; description of polynomial powers matrix given in section above, #Polynomial Powers Matrix)

A quadratic response surface for $y_{p,exit}$ , a quadratic function of 6 input parameters of the form:

${\hat {y}}({\boldsymbol {x}})=b_{0}+\sum _{i=1}^{6}b_{i}x_{i}+\sum _{j=1}^{5}\sum _{i=j+1}^{6}b_{ij}x_{i}x_{j}+\sum _{i=1}^{6}b_{ii}x_{i}^{2}$

was computed using Matlab's regstats command [1].

Because the response surface is six-dimensional, graphical representation is difficult (see A Note on Visualization above). The surface was therefore visualized in two dimensions, $L_{mix}$ and $k(T)$ , with each of the 4 non-visualized variables held at its mean value.

The resulting polynomial coefficient vector $\mathbf {b}$ is:

b(1) = 4.0870e+03 
b(2) = -2.0956e+03 
b(3) = -1.2574e+03 
b(4) = -4.1912e+02 
b(5) = -2.6527e-01 
b(6) = 8.2956e-02 
b(7) = -8.3864e+02 
b(8) = 4.1912e+02 
b(9) = 4.0102e-09 
b(10) = 4.1912e+02 
b(11) = 1.2271e-08 
b(12) = 1.0050e-08 
b(13) = 4.1912e+02 
b(14) = 1.2039e-10 
b(15) = 1.1920e-10 
b(16) = 1.1952e-10 
b(17) = 7.9500e-02 
b(18) = 1.2627e-11 
b(19) = 1.2676e-11 
b(20) = 1.2491e-11 
b(21) = 6.4480e-03 
b(22) = -9.1954e-04 
b(23) = 9.1895e-09 
b(24) = 7.8094e-09 
b(25) = 8.7553e-09 
b(26) = 1.4867e-02 
b(27) = 1.1544e-02 
b(28) = 4.1922e+02

for the polynomial powers matrix:

     0     0     0     0     0     0
     0     0     0     0     0     1
     0     0     0     0     1     0
     0     0     0     1     0     0
     0     0     1     0     0     0
     0     1     0     0     0     0
     1     0     0     0     0     0
     0     0     0     0     0     2
     0     0     0     0     1     1
     0     0     0     0     2     0
     0     0     0     1     0     1
     0     0     0     1     1     0
     0     0     0     2     0     0
     0     0     1     0     0     1
     0     0     1     0     1     0
     0     0     1     1     0     0
     0     0     2     0     0     0
     0     1     0     0     0     1
     0     1     0     0     1     0
     0     1     0     1     0     0
     0     1     1     0     0     0
     0     2     0     0     0     0
     1     0     0     0     0     1
     1     0     0     0     1     0
     1     0     0     1     0     0
     1     0     1     0     0     0
     1     1     0     0     0     0
     2     0     0     0     0     0
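A coefficient vector and a polynomial powers matrix together define the response surface: each row of the matrix gives the powers of the variables in one term, and the matching entry of $\mathbf {b}$ multiplies that term. The Python sketch below shows the evaluation under that reading. The function name is hypothetical, and the 2-variable coefficients and input point are made-up illustrative numbers, not values from this article.

```python
def evaluate_surface(b, powers, x):
    """Evaluate sum_t b[t] * prod_i x[i] ** powers[t][i]."""
    total = 0.0
    for coeff, row in zip(b, powers):
        term = coeff
        for xi, p in zip(x, row):
            term *= xi ** p
        total += term
    return total

# Illustrative 2-variable quadratic (made-up coefficients).
powers = [(0, 0), (0, 1), (1, 0), (0, 2), (1, 1), (2, 0)]
b = [1, 2, 3, 4, 5, 6]
x = (2, 3)

# 1 + 2*x2 + 3*x1 + 4*x2**2 + 5*x1*x2 + 6*x1**2 at x1 = 2, x2 = 3
print(evaluate_surface(b, powers, x))  # 103.0
```

To evaluate the 6-variable surface, the same function would be called with the 28 coefficients and the 28-row matrix listed above, with the input vector in the same variable order and the same scaling that was used in the regression.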

The resulting response surface, holding all other parameters constant at their mean value, looks like:

Some key statistics for the response surface are given here:

---------------------------------------------------
Response surface summary of information:
Number of variables in response surface is 6. 
Number of terms in polynomial is 28. 
Degree of response surface is 2.
MSE =			 0.03845480 
MSE DoF = 			 17 

L-inf norm resid = 	 0.34272386 

R^2 =			 0.86371957 
adjusted R^2 =		 0.64727417 
---------------------------------------------------

Quadratic Surface, 2 Dimensions

~~Download the response surface here: http://files.charlesmartinreid.com/ExperimentalDesign/ResponseSurface_2dim_2deg.mat~~

~~contains 2 variables:~~
- model - this is a matrix containing the polynomial powers of each variable (variable order given in section above, #A Note on Coefficient and Variable Order; description of polynomial powers matrix given in section above, #Polynomial Powers Matrix)

The response surface resulting from the regression of only the two dimensions visualized (of the same form, but lower in dimension) results in a polynomial coefficient vector of:

b(1) = 0.2019 
b(2) = -0.1065 
b(3) = 0.1115 
b(4) = 0.0269 
b(5) = -0.0145 
b(6) = -0.0009

for the polynomial powers matrix:

It also results in the following response surface:

This surface has the following statistics:

---------------------------------------------------
Response surface summary of information:
Number of variables in response surface is 2. 
Number of terms in polynomial is 6. 
Degree of response surface is 2.
MSE =			 0.00690353 
MSE DoF = 			 39 

L-inf norm resid = 	 0.13735696 

R^2 =			 0.93490530 
adjusted R^2 =		 0.92655983 
---------------------------------------------------
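The adjusted $R^{2}$ values in the summaries can be cross-checked from $R^{2}$ , the number of sample points and the number of polynomial terms. The Python sketch below assumes the standard definition of adjusted $R^{2}$ and a sample count of 45, which is inferred here from the summaries (number of terms plus MSE degrees of freedom: 28 + 17 and 6 + 39) rather than stated in the article.

```python
def adjusted_r2(r2, n_samples, n_terms):
    """Adjusted R^2, where n_terms counts every coefficient incl. the constant."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_terms)

n = 45  # inferred: polynomial terms + MSE degrees of freedom

# 6-variable quadratic surface: 28 terms, R^2 = 0.86371957
print(round(adjusted_r2(0.86371957, n, 28), 4))  # 0.6473

# 2-variable quadratic surface: 6 terms, R^2 = 0.93490530
print(round(adjusted_r2(0.93490530, n, 6), 4))   # 0.9266
```

Both values agree with the reported summaries. The comparison also shows why the 6-variable surface is penalized so heavily: 28 coefficients fitted to 45 points leave only 17 degrees of freedom for the residual.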

Removing the 4 non-visualized dimensions changes the response surface statistics substantially: $R^{2}$ rises from 0.86 to 0.93, adjusted $R^{2}$ rises from 0.65 to 0.93, and the MSE drops from 0.038 to 0.0069.

Also of note, the 2-dimensional surface predicts a response greater than 1, which is physically impossible for the response of interest (a mass fraction). This constraint is not incorporated into the regression procedure.

As polynomial degrees increase, this characteristic of the response surfaces (predicting impossible or non-physical responses) becomes more exaggerated.

Cubic Surface, 6 Dimensions: Trouble in Paradise

~~Download the response surface here: http://files.charlesmartinreid.com/ExperimentalDesign/ResponseSurface_6dim_3deg.mat~~

~~contains 2 variables:~~
- model - this is a matrix containing the polynomial powers of each variable (variable order given in section above, #A Note on Coefficient and Variable Order; description of polynomial powers matrix given in section above, #Polynomial Powers Matrix)

A 6-dimensional cubic response surface has 84 coefficients - much higher than the number of sample points obtained with a composite design. However, if most 3rd order interaction terms are eliminated, and only a few are used, this will significantly reduce the number of coefficients.

A cubic model was used that was the same as the quadratic models described above, but with the addition of 9 third order terms, listed in the Term table below.

The coefficient vector is:

b(1) = 351.8 
b(2) = -1.065e-06 
b(3) = -2.132e-07 
b(4) = -4.034e-08 
b(5) = -6.49 
b(6) = -0.1842 
b(7) = -1048 
b(8) = 4.272e-07 
b(9) = -4.456e-10 
b(10) = 1.437e-07 
b(11) = -1.3e-09 
b(12) = -5.911e-10 
b(13) = 9.099e-08 
b(14) = -9.027e-12 
b(15) = -2.151e-11 
b(16) = -3.873e-12 
b(17) = 6.377 
b(18) = -3.172e-12 
b(19) = -1.141e-12 
b(20) = -3.025e-12 
b(21) = -1.529 
b(22) = 0.3625 
b(23) = -1.38e-09 
b(24) = -8.897e-10 
b(25) = -1.015e-09 
b(26) = 0.01404 
b(27) = 0.00944 
b(28) = 1048 
b(29) = -349.3 
b(30) = 0.001737 
b(31) = -2.122 
b(32) = -6.063e-08 
b(33) = -3.193e-08 
b(34) = -5.694e-08 
b(35) = -0.003435 
b(36) = 2.63 
b(37) = -0.5728

where the order for the first 28 terms is the same as for the quadratic models above, and the remaining 9 terms are in the order given in the Term table below. This makes the polynomial powers matrix:

     0     0     0     0     0     0
     0     0     0     0     0     1
     0     0     0     0     1     0
     0     0     0     1     0     0
     0     0     1     0     0     0
     0     1     0     0     0     0
     1     0     0     0     0     0
     0     0     0     0     0     2
     0     0     0     0     1     1
     0     0     0     0     2     0
     0     0     0     1     0     1
     0     0     0     1     1     0
     0     0     0     2     0     0
     0     0     1     0     0     1
     0     0     1     0     1     0
     0     0     1     1     0     0
     0     0     2     0     0     0
     0     1     0     0     0     1
     0     1     0     0     1     0
     0     1     0     1     0     0
     0     1     1     0     0     0
     0     2     0     0     0     0
     1     0     0     0     0     1
     1     0     0     0     1     0
     1     0     0     1     0     0
     1     0     1     0     0     0
     1     1     0     0     0     0
     2     0     0     0     0     0
     3     0     0     0     0     0
     0     3     0     0     0     0
     0     0     3     0     0     0
     0     0     0     3     0     0
     0     0     0     0     3     0
     0     0     0     0     0     3
     1     1     1     0     0     0
     0     1     2     0     0     0
     0     2     1     0     0     0

Term
$x_{1}^{3}$
$x_{2}^{3}$
$x_{3}^{3}$
$x_{4}^{3}$
$x_{5}^{3}$
$x_{6}^{3}$
$x_{1}x_{2}x_{3}$
$x_{2}x_{3}^{2}$
$x_{2}^{2}x_{3}$

This model presents an interesting problem. The 6-dimensional response surface that results, plotted in 2 dimensions (again using mean values for non-visualized dimensions), looks like this:

Note the range of the response: the maximum predicted $y_{p}$ is on the order of 1000, which is clearly fishy for a mass fraction. However, looking at the statistics shows that the polynomial creates a perfect fit!

---------------------------------------------------
Response surface summary of information:
Number of variables in response surface is 6. 
Number of terms in polynomial is 37. 
Degree of response surface is varied, deg is a matrix. Max degree = 3.
MSE =			 0.00000000 
MSE DoF = 			 8 

L-inf norm resid = 	 0.00000000 

R^2 =			 1.00000000 
adjusted R^2 =		 1.00000000 
---------------------------------------------------

At every experimental design sample point, the polynomial response prediction ${\hat {y}}$ exactly matches the actual response $y$ , resulting in 0 error.

The knee-jerk reaction is that something must be wrong: the response surface is wrong, there was some mistake, the software should have come up with a more "reasonable" response surface to fit the sample points. However, regression (and the whole idea of using response surfaces to represent complex functions) is a double-edged sword: you can make the function evaluation much, much cheaper, but the price you pay is a significant loss of information.
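The same effect can be reproduced in one dimension. The Python sketch below is an analogy, not the regression used in this article: it swaps in Lagrange interpolation of the Runge function $1/(1+25x^{2})$ through 11 equally spaced points, a standard textbook example. The polynomial matches every sample point, yet it is far from the true function between the points.

```python
def runge(x):
    """Smooth test function with values between 0 and 1 on [-1, 1]."""
    return 1.0 / (1.0 + 25.0 * x * x)

def lagrange(nodes, values, x):
    """Evaluate the interpolating polynomial through (nodes, values) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(nodes, values)):
        basis = 1.0
        for j, xj in enumerate(nodes):
            if j != i:
                basis *= (x - xj) / (xi - xj)
        total += yi * basis
    return total

nodes = [-1.0 + 0.2 * i for i in range(11)]   # 11 sample points
values = [runge(x) for x in nodes]

# Residual at the sample points: essentially zero, a "perfect fit".
resid_at_nodes = max(abs(lagrange(nodes, values, x) - y)
                     for x, y in zip(nodes, values))

# Error on a fine grid between the sample points: larger than the
# whole range of the true function.
grid = [-1.0 + 0.002 * i for i in range(1001)]
error_between = max(abs(lagrange(nodes, values, x) - runge(x)) for x in grid)

print(resid_at_nodes, error_between)
```

A zero residual only measures agreement at the sampled points. It says nothing about the behavior of the polynomial elsewhere, which is where a plot or additional validation points expose the problem.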

One may suggest an alternative validation technique built from low-dimensional response surfaces, which are easier to fit with a "reasonable" or "sensible" polynomial. First, create a low-dimensional response surface and validate it. Then use the feasible (validated) values for each of those dimensions to create a second low-dimensional response surface, which is validated in turn. This yields a new feasible set, which can be combined with the old feasible set, and so on, until all dimensions have been covered and valid ranges for all input parameter values have been determined.

However, this approach is not equivalent, nor is it an improvement. When creating the low-dimensional response surface, one must select values for the other, ignored dimensions; these values are uncertain and are merely guesses. Changing the values of non-regressed variables will likely result in significant changes in the regression results (i.e. the response surface).

Box-Behnken Designs

The relationship between composite and Box-Behnken designs is that, for three factors, if you use a face-centered (i.e. a 3-level) composite design and combine it with a Box-Behnken design, you will get a full $3^{3}$ factorial design. More generally, face-centered composite and Box-Behnken designs are both subsets of the full $3^{k}$ factorial design.
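The relationship can be checked by enumerating the design points. The Python sketch below uses hypothetical function names and assumes the pairwise Box-Behnken construction (every pair of factors at $\pm 1$ with the remaining factors at 0, plus a center point), which is the usual construction for 3 and 4 factors.

```python
from itertools import combinations, product

def full_three_level(k):
    """Every point of the 3**k factorial with levels -1, 0, +1."""
    return set(product((-1, 0, 1), repeat=k))

def face_centered_composite(k):
    """Corners, center point, and one axial point per face (alpha = 1)."""
    points = set(product((-1, 1), repeat=k))
    points.add((0,) * k)
    for i in range(k):
        for a in (-1, 1):
            row = [0] * k
            row[i] = a
            points.add(tuple(row))
    return points

def box_behnken_pairs(k):
    """Center point plus each pair of factors at +/-1, all others at 0."""
    points = {(0,) * k}
    for i, j in combinations(range(k), 2):
        for a, b in product((-1, 1), repeat=2):
            row = [0] * k
            row[i], row[j] = a, b
            points.add(tuple(row))
    return points

for k in (3, 4):
    union = face_centered_composite(k) | box_behnken_pairs(k)
    print(k, len(union), 3 ** k, union == full_three_level(k))
# 3 27 27 True
# 4 49 81 False
```

For three factors the two designs together cover all 27 points of the $3^{3}$ factorial. For four factors the union has only 49 of the 81 points, because points with exactly three nonzero coordinates appear in neither design.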
