Lesson 11: Response Surface Methods and Designs

Overview of Response Surface Methods

We are now going to shift from screening designs where the primary focus of previous lessons was factor screening – (two-level factorials, fractional factorials being widely used), to trying to optimize an underlying process and look for the factor level combinations that give us the maximum yield and minimum costs. In many applications, this is our goal. However in some cases we are trying to hit a target or aim to match some given specifications - but this brings up other issues which we will get to later.

Here the objective of Response Surface Methods (RSM) is optimization, finding the best set of factor levels to achieve some goal. This lesson aims to cover the following goals:

Learning Objectives & Outcomes

Response Surface Methodology and its sequential nature for optimizing a process

First order and second order response surface models and how to find the direction of steepest ascent (or descent) to maximize (or minimize) the response

How to deal with several responses simultaneously (Multiple Response Optimization)

Central Composite Designs (CCD) and Box-Behnken Designs as two of the major Response Surface Designs and how two generate them using Minitab

Design and Analysis of Mixture Designs for cases where the sum of the factor levels equals a constant, i.e. 100% or the totality of the components

Introductory understanding of designs for computer models

RSM dates from the 1950's. Early applications were found in the chemical industry. We have already talked about Box. Box and Draper have some wonderful references about building RSMs and analyzing them which are very useful.

RSM as a Sequential Process

The text has a graphic depicting a response surface method in three dimensions, though actually it is four dimensional space that is being represented since the three factors are in 3-dimensional space the the response is the 4th dimension.

Instead, let's look at 2 dimensions - this is easier to think about and visualize. There is a response surface and we will imagine the ideal case where there is actually a 'hill' which has a nice centered peak. (If only reality were so nice, but it usually isn't!). Consider the geologic ridges that exist here in central Pennsylvania, the optimum or highest part of the 'hill' might be anywhere along this ridge. There's no clearly defined centered high point or peak that stands out. In this case there would be a whole range of values of X1 and X2 that would all describe the same 'peak' -- actually the points lying along the top of the ridge. This type of situation is quite realistic where there does not exist a predominate optimum.

But for our purposes let's think of this ideal 'hill' and the problem is that you don't know where this is and you want to find factor level values where the response is at its peak. This is your quest, to find the values X1optimum and X2optimum, where the response is at its peak. You might have a hunch that the optimum exists in certain location. This would be good area to start - some set of conditions, perhaps the way that the factory has always been doing things - and then perform an experiment at this starting point.

The actual variables in their natural units of measurement are used in the experiment. However, when we design our experiment we will use our coded variables, X1 and X2 which will be centered on 0, and extend +1 and -1 from the center of the region of experimentation. Therefore, we will take our natural units and then center and rescale them to the range from -1 to +1.

Our goal is to start somewhere using our best prior or current knowledge and search for the optimum spot where the response is either maximized or minimized.

Here are the models that we will use.

Screening Response Model

y = β0 + β1x1 + β2x2 + β12x1x2 + ε (1)

The screening model that we used for the first order situation involves linear effects and a single cross product factor, which represents the linear x linear interaction component.

Steepest Ascent Model

If we ignore cross products which gives an indication of the curvature of the response surface that we are fitting and just look at the first order model this is called the steepest ascent model:

y = β0 + β1x1 + β2x2 + ε (2)

Optimization Model

Then, when we think that we are somewhere near the 'top of the hill' we will fit a second order model. This includes in addition the two second-order quadratic terms.

y = β0 + β1x1 + β2x2 + β12x1x2 + β11x12 + β22x22 + ε (3)

Steepest Ascent - The First Order Model

Let's look at the first order situation - the method of steepest ascent. Now, remember, in the first place we don't know if the 'hill' even exists so we will start somewhere where we think the optimum exists. We start somewhere in terms of the natural units and use the coded units to do our experiment. Consider the example 11.1 in the textbook. We want to start in the region where x1 = reaction time (30 - 40 seconds) and x2 = temperature (150 - 160 degrees), and we want to look at the yield of the process as a function of these factors. In a sense, for the purpose of illustrating this concept, we can superimpose this region of experimentation on to our plot of our unknown 'hill'. We obviously conduct the experiment in its natural units but the designs will be specified in the coded units so we can apply them to any situation.

Specifically, here we use a design with four corner points, a 22 design and five center points. We now fit this first-order model and investigate it.

We put in the actual data for A and B and the response measurements Y.

We fit the surface. The model has two main effects, one cross product term and then one additional parameter as the mean for the center point. The residuals in this case have four df which come from replication of the center points. Since there are five center points, i.e., four df among the five center points. This is a measure of pure error.

We start by testing for curvature. The question is whether the mean of the center points is different from the values at (x1,x2) = (0,0)predicted from the screening response model (main effects plus interaction). We are testing whether the mean of the points at the center are on the plane fit by the four corner points. If the p-value had been small, this would have told you that a mean of the center points is above or below the plane indicating curvature in the response surface. The fact that, in this case, it is not significant indicates there is no curvature. Indeed the center points fall exactly on the plane that fits the quarter points.

There is just one degree of freedom for this test because the design only has one additional location in terms of the x's.

Next we check for significant effects of the factors. We see from the ANOVA that there is no interaction. So, let's refit this model without the interaction term, leaving just the A and B terms. We still have the average of the center points and our AOV now shows 5 df for residual error. One of these is lack of fit of the additive model and there are 4 df of pure error as before. We have 1 df for curvature, and lack of fit in this case is just the interactions from the model.

What do we do with this? See the Minitab analysis and redo these results in EX11-1.MPJ

Our estimated model is: \(\hat{y}\) = 40.43 + 0.775x1 + 0.325x2

So, for any X1 and X2 we can predict y. This fits a flat surface and it tells us that the predicted y is a function of X1 and X2 and the coefficients are the gradient of this function. We are working in coded variables at this time so these coefficients are unitless.

If we move 0.775 in the direction of X1 and then 0.325 in the direction of X2 this is the direction of steepest ascent. All we know is that this flat surface is one side of the 'hill'.

The method of steepest ascent tells us to do a first order experiment and find the direction that the 'hill' goes up and start marching up the hill taking additional measurements at each (x1, x2) until the response starts to decrease. If we start at 0, in coded units, then we can do a series of single experiments on this path up the 'hill' of the steepest descent. If we do this at a step size of x1 = 1, then:

1 / 0.775 = x2 / 0.325 → x2 = 0.325 / 0.775 = 0.42

and thus our step size of x1 = 1 determines that x2 = 0.42, in order to move in the direction determined to be the steepest ascent. If we take steps of 1 in coded units, this would be five minutes in terms of the time units. And for each step along that path we would go up 0.42 coded units in x2 or approximately 2° on the temperature scale.

Here is the series of steps in additional meanures of five minutes and 2° temperature. The response is plotted and shows an increase that drops off towards the end.

This is a pretty smooth curve and in reality you probably should go a little bit more beyond the peak to make sure you are at the peak. But all you are trying to do is to find out approximately where the top of the 'hill' is. If your first experiment is not exactly right you might have gone off in a wrong direction!

So you might want to do another first-order experiment just to be sure. Or, you might wish to do a second order experiment, assuming you are near the top. This is what we will discuss in the next section. The second order experiment will help find a more exact location of the peak.

The point is, this is a fairly cheap way to 'scout around the mountain' to try to find where the optimum conditions are. Remember, this example is being shown in two dimensions but you may be working in three or four dimensional space! You can use the same method, fitting a first-order model and then moving up the response surface in k dimensional space until you think you are close to where the optimal conditions are.

If you are in more than 2 dimensions, you will not be able to get a nice plot. But that is OK. The method of steepest ascent tells you where to take new measurements, and you will know the response at those points. You might move a few steps and you may see that the response continued to move up or perhaps not - then you might do another first order experiment and redirect your efforts. The point is, when we do the experiment for the second order model, we hope that the optimum will be in the range of the experiment - if it is not, we are extrapolating to find the optimum. In this case, the safest thing to do is to do another experiment around this estimated optimum. Since the experiment for the second order model requires more runs than experiments for the first order model, we want to move into the right region before we start fitting second order models.

Steepest Ascent - The Second Order Model

y = β0 + β1x1 + β2x2 + β12x1x2 + β11x12 + β22x22 + ε

This second order model includes linear terms, cross product terms and a second order term for each of the x's. If we generalize this to kx's, we have k first order terms , k second order terms and then we have all possible pairwise first-order interactions. The linear terms just have one subscript. The quadratic terms have two subscripts. There are k*(k-1)/2 interaction terms. To fit this model, we are going to need a response surface design that has more runs than the first order designs used to move close to the optimum.

This second order model is the basis for response surface designs under the assumption that although the hill is not a perfect quadratic polynomial in k dimensions, it provides a good approximation to the surface near the maximum or a minimum.

Assuming that we have 'marched up this hill' and if we re-specified the region of interest in our example, we are now between 80-90 in terms of time and 170-180 in terms of temperature. We would now translate these natural units into our coded units and if we fit the first order model again, hopefully we can detect that the middle is higher than the corner points so we would have curvature in our model, and could now fit a quadratic polynomial.

After using the Steepest Ascent method to find the optimum location in terms of our factors, we can now go directly to the second order response surface design. A favorite design that we consider is sometimes referred to as a central composite design. The central compositive design is shown on Figure 11.3 above and in more detail in the text in Figure 11.10. The idea is simple - take the 2k corner points, add a center point, and then create a star by drawing a line through the center point orthogonal to each face of the hypercube. Pick a radius along this line and place a new point at that radius. The effect is that each factor is now measured at 5 levels - center, 2 corners and the 2 star points. This gives us plenty of unique treatments to fit the 2nd order model with treatment degrees of freedom left over to test the goodness of fit. Replication is still usually done only at the center point.