
Section 2-6 : Chain Rule

We’ve been using the standard chain rule for functions of one variable throughout the last couple of sections. It’s now time to extend the chain rule out to more complicated situations. Before we actually do that let’s first review the notation for the chain rule for functions of one variable.

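The notation that’s probably familiar to most people is the following.

\[F\left( x \right) = f\left( {g\left( x \right)} \right)\hspace{0.5in}F'\left( x \right) = f'\left( {g\left( x \right)} \right)g'\left( x \right)\]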

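There is an alternate notation, however, that while probably not used much in Calculus I is more convenient at this point because it matches the notation that we are going to be using in this section. Here it is.

\[{\mbox{If }}y = f\left( x \right){\mbox{ and }}x = g\left( t \right){\mbox{ then }}\frac{{dy}}{{dt}} = \frac{{dy}}{{dx}}\,\frac{{dx}}{{dt}}\]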

Notice that the derivative \(\frac{{dy}}{{dt}}\) really does make sense here since if we were to plug in for \(x\) then \(y\) really would be a function of \(t\). One way to remember this form of the chain rule is to note that if we think of the two derivatives on the right side as fractions the \(dx\)’s will cancel to get the same derivative on both sides.

Okay, now that we’ve got that out of the way let’s move into the more complicated chain rules that we are liable to run across in this course.

As with many topics in multivariable calculus, there are in fact many different formulas depending upon the number of variables that we’re dealing with. So, let’s start this discussion off with a function of two variables, \(z = f\left( {x,y} \right)\). From this point there are still many different possibilities that we can look at. We will be looking at two distinct cases prior to generalizing the whole idea out.

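Case 1 : \(z = f\left( {x,y} \right)\), \(x = g\left( t \right)\), \(y = h\left( t \right)\) and compute \(\displaystyle \frac{{dz}}{{dt}}\).

This case is analogous to the standard chain rule from Calculus I that we looked at above. In this case we are going to compute an ordinary derivative since \(z\) really would be a function of \(t\) only if we were to substitute in for \(x\) and \(y\). The chain rule for this case is,

\[\frac{{dz}}{{dt}} = \frac{{\partial f}}{{\partial x}}\frac{{dx}}{{dt}} + \frac{{\partial f}}{{\partial y}}\frac{{dy}}{{dt}}\]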

So, basically what we’re doing here is differentiating \(f\) with respect to each variable in it and then multiplying each of these by the derivative of that variable with respect to \(t\). The final step is to then add all this up.
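As a quick sanity check of this recipe, here is a minimal numerical sketch. The functions \(z = {x^2}y\), \(x = \cos t\), \(y = {t^3}\) and the helper names are hypothetical choices made for illustration and are not taken from the examples in this section. It estimates each factor with a central difference, assembles \(\frac{{\partial f}}{{\partial x}}\frac{{dx}}{{dt}} + \frac{{\partial f}}{{\partial y}}\frac{{dy}}{{dt}}\), and compares that with differentiating after substituting.

```python
import math

def derivative(func, at, h=1e-6):
    """Central-difference approximation of the derivative of func at `at`."""
    return (func(at + h) - func(at - h)) / (2 * h)

# Hypothetical setup (not from the text): z = x**2 * y, x = cos(t), y = t**3.
f = lambda x, y: x ** 2 * y
x_of_t = math.cos
y_of_t = lambda t: t ** 3

def dz_dt_chain(t):
    """dz/dt = f_x * dx/dt + f_y * dy/dt, each factor estimated numerically."""
    x, y = x_of_t(t), y_of_t(t)
    f_x = derivative(lambda u: f(u, y), x)  # partial wrt x, y held fixed
    f_y = derivative(lambda v: f(x, v), y)  # partial wrt y, x held fixed
    return f_x * derivative(x_of_t, t) + f_y * derivative(y_of_t, t)

def dz_dt_direct(t):
    """Substitute for x and y first, then differentiate in t."""
    return derivative(lambda s: f(x_of_t(s), y_of_t(s)), t)

t0 = 1.2
print(dz_dt_chain(t0), dz_dt_direct(t0))  # the two values should agree closely
```

Both routes approximate the same derivative, so any visible disagreement beyond rounding would point to a mistake in how the chain rule was assembled.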

Let’s take a look at a couple of examples.

Example 1 Compute \(\displaystyle \frac{{dz}}{{dt}}\) for each of the following.

So, technically we’ve computed the derivative. However, we should probably go ahead and substitute in for \(x\) and \(y\) as well at this point since we’ve already got \(t\)’s in the derivative. Doing this gives,

Note that in this case it might actually have been easier to just substitute in for \(x\) and \(y\) in the original function and just compute the derivative as we normally would. For comparison’s sake let’s do that.

Note that sometimes, because of the significant mess of the final answer, we will only simplify the first step a little and leave the answer in terms of \(x\), \(y\), and \(t\). This depends on the situation, class and instructor, however, so don’t skip the substitution without first talking to your instructor.

Now, there is a special case that we should take a quick look at before moving on to the next case. Let’s suppose that we have the following situation,

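Case 2 : \(z = f\left( {x,y} \right)\), \(x = g\left( {s,t} \right)\), \(y = h\left( {s,t} \right)\) and compute \(\displaystyle \frac{{\partial z}}{{\partial s}}\) and \(\displaystyle \frac{{\partial z}}{{\partial t}}\).

In this case if we were to substitute in for \(x\) and \(y\) we would get that \(z\) is a function of \(s\) and \(t\) and so it makes sense that we would be computing partial derivatives here and that there would be two of them. The chain rules for this case are,

\[\frac{{\partial z}}{{\partial s}} = \frac{{\partial f}}{{\partial x}}\frac{{\partial x}}{{\partial s}} + \frac{{\partial f}}{{\partial y}}\frac{{\partial y}}{{\partial s}}\hspace{0.5in}\frac{{\partial z}}{{\partial t}} = \frac{{\partial f}}{{\partial x}}\frac{{\partial x}}{{\partial t}} + \frac{{\partial f}}{{\partial y}}\frac{{\partial y}}{{\partial t}}\]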

Okay, now that we’ve seen a couple of cases for the chain rule let’s see the general version of the chain rule.

Chain Rule

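Suppose that \(z\) is a function of \(n\) variables, \({x_1},{x_2}, \ldots ,{x_n}\), and that each of these variables is in turn a function of \(m\) variables, \({t_1},{t_2}, \ldots ,{t_m}\). Then for any variable \({t_i}\), \(i = 1,2, \ldots ,m\) we have the following,

\[\frac{{\partial z}}{{\partial {t_i}}} = \frac{{\partial z}}{{\partial {x_1}}}\frac{{\partial {x_1}}}{{\partial {t_i}}} + \frac{{\partial z}}{{\partial {x_2}}}\frac{{\partial {x_2}}}{{\partial {t_i}}} + \cdots + \frac{{\partial z}}{{\partial {x_n}}}\frac{{\partial {x_n}}}{{\partial {t_i}}}\]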
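The general statement is a sum over the intermediate variables: for a fixed \({t_i}\), add up \(\frac{{\partial z}}{{\partial {x_j}}}\frac{{\partial {x_j}}}{{\partial {t_i}}}\) for \(j = 1, \ldots ,n\). The sketch below implements exactly that sum with central differences. The helper names `partial` and `chain_rule` and the sample functions (with \(n = 3\), \(m = 2\)) are assumptions made for illustration only.

```python
def partial(func, point, index, h=1e-6):
    """Central-difference partial derivative of func wrt point[index]."""
    up = list(point)
    down = list(point)
    up[index] += h
    down[index] -= h
    return (func(*up) - func(*down)) / (2 * h)

def chain_rule(z, xs, t_point, i):
    """Sum over j of (dz/dx_j) * (dx_j/dt_i), evaluated at t_point."""
    x_point = [x(*t_point) for x in xs]
    return sum(
        partial(z, x_point, j) * partial(xs[j], t_point, i)
        for j in range(len(xs))
    )

# Hypothetical example: n = 3 intermediate variables, m = 2 independent ones.
z = lambda x1, x2, x3: x1 * x2 + x3 ** 2
xs = [
    lambda t1, t2: t1 + t2,
    lambda t1, t2: t1 * t2,
    lambda t1, t2: t1 - 2 * t2,
]
# The same function after substituting for every x_j.
composed = lambda t1, t2: z(*(x(t1, t2) for x in xs))

t_point = [0.5, 1.5]
for i in range(2):
    # Chain-rule sum versus differentiating the substituted function directly.
    print(i, chain_rule(z, xs, t_point, i), partial(composed, t_point, i))
```

Case 1 and Case 2 above are the same sum with \(n = 2\) and \(m = 1\) or \(m = 2\).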

Wow. That’s a lot to remember. There is actually an easier way to construct all the chain rules that we’ve discussed in the section or will look at in later examples. We can build up a tree diagram that will give us the chain rule for any situation. To see how these work let’s go back and take a look at the chain rule for \(\frac{{\partial z}}{{\partial s}}\) given that \(z = f\left( {x,y} \right)\), \(x = g\left( {s,t} \right)\), \(y = h\left( {s,t} \right)\). We already know what this is, but it may help to illustrate the tree diagram if we already know the answer. For reference here is the chain rule for this case,

\[\frac{{\partial z}}{{\partial s}} = \frac{{\partial f}}{{\partial x}}\frac{{\partial x}}{{\partial s}} + \frac{{\partial f}}{{\partial y}}\frac{{\partial y}}{{\partial s}}\]

We start at the top with the function itself and then branch out from that point. The first set of branches is for the variables in the function. From each of these endpoints we put down a further set of branches that gives the variables that both \(x\) and \(y\) are a function of. We connect each letter with a line and each line represents a partial derivative as shown. Note that the letter in the numerator of the partial derivative is the upper “node” of the tree and the letter in the denominator of the partial derivative is the lower “node” of the tree.

To use this to get the chain rule we start at the bottom and for each branch that ends with the variable we want to take the derivative with respect to (\(s\) in this case) we move up the tree until we hit the top multiplying the derivatives that we see along that set of branches. Once we’ve done this for each branch that ends at \(s\), we then add the results up to get the chain rule for that given situation.
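The tree procedure is mechanical enough to write as a short program. The sketch below is only an illustration: the dictionary encoding of the tree and the function names are assumptions, and `d` in the output stands for whichever derivative (ordinary or partial) belongs on that branch. It finds every path from the top of the tree down to the chosen variable, multiplies the derivatives along each path, and adds the paths together.

```python
# Hypothetical encoding of a tree diagram: each variable maps to the
# variables directly below it. Here z = f(x, y), x = g(s, t), y = h(s, t).
tree = {"z": ["x", "y"], "x": ["s", "t"], "y": ["s", "t"]}

def paths(tree, top, target):
    """Return every top-to-bottom path that ends at `target`."""
    if top == target:
        return [[top]]
    found = []
    for child in tree.get(top, []):
        for tail in paths(tree, child, target):
            found.append([top] + tail)
    return found

def chain_rule_text(tree, top, target):
    """Multiply derivatives along each path, then add the paths together."""
    terms = []
    for path in paths(tree, top, target):
        # Upper node goes in the numerator, lower node in the denominator.
        factors = [f"d{upper}/d{lower}" for upper, lower in zip(path, path[1:])]
        terms.append(" * ".join(factors))
    return f"d{top}/d{target} = " + " + ".join(terms)

print(chain_rule_text(tree, "z", "s"))
# prints: dz/ds = dz/dx * dx/ds + dz/dy * dy/ds
```

Changing the dictionary to describe a different dependency structure gives the corresponding chain rule without any new formula to memorize.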

Note that we don’t always put the derivatives in the tree. Some of the trees get a little large/messy and so we won’t put in the derivatives. Just remember what derivative should be on each branch and you’ll be okay without actually writing them down.

Let’s write down some chain rules.

Example 4 Use a tree diagram to write down the chain rule for the given derivatives.

So, provided we can write down the tree diagram, and these aren’t usually too bad to write down, we can do the chain rule for any set up that we might run across.

We’ve now seen how to take first derivatives of these more complicated situations, but what about higher order derivatives? How do we do those? It’s probably easiest to see how to deal with these with an example.

We will need the first derivative before we can even think about finding the second derivative so let’s get that. This situation falls into the second case that we looked at above so we don’t need a new tree diagram. Here is the first derivative.

The issue here is to correctly deal with this derivative. Since the two first order derivatives, \(\frac{{\partial f}}{{\partial x}}\) and \(\frac{{\partial f}}{{\partial y}}\), are both functions of \(x\) and \(y\), which are in turn functions of \(r\) and \(\theta \), both of these terms are products. So, using the product rule gives the following,

We now need to determine what \(\frac{\partial }{{\partial \theta }}\left( {\frac{{\partial f}}{{\partial x}}} \right)\) and \(\frac{\partial }{{\partial \theta }}\left( {\frac{{\partial f}}{{\partial y}}} \right)\) will be. These are both chain rule problems again since both of the derivatives are functions of \(x\) and \(y\) and we want to take the derivative with respect to \(\theta \).

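Before we do these let’s rewrite the first chain rule that we did above a little.

\begin{equation}\frac{\partial }{{\partial \theta }}\left( f \right) = - r\sin \left( \theta \right)\frac{\partial }{{\partial x}}\left( f \right) + r\cos \left( \theta \right)\frac{\partial }{{\partial y}}\left( f \right) \label{eq:eq1}\end{equation}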

Note that all we’ve done is change the notation for the derivative a little. With the first chain rule written in this way we can think of \(\eqref{eq:eq1}\) as a formula for differentiating any function of \(x\) and \(y\) with respect to \(\theta \) provided we have \(x = r\cos \theta \) and \(y = r\sin \theta \).

This however is exactly what we need to do the two new derivatives we need above. Both of the first order partial derivatives, \(\frac{{\partial f}}{{\partial x}}\) and \(\frac{{\partial f}}{{\partial y}}\), are functions of \(x\) and \(y\) and \(x = r\cos \theta \) and \(y = r\sin \theta \) so we can use \(\eqref{eq:eq1}\) to compute these derivatives.

To do this we’ll simply replace all the \(f\)’s in \(\eqref{eq:eq1}\) with the first order partial derivative that we want to differentiate. At that point all we need to do is a little notational work and we’ll get the formula that we’re after.

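Here is the use of \(\eqref{eq:eq1}\) to compute \(\frac{\partial }{{\partial \theta }}\left( {\frac{{\partial f}}{{\partial x}}} \right)\).

\[\frac{\partial }{{\partial \theta }}\left( {\frac{{\partial f}}{{\partial x}}} \right) = - r\sin \left( \theta \right)\frac{\partial }{{\partial x}}\left( {\frac{{\partial f}}{{\partial x}}} \right) + r\cos \left( \theta \right)\frac{\partial }{{\partial y}}\left( {\frac{{\partial f}}{{\partial x}}} \right) = - r\sin \left( \theta \right)\frac{{{\partial ^2}f}}{{\partial {x^2}}} + r\cos \left( \theta \right)\frac{{{\partial ^2}f}}{{\partial y\partial x}}\]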
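To see the product rule and the two inner chain rules fit together, here is a minimal numerical sketch for \(x = r\cos \theta \), \(y = r\sin \theta \). The test function \(f\left( {x,y} \right) = {x^2}y\) and all function names are hypothetical and chosen only because the partial derivatives are easy to write by hand. The code assembles \(\frac{{{\partial ^2}f}}{{\partial {\theta ^2}}}\) from those pieces and compares it with a second central difference in \(\theta \).

```python
import math

# Hypothetical test function (not from the text): f(x, y) = x**2 * y,
# with its partial derivatives worked out by hand.
f = lambda x, y: x ** 2 * y
f_x = lambda x, y: 2 * x * y
f_y = lambda x, y: x ** 2
f_xx = lambda x, y: 2 * y
f_xy = lambda x, y: 2 * x
f_yy = lambda x, y: 0.0

def second_theta_formula(r, theta):
    """Second theta-derivative assembled from the product and chain rules."""
    x, y = r * math.cos(theta), r * math.sin(theta)
    s, c = math.sin(theta), math.cos(theta)
    # d/dtheta of each first-order partial, using the rewritten chain rule.
    d_fx = -r * s * f_xx(x, y) + r * c * f_xy(x, y)
    d_fy = -r * s * f_xy(x, y) + r * c * f_yy(x, y)
    # Product rule applied to  -r sin(theta) f_x + r cos(theta) f_y.
    return -r * c * f_x(x, y) - r * s * d_fx - r * s * f_y(x, y) + r * c * d_fy

def second_theta_numeric(r, theta, h=1e-4):
    """Second central difference of f(r cos(theta), r sin(theta)) in theta."""
    g = lambda a: f(r * math.cos(a), r * math.sin(a))
    return (g(theta + h) - 2 * g(theta) + g(theta - h)) / h ** 2

print(second_theta_formula(2.0, 0.6), second_theta_numeric(2.0, 0.6))
```

The two printed values should match to several decimal places; the numerical version is only a check and is less accurate than the formula.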

The final topic in this section is a revisiting of implicit differentiation. With these forms of the chain rule implicit differentiation actually becomes a fairly simple process. Let’s start out with the implicit differentiation that we saw in a Calculus I course.

We will start with a function in the form \(F\left( {x,y} \right) = 0\) (if it’s not in this form simply move everything to one side of the equal sign to get it into this form) where \(y = y\left( x \right)\). In a Calculus I course we were then asked to compute \(\frac{{dy}}{{dx}}\) and this was often a fairly messy process. Using the chain rule from this section however we can get a nice simple formula for doing this. We’ll start by differentiating both sides with respect to \(x\). This will mean using the chain rule on the left side and the right side will, of course, differentiate to zero. Here are the results of that.

\[\frac{{\partial F}}{{\partial x}} + \frac{{\partial F}}{{\partial y}}\frac{{dy}}{{dx}} = 0\hspace{0.5in} \Rightarrow \hspace{0.5in}\frac{{dy}}{{dx}} = - \frac{{{F_x}}}{{{F_y}}}\]

As shown, all we need to do next is solve for \(\frac{{dy}}{{dx}}\) and we’ve now got a very nice formula to use for implicit differentiation. Note as well that in order to simplify the formula we switched back to using the subscript notation for the derivatives.
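As a small illustration of the formula \(\frac{{dy}}{{dx}} = - \frac{{{F_x}}}{{{F_y}}}\), the sketch below applies it to the unit circle. The curve, the point, and the helper names are hypothetical choices, and the partial derivatives are estimated with central differences rather than computed symbolically. The result is compared with the slope found by solving for \(y\) explicitly.

```python
import math

def partial(func, point, index, h=1e-6):
    """Central-difference partial derivative of func wrt point[index]."""
    up = list(point)
    down = list(point)
    up[index] += h
    down[index] -= h
    return (func(*up) - func(*down)) / (2 * h)

def implicit_dy_dx(F, x, y):
    """dy/dx = -F_x / F_y for a curve F(x, y) = 0, assuming F_y is nonzero."""
    return -partial(F, (x, y), 0) / partial(F, (x, y), 1)

# Hypothetical example: the unit circle x**2 + y**2 - 1 = 0.
F = lambda x, y: x ** 2 + y ** 2 - 1

x0, y0 = 0.6, 0.8  # a point on the upper half of the circle
slope = implicit_dy_dx(F, x0, y0)

# On the upper half, y = sqrt(1 - x**2), so dy/dx = -x / sqrt(1 - x**2).
explicit = -x0 / math.sqrt(1 - x0 ** 2)
print(slope, explicit)
```

The formula breaks down wherever \({F_y} = 0\), which on the circle corresponds to the two points with a vertical tangent.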

There we go. It would have taken much longer to do this using the old Calculus I way of doing this.

We can also do something similar to handle the types of implicit differentiation problems involving partial derivatives like those we saw when we first introduced partial derivatives. In these cases we will start off with a function in the form \(F\left( {x,y,z} \right) = 0\) and assume that \(z = f\left( {x,y} \right)\) and we want to find \(\frac{{\partial z}}{{\partial x}}\) and/or \(\frac{{\partial z}}{{\partial y}}\).

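Let’s start by trying to find \(\frac{{\partial z}}{{\partial x}}\). We will differentiate both sides with respect to \(x\) and we’ll need to remember that we’re going to be treating \(y\) as a constant. Also, the left side will require the chain rule. Here is this derivative.

\[\frac{{\partial F}}{{\partial x}} + \frac{{\partial F}}{{\partial z}}\frac{{\partial z}}{{\partial x}} = 0\hspace{0.5in} \Rightarrow \hspace{0.5in}\frac{{\partial z}}{{\partial x}} = - \frac{{{F_x}}}{{{F_z}}}\]

Differentiating with respect to \(y\) instead, and treating \(x\) as a constant, gives in the same way,

\[\frac{{\partial z}}{{\partial y}} = - \frac{{{F_y}}}{{{F_z}}}\]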
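Solving the differentiated equation for the unknown derivative gives \(\frac{{\partial z}}{{\partial x}} = - \frac{{{F_x}}}{{{F_z}}}\), and the same steps in \(y\) give \(\frac{{\partial z}}{{\partial y}} = - \frac{{{F_y}}}{{{F_z}}}\). The sketch below tries these on the unit sphere, a hypothetical example chosen because \(z\) can also be solved for explicitly on the upper hemisphere, so the two approaches can be compared. All names in the code are illustrative assumptions.

```python
import math

# Hypothetical example: the unit sphere F(x, y, z) = x**2 + y**2 + z**2 - 1 = 0,
# with the partial derivatives of F written out by hand.
F_x = lambda x, y, z: 2 * x
F_y = lambda x, y, z: 2 * y
F_z = lambda x, y, z: 2 * z

def dz_dx(x, y, z):
    """Partial of z wrt x from the implicit formula, assuming F_z is nonzero."""
    return -F_x(x, y, z) / F_z(x, y, z)

def dz_dy(x, y, z):
    """Partial of z wrt y from the implicit formula, assuming F_z is nonzero."""
    return -F_y(x, y, z) / F_z(x, y, z)

def z_explicit(x, y):
    """Upper hemisphere, where z can be solved for directly."""
    return math.sqrt(1 - x ** 2 - y ** 2)

x0, y0 = 0.2, 0.3
z0 = z_explicit(x0, y0)
h = 1e-6
# Central differences of the explicit z, holding the other variable fixed.
numeric_dz_dx = (z_explicit(x0 + h, y0) - z_explicit(x0 - h, y0)) / (2 * h)
numeric_dz_dy = (z_explicit(x0, y0 + h) - z_explicit(x0, y0 - h)) / (2 * h)

print(dz_dx(x0, y0, z0), numeric_dz_dx)
print(dz_dy(x0, y0, z0), numeric_dz_dy)
```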

If you go back and compare these answers to those that we found the first time around you will notice that they might appear to be different. However, if you take into account the minus sign that sits in the front of our answers here you will see that they are in fact the same.