The chain rule

Many functions we encounter in normal practice are compositions of other functions. For example, the function

is the composition of two other functions

This may be a little hard to see, so we introduce some additional variable names to make the structure clearer.

So that in computing h(x) we essentially compute the sequence

What happens when we have to compute the derivative of h(x)? One option is to expand the expression and then differentiate

So that

This option may not be a reasonable one for many problems. For example, imagine doing

this way.

Fortunately, there is a handy theorem for computing the derivative of a function that can be written as the composition of two or more other functions.

Theorem (The chain rule) If the function g(x) is differentiable at x and f(y) is differentiable at y = g(x), then the composite function

h(x) = f(g(x))

is differentiable at x and has derivative

\[ h'(x) = f'(g(x))\, g'(x). \]

The good news about this theorem is that it is actually easier to apply than it looks. All you have to be able to do is to look at an expression and identify the inner and outer functions. Applied to the example above, this gives something like

u = h(x) = f(g(x)) = f(y)

In time, and with further experience, you will be able to go directly from

to

An easier and less confusing way to apply the chain rule is to imagine peeling off layers of functions and differentiating one layer at a time. For example, consider the function

\[ \sin\left((1-x)^2\right). \]

The function is built up in three layers. The outer layer is the application of the sine function. The next layer in is the squaring function, and the innermost layer is the simple function 1-x. To differentiate this, we differentiate one layer at a time, getting

\[ \frac{d}{dx} \sin\left((1-x)^2\right) = \cos\left((1-x)^2\right) \cdot 2(1-x) \cdot (-1) = -2(1-x)\cos\left((1-x)^2\right). \]
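The layer-by-layer computation for the sine of the square of 1-x can be checked numerically. The following is a minimal sketch, not part of the original notes: the names h, h_prime_by_layers, and central_difference are made up for illustration, and the sample points and step size are arbitrary choices.

```python
import math


def h(x):
    """The three-layer example: sine of the square of (1 - x)."""
    return math.sin((1 - x) ** 2)


def h_prime_by_layers(x):
    """Differentiate one layer at a time and multiply the results."""
    inner = 1 - x                  # innermost layer: 1 - x
    squared = inner ** 2           # middle layer: squaring
    d_outer = math.cos(squared)    # derivative of sin(.), evaluated at (1 - x)^2
    d_middle = 2 * inner           # derivative of (.)^2, evaluated at 1 - x
    d_inner = -1.0                 # derivative of 1 - x
    return d_outer * d_middle * d_inner


def central_difference(func, x, step=1e-6):
    """Approximate func'(x) with a symmetric difference quotient."""
    return (func(x + step) - func(x - step)) / (2 * step)


# Compare the chain-rule value with the numerical estimate at a few points.
for x in (-1.0, 0.3, 2.0):
    print(x, h_prime_by_layers(x), central_difference(h, x))
```

At each sample point the two printed derivative values should agree to several decimal places, which is some evidence that no layer was skipped.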

Here is another example.

Now that we have seen how to apply the chain rule, let us go back and try to get a sense for where it comes from.

Proof of the chain rule

As always, we start from the definition.

The function we are working with is

\[ h(x) = f(g(x)). \]

We do know that f and g are differentiable.

From the definition we have

\[ h'(x) = \lim_{k \to 0} \frac{f(g(x+k)) - f(g(x))}{k}. \]

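The trick needed to get anywhere with this is to multiply and divide by the right factor, here g(x+k) - g(x).

\[ h'(x) = \lim_{k \to 0} \frac{f(g(x+k)) - f(g(x))}{g(x+k) - g(x)} \cdot \frac{g(x+k) - g(x)}{k} \tag{1} \]

(This step assumes g(x+k) ≠ g(x) for small nonzero k; a complete proof treats the remaining case separately.)

In order to simplify this, we introduce some new notation.

\[ y = g(x), \qquad \Delta y = g(x+k) - g(x) \]

This allows us to write the limit (1) as

\[ h'(x) = \lim_{k \to 0} \frac{f(y + \Delta y) - f(y)}{\Delta y} \cdot \frac{g(x+k) - g(x)}{k}. \]

Note that because g is differentiable, g is continuous. Thus, as k goes to 0, Δy must go to 0 as well. We can rewrite the last limit as

\[ h'(x) = \left( \lim_{\Delta y \to 0} \frac{f(y + \Delta y) - f(y)}{\Delta y} \right) \left( \lim_{k \to 0} \frac{g(x+k) - g(x)}{k} \right). \]

We now recognize each of these two limits as a derivative. By assumption, both derivatives exist, so we can write

\[ h'(x) = f'(y)\, g'(x) = f'(g(x))\, g'(x). \]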
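The multiply-and-divide step in the proof can be watched numerically: the difference quotient of the composite splits into two factors, and each factor settles toward a derivative as k shrinks. The sketch below is only an illustration under assumed choices. The functions f(y) = sin y and g(x) = x^2, the point x = 1.5, and the helper name split_quotient are not from the notes.

```python
import math


def f(y):
    """Outer function (an arbitrary illustrative choice)."""
    return math.sin(y)


def g(x):
    """Inner function (an arbitrary illustrative choice)."""
    return x ** 2


def split_quotient(x, k):
    """Return the two factors of the composite difference quotient.

    The first factor is (f(y + dy) - f(y)) / dy and the second is dy / k,
    where y = g(x) and dy = g(x + k) - g(x). Assumes dy is nonzero.
    """
    y = g(x)
    dy = g(x + k) - g(x)
    outer = (f(y + dy) - f(y)) / dy
    inner = dy / k
    return outer, inner


x = 1.5
exact = math.cos(g(x)) * 2 * x  # f'(g(x)) * g'(x) for these choices of f and g

# As k shrinks, the product of the two factors should approach the exact value.
for k in (1e-1, 1e-2, 1e-3, 1e-4):
    outer, inner = split_quotient(x, k)
    print(k, outer, inner, outer * inner, exact)
```

The first factor approaches f'(g(x)) and the second approaches g'(x), so their product approaches the chain-rule value.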

The Derivative of ln x

Here is a fancy application of the chain rule. We have already shown from the definition of the derivative that

\[ \frac{d}{dx} e^x = e^x. \]

The natural log function, ln x, is defined to be the inverse of the exponential function. In particular,

\[ e^{\ln x} = x. \]

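The left hand side is an example of a function composition f(g(x)), where

\[ f(x) = e^x, \qquad g(x) = \ln x. \]

By the chain rule,

\[ \frac{d}{dx} f(g(x)) = f'(g(x))\, g'(x) = e^{\ln x}\, g'(x) = x\, g'(x). \tag{2} \]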

At the same time, since f(g(x)) = x, we also have

\[ \frac{d}{dx} f(g(x)) = \frac{d}{dx} x = 1. \tag{3} \]

Setting (2) equal to (3) gives

\[ x\, g'(x) = 1 \]

or

\[ \frac{d}{dx} \ln x = \frac{1}{x}. \]
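The conclusion that the derivative of ln x is 1/x can be sanity-checked with a difference quotient. This is a rough sketch rather than part of the derivation: the helper name central_difference, the step size, and the sample points are all illustrative assumptions.

```python
import math


def central_difference(func, x, step=1e-6):
    """Approximate func'(x) with a symmetric difference quotient."""
    return (func(x + step) - func(x - step)) / (2 * step)


# Compare the numerical slope of ln x with 1/x at a few positive points.
for x in (0.5, 1.0, 2.0, 10.0):
    numeric = central_difference(math.log, x)
    print(x, numeric, 1 / x)
```

The two printed columns should agree closely at every sample point, and multiplying the numerical slope by x should give a value near 1, matching the relation obtained from (2) and (3).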

Homework

Section 3.5: 4, 5, 10, 11, 19, 20, 29, 30, 51, 52, 68