In Mapping a derivative, we saw that we can think of a derivative as a local scaling. What happens if we now want to find the derivative of a composite of two functions?

What does the mapping diagram of the composite \(h(x)=g(f(x))\) look like if \(f(x)=x^2+3\) and \(g(x)=\sqrt{x}\) (so that \(h(x)=\sqrt{x^2+3}\)), and we centre our mapping diagram at \(1\) on the input number line?

How does the diagram change as we zoom in?

- What does this tell us about the derivative \(h'(1)\)?

If we draw the mapping diagram at a scale of \(1\) unit per tick mark, we obtain the following, where the non-linear behaviour of \(f\) is clear, though that of \(g\) is less so:

We can now draw the composite arrows and extend them into lines:

We see that the green lines fail to meet at a point. If we zoom in, things now look very different:

What is going on here?

The composite function \(h(x)\) appears to be a local scaling, as all of the lines meet at a point. From the diagram, the scale factor appears to be \(\frac{1}{2}\).
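The zooming-in can also be imitated numerically. The sketch below is only an illustration, not part of the original diagrams: the helper name `scale_ratios` and the particular offsets are assumptions chosen for the example. It compares how far the output moves per unit of input movement at the coarse scale of one unit per tick mark and at a much finer scale.

```python
import math

def h(x):
    # Composite from the text: h(x) = g(f(x)) with f(x) = x^2 + 3, g(x) = sqrt(x)
    return math.sqrt(x * x + 3)

def scale_ratios(centre, offsets):
    # Hypothetical helper: for each offset d, the change in output
    # divided by the change in input, measured from the centre
    return [(h(centre + d) - h(centre)) / d for d in offsets]

# One unit per tick mark: the ratios differ a lot, so the lines cannot meet at a point
coarse = scale_ratios(1.0, [-2, -1, 1, 2])

# Zoomed in: the ratios are nearly equal, so h behaves like a scaling
fine = scale_ratios(1.0, [-0.002, -0.001, 0.001, 0.002])

print("coarse:", [round(r, 4) for r in coarse])
print("fine:  ", [round(r, 4) for r in fine])
```

If the ratios at the fine scale all sit close to a single value, that value is the apparent scale factor of the composite at \(1\).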

The function \(f(x)=x^2+3\) is locally a scaling with scale factor \(2\). We can see this either by looking at the mapping diagram carefully, or by noting that the derivative of \(f(x)\) is \(f'(x)=2x\), so that the scale factor at \(x=1\) is \(f'(1)=2\).

The function \(g(x)=\sqrt{x}\) is also locally a scaling when we zoom in. The scale factor is a little harder to make out reliably from the diagram, but we can use what we know about differentiation to calculate it. We have \(g'(x)=\dfrac{1}{2\sqrt{x}}\). In this diagram, the input to \(g\) is centred on \(f(1)=4\), so the local scale factor is \(g'(f(1))=g'(4)=\dfrac{1}{2\sqrt{4}}=\dfrac{1}{4}\).

Composing these, we obtain a local scaling with scale factor \(2\times\frac{1}{4}=\frac{1}{2}\), multiplying the scale factors as we saw in the Warm-up section. This calculated composite scale factor agrees with what we observed about the function \(h(x)\).

So we conclude that \(h'(1)=\frac{1}{2}\).
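The two individual scale factors and their product can be estimated numerically as well. This is a rough sketch under stated assumptions: the helper `central_difference` and its step size are choices made for this illustration, and the results are approximations rather than exact derivatives.

```python
import math

def f(x):
    return x * x + 3

def g(u):
    return math.sqrt(u)

def central_difference(func, a, step=1e-6):
    # Symmetric difference quotient: an estimate of the local scale factor at a
    return (func(a + step) - func(a - step)) / (2 * step)

scale_f = central_difference(f, 1.0)                   # scale factor of f at 1
scale_g = central_difference(g, f(1.0))                # scale factor of g at f(1) = 4
scale_h = central_difference(lambda x: g(f(x)), 1.0)   # scale factor of h at 1

print("f at 1:      ", scale_f)
print("g at f(1):   ", scale_g)
print("product:     ", scale_f * scale_g)
print("h at 1:      ", scale_h)
```

Note that \(g\) is evaluated at \(f(1)\), not at \(1\): the second scaling acts on the output of the first.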

- Generalising this, given any composite function \(h(x)=g(f(x))\), how can we find \(h'(a)\) for a given \(a\)?

Here is a somewhat generic sketch of a composite mapping diagram for \(h(x)=g(f(x))\), where we have shown the arrows and lines for \(h(x)\) but not for the individual functions \(f(x)\) or \(g(x)\):

The function \(f(x)\) gives a local scaling with scale factor \(f'(a)\), the derivative of \(f\) at \(a\).

The function \(g(x)\) is centred on \(f(a)\), so it gives rise to a local scaling with scale factor \(g'(f(a))\), the derivative of \(g\) at \(f(a)\).

Since \(h(x)\) is locally the composition of these two scalings, we see that \(h(x)\) is also locally a scaling near \(a\), and its scale factor is \(h'(a)\). The scale factor of the composition is the product of the scale factors of \(f\) and \(g\), so \(h'(a)=f'(a)\times g'(f(a))\).

This is the *chain rule*, which can therefore be written: \[\text{if $h(x)=g(f(x))$, then $h'(x)=g'(f(x))\times f'(x)$.}\]
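To see that the rule is not special to the square-root example, here is a small sketch that tries it on a different pair of functions. The choices \(f(x)=\sin x\), \(g(u)=u^3\) and \(a=0.7\) are assumptions made purely for illustration, as are the helper names; the derivatives are estimated numerically, so agreement is only approximate.

```python
import math

def derivative(func, a, step=1e-6):
    # Numerical estimate of the derivative of func at a
    return (func(a + step) - func(a - step)) / (2 * step)

def chain_rule_estimate(f, g, a):
    # Product of the two local scale factors: g'(f(a)) * f'(a)
    return derivative(g, f(a)) * derivative(f, a)

# Illustrative pair, not taken from the text
f = math.sin
g = lambda u: u ** 3
a = 0.7

via_chain_rule = chain_rule_estimate(f, g, a)
direct = derivative(lambda x: g(f(x)), a)          # differentiate the composite directly
exact = 3 * math.sin(a) ** 2 * math.cos(a)         # known derivative of sin(x)^3

print(via_chain_rule, direct, exact)
```

The three printed values should agree to several decimal places.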

There is another common way of writing the chain rule, which we can get by setting \(u=f(x)\) and \(y=h(x)\), so that \(y=g(f(x))=g(u)\). We then have \(f'(x)=\dfrac{du}{dx}\) and \(g'(f(x))=g'(u)\), which we can write as \(\dfrac{dy}{du}\). Substituting these into the previous version then gives us: \[\dfrac{dy}{dx}=\dfrac{dy}{du}\times \dfrac{du}{dx}.\]
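
As a check on this notation, here is the earlier example worked through in this form, taking \(u=x^2+3\) and \(y=\sqrt{u}\): \[\dfrac{dy}{dx}=\dfrac{dy}{du}\times\dfrac{du}{dx}=\dfrac{1}{2\sqrt{u}}\times 2x=\dfrac{x}{\sqrt{x^2+3}}.\] At \(x=1\) this gives \(\dfrac{1}{\sqrt{4}}=\dfrac{1}{2}\), agreeing with the scale factor found from the mapping diagram.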