commit
30004e02cc
@ -660,12 +660,20 @@ $$
|
||||
There are $n$ terms, each with exactly one of the $f_i$ differentiated. Multiplying the top and bottom of each term by $f_i$ shows that each term has the form $f \cdot f_i'/f_i$.
|
||||
|
||||
|
||||
With this, we can proceed. Each term $x-i$ has derivative $1$, so the answer to $f'(x)$, with $f$ as above, is $f'(x) = f(x)/(x-1) + f(x)/(x-2) + f(x)/(x-3) + f(x)/(x-4) + f(x)/(x-5)$, that is:
|
||||
With this, we can proceed. Each term $x-i$ has derivative $1$, so the answer to $f'(x)$, with $f$ as above, is
|
||||
|
||||
\begin{align*}
|
||||
f'(x) &= f(x)/(x-1) + f(x)/(x-2) + f(x)/(x-3)\\
|
||||
&+ f(x)/(x-4) + f(x)/(x-5),
|
||||
\end{align*}
|
||||
|
||||
$$
|
||||
f'(x) = (x-2)(x-3)(x-4)(x-5) + (x-1)(x-3)(x-4)(x-5) + (x-1)(x-2)(x-4)(x-5) + (x-1)(x-2)(x-3)(x-5) + (x-1)(x-2)(x-3)(x-4).
|
||||
$$
|
||||
That is
|
||||
|
||||
\begin{align*}
|
||||
f'(x) &= (x-2)(x-3)(x-4)(x-5) + (x-1)(x-3)(x-4)(x-5)\\
|
||||
&+ (x-1)(x-2)(x-4)(x-5) + (x-1)(x-2)(x-3)(x-5) \\
|
||||
&+ (x-1)(x-2)(x-3)(x-4).
|
||||
\end{align*}
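
As a quick sanity check, the expanded formula can be compared with a numerical estimate. This is only a sketch: the names `fprime` and `central`, the point `x0 = 2.5`, and the step size are illustrative choices, not part of the text.

```julia
# f(x) = (x-1)(x-2)(x-3)(x-4)(x-5), as in the text
f(x) = prod(x - i for i in 1:5)

# Expanded derivative: the k-th term omits the factor (x - k)
fprime(x) = sum(prod(x - i for i in 1:5 if i != k) for k in 1:5)

# Central difference, used only as an independent numerical check
central(f, x; h=1e-6) = (f(x + h) - f(x - h)) / (2h)

x0 = 2.5
fprime(x0), central(f, x0)
```

The two values should agree to several digits. Away from the roots $1, \dots, 5$, `fprime(x)` also equals the sum of the terms `f(x)/(x - i)`.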
|
||||
|
||||
---
|
||||
|
||||
@ -680,8 +688,7 @@ $$
|
||||
### Chain rule
|
||||
|
||||
|
||||
Finally, the derivative of a composition of functions can be computed using pieces of each function. This gives a rule called the *chain rule*. Before deriving, let's give a slight motivation.
|
||||
|
||||
Finally, the derivative of a composition of functions can be computed using pieces of each function. This gives a rule called the *chain rule*. Before deriving, let's give a slight motivation through an example.
|
||||
|
||||
Consider the output of a factory for some widget. It depends on two steps: an initial manufacturing step and a finishing step. The amount initially manufactured depends on the number of employees. Suppose $x$ is the number of employees and $g(x)$ is the amount initially manufactured. Adding more employees increases the amount made according to the made-up rule $g(x) = \sqrt{x}$. The finishing step depends on how much is made by the employees. If $y$ is the amount made, then $f(y)$ is the number of widgets finished. Suppose for some reason that $f(y) = y^2$.
|
||||
|
||||
@ -806,16 +813,19 @@ This is a useful rule to remember for expressions involving exponentials.
|
||||
Find the derivative of $\sin(x)\cos(2x)$ at $x=\pi$.
|
||||
|
||||
|
||||
$$
|
||||
[\sin(x)\cos(2x)]'\big|_{x=\pi} =
|
||||
(\cos(x)\cos(2x) + \sin(x)(-\sin(2x)\cdot 2))\big|_{x=\pi} =
|
||||
((-1)(1) + (0)(-0)(2)) = -1.
|
||||
$$
|
||||
\begin{align*}
|
||||
[\sin(x)\cos(2x)]'\big|_{x=\pi} &=(\cos(x)\cos(2x) + \sin(x)(-\sin(2x)\cdot 2))\big|_{x=\pi} \\
|
||||
& =((-1)(1) + (0)(-0)(2)) = -1.
|
||||
\end{align*}
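
The value $-1$ can be checked numerically. A minimal sketch, assuming nothing beyond base Julia; the names `u` and `du` and the step size `h` are illustrative.

```julia
u(x) = sin(x) * cos(2x)

# Product rule combined with the chain rule, as computed in the text
du(x) = cos(x) * cos(2x) + sin(x) * (-sin(2x) * 2)

# Central difference at x = pi as an independent check
h = 1e-6
approx = (u(pi + h) - u(pi - h)) / (2h)

du(pi), approx
```

Both values should be close to $-1$, up to floating point error.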
|
||||
|
||||
##### Proof of the Chain Rule
|
||||
|
||||
|
||||
A function is *differentiable* at $a$ if the following limit exists $\lim_{h \rightarrow 0}(f(a+h)-f(a))/h$. Reexpressing this as: $f(a+h) - f(a) - f'(a)h = \epsilon_f(h) h$ where as $h\rightarrow 0$, $\epsilon_f(h) \rightarrow 0$. Then, we have:
|
||||
A function is *differentiable* at $a$ if the following limit exists: $\lim_{h \rightarrow 0}(f(a+h)-f(a))/h$.
|
||||
Writing $f'(a)$ for this limit, this can be reexpressed as $f(a+h) - f(a) - f'(a)h = \epsilon_f(h) h$, where $\epsilon_f(h) \rightarrow 0$ as $h\rightarrow 0$.
|
||||
|
||||
|
||||
With that in mind, we have:
|
||||
|
||||
|
||||
$$
|
||||
@ -834,11 +844,12 @@ f(g(a) + g'(a)h + \epsilon_g(h)h) - f(g(a)) \\
|
||||
Rearranging:
|
||||
|
||||
|
||||
$$
|
||||
f(g(a+h)) - f(g(a)) - f'(g(a)) g'(a) h = f'(g(a))\epsilon_g(h)h + \epsilon_f(h')(h') =
|
||||
(f'(g(a)) \epsilon_g(h) + \epsilon_f(h') (g'(a) + \epsilon_g(h)))h =
|
||||
\epsilon(h)h,
|
||||
$$
|
||||
\begin{align*}
|
||||
f(g(a+h)) &- f(g(a)) - f'(g(a)) g'(a) h\\
|
||||
&= f'(g(a))\epsilon_g(h)h + \epsilon_f(h')(h')\\
|
||||
&=(f'(g(a)) \epsilon_g(h) + \epsilon_f(h') (g'(a) + \epsilon_g(h)))h \\
|
||||
&=\epsilon(h)h,
|
||||
\end{align*}
|
||||
|
||||
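where $\epsilon(h)$ collects the terms above into one. Each goes to zero as $h\rightarrow 0$: $\epsilon_g(h) \rightarrow 0$ directly, and $\epsilon_f(h') \rightarrow 0$ because $h' = (g'(a) + \epsilon_g(h))h \rightarrow 0$. This is the alternative definition of the derivative, showing $(f\circ g)'(a) = f'(g(a)) g'(a)$ when $g$ is differentiable at $a$ and $f$ is differentiable at $g(a)$.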
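
The shrinking remainder can be seen numerically. The following is a sketch under assumed choices: $f(y) = \sin(y)$, $g(x) = x^2$, and $a = 1$ are picked only for illustration, and `err` is a hypothetical name for the combined $\epsilon(h)$.

```julia
f(y) = sin(y);  fp(y) = cos(y)   # f and its derivative (illustrative choice)
g(x) = x^2;     gp(x) = 2x       # g and its derivative (illustrative choice)
a = 1.0

# The combined epsilon from the rearranged identity: the remainder divided by h
err(h) = (f(g(a + h)) - f(g(a)) - fp(g(a)) * gp(a) * h) / h

[(h, err(h)) for h in (1e-1, 1e-2, 1e-3, 1e-4)]
```

The second component of each pair should shrink toward zero along with $h$, which is what the identity requires.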
|
||||
|
||||
|
@ -45,7 +45,7 @@ As such there is a balancing act:
|
||||
|
||||
|
||||
* if $h$ is too small the round-off errors are problematic,
|
||||
* if $h$ is too big, the approximation to the limit is not good.
|
||||
* if $h$ is too big the approximation to the limit is not good.
|
||||
|
||||
|
||||
For the forward difference, values of $h$ around $10^{-8}$ are typically good; for the central difference, values around $10^{-6}$ are typically good.
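
The balancing act can be observed by tabulating the error over a range of $h$ values. This is a sketch only: the helper names `fwd` and `ctr`, the test function `sin`, and the point `c = 1` are assumptions made for illustration.

```julia
# Forward and central difference approximations to f'(c)
fwd(f, c, h) = (f(c + h) - f(c)) / h
ctr(f, c, h) = (f(c + h) - f(c - h)) / (2h)

# The derivative of sin at 1 is cos(1), which serves as the exact value
c, exact = 1.0, cos(1.0)

for h in (1e-2, 1e-4, 1e-6, 1e-8, 1e-10, 1e-12)
    println((h = h,
             forward = abs(fwd(sin, c, h) - exact),
             central = abs(ctr(sin, c, h) - exact)))
end
```

The errors typically fall as $h$ shrinks and then rise again once round-off dominates, with the turning point arriving at a larger $h$ for the central difference.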
|
||||
@ -70,7 +70,7 @@ We can compare to the actual with:
|
||||
```{julia}
|
||||
@syms x
|
||||
df = diff(f(x), x)
|
||||
factual = N(df(c))
|
||||
factual = convert(Float64, df(c))
|
||||
abs(factual - fapprox)
|
||||
```
|
||||
|
||||
@ -136,16 +136,16 @@ The forward derivative is found with:
|
||||
|
||||
|
||||
```{julia}
|
||||
𝒇(x) = sqrt(1 + sin(cos(x)))
|
||||
𝒄, 𝒉 = pi/4, 1e-8
|
||||
fwd = (𝒇(𝒄+𝒉) - 𝒇(𝒄))/𝒉
|
||||
f(x) = sqrt(1 + sin(cos(x)))
|
||||
c, h = pi/4, 1e-8
|
||||
fwd = (f(c+h) - f(c))/h
|
||||
```
|
||||
|
||||
The value given by `D` is:
|
||||
|
||||
|
||||
```{julia}
|
||||
ds_value = D(𝒇)(𝒄)
|
||||
ds_value = D(f)(c)
|
||||
ds_value, fwd, ds_value - fwd
|
||||
```
|
||||
|
||||
@ -153,11 +153,11 @@ Finally, `SymPy` gives an exact value we use to compare:
|
||||
|
||||
|
||||
```{julia}
|
||||
𝒇𝒑 = diff(𝒇(x), x)
|
||||
fp = diff(f(x), x)
|
||||
```
|
||||
|
||||
```{julia}
|
||||
actual = N(𝒇𝒑(PI/4))
|
||||
actual = convert(Float64, fp(PI/4))
|
||||
actual - ds_value, actual - fwd
|
||||
```
|
||||
|
||||
|
@ -104,7 +104,7 @@ function D(::Val{:+}, ::Val{:nary}, args, var)
|
||||
end
|
||||
```
|
||||
|
||||
The `args` are always held in a container, so the unary method must pull out the first one. The binary case should read as: apply `D` to each of the two arguments, and then create a quoted expression containing the sum of the results. The dollar signs interpolate into the quoting. (The "primes" are unicode notation achieved through `\prime[tab]` and not operations.) The *nary* case does something similar, only uses splatting to produce the sum.
|
||||
The `args` are always held in a container, so the unary method must pull out the first one. The binary case should read as: apply `D` to each of the two arguments, and then create a quoted expression containing the sum of the results. The dollar signs interpolate into the quoting. (The "primes" are unicode notation achieved through `\prime[tab]` and not operations.) The *nary* case does something similar, only using splatting to produce the sum.
|
||||
|
||||
|
||||
Subtraction must also be implemented in a similar manner, but not for the *nary* case:
|
||||
@ -195,7 +195,15 @@ function D(::Val{:cos}, ::Val{:unary}, args, var)
|
||||
end
|
||||
```
|
||||
|
||||
The pattern is similar for each. The `$a′` factor is needed due to the *chain rule*. The above illustrates the simple pattern necessary to add a derivative rule for a function. More could be, but for this example the above will suffice, as now the system is ready to be put to work.
|
||||
The pattern is similar for each. The `$a′` factor is needed due to the *chain rule*. The above illustrates the simple pattern necessary to add a derivative rule for a function.
|
||||
|
||||
:::{.callout-note}
|
||||
Several automatic differentiation packages use a set of rules defined following an interface spelled out in the package `ChainRules.jl`. With multi-dimensional derivatives, the chain rule is the only one of the sum, product, quotient, and chain rules that is needed.
|
||||
|
||||
:::
|
||||
|
||||
|
||||
More functions could be included, but for this example the above will suffice, as now the system is ready to be put to work.
|
||||
|
||||
|
||||
```{julia}
|
||||
@ -223,5 +231,3 @@ D(D(ex₃, :x), :x)
|
||||
```
|
||||
|
||||
The length of the expression should lead to further appreciation for the simplification steps taken when doing such a computation by hand.
|
||||
|
||||
|
||||
|