Merge pull request #111 from jverzani/v0.18

V0.18
john verzani 2023-06-27 18:39:39 -04:00 committed by GitHub
commit 30004e02cc
3 changed files with 46 additions and 29 deletions


@@ -660,12 +660,20 @@ $$
There are $n$ terms, in each of which exactly one of the $f_i$ is differentiated. Were we to multiply the top and bottom of each term by $f_i$, each term would look like $f \cdot f_i'/f_i$.
With this, we can proceed. Each term $x-i$ has derivative $1$, so the answer to $f'(x)$, with $f$ as above, is $f'(x) = f(x)/(x-1) + f(x)/(x-2) + f(x)/(x-3) + f(x)/(x-4) + f(x)/(x-5)$, that is:
With this, we can proceed. Each term $x-i$ has derivative $1$, so the answer to $f'(x)$, with $f$ as above, is
\begin{align*}
f'(x) &= f(x)/(x-1) + f(x)/(x-2) + f(x)/(x-3)\\
&+ f(x)/(x-4) + f(x)/(x-5),
\end{align*}
$$
f'(x) = (x-2)(x-3)(x-4)(x-5) + (x-1)(x-3)(x-4)(x-5) + (x-1)(x-2)(x-4)(x-5) + (x-1)(x-2)(x-3)(x-5) + (x-1)(x-2)(x-3)(x-4).
$$
That is
\begin{align*}
f'(x) &= (x-2)(x-3)(x-4)(x-5) + (x-1)(x-3)(x-4)(x-5)\\
&+ (x-1)(x-2)(x-4)(x-5) + (x-1)(x-2)(x-3)(x-5) \\
&+ (x-1)(x-2)(x-3)(x-4).
\end{align*}
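As a quick sanity check, the two forms of $f'(x)$ above can be compared numerically. This is only a sketch: the names `p`, `dp_quotient`, `dp_product`, and the test point `t` are made up for illustration, and the quotient form is only defined when $x$ is not one of $1, 2, \dots, 5$.

```{julia}
# p(x) = (x-1)(x-2)(x-3)(x-4)(x-5); the name `p` is hypothetical
p(x) = prod(x - i for i in 1:5)

# p'(x) as the sum of p(x)/(x - i); only valid away from the roots 1, ..., 5
dp_quotient(x) = sum(p(x) / (x - i) for i in 1:5)

# p'(x) as the sum of products, each omitting one factor; valid for every x
dp_product(x) = sum(prod(x - j for j in 1:5 if j != i) for i in 1:5)

t = 2.5   # an arbitrary test point, chosen away from the roots
dp_quotient(t), dp_product(t)
```

The two values should agree up to floating point round-off.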
---
@@ -680,8 +688,7 @@ $$
### Chain rule
Finally, the derivative of a composition of functions can be computed using pieces of each function. This gives a rule called the *chain rule*. Before deriving, let's give a slight motivation.
Finally, the derivative of a composition of functions can be computed using pieces of each function. This gives a rule called the *chain rule*. Before deriving, let's give a slight motivation through an example.
Consider the output of a factory for some widget. It depends on two steps: an initial manufacturing step and a finishing step. The number of employees is important in how much is initially manufactured. Suppose $x$ is the number of employees and $g(x)$ is the amount initially manufactured. Adding more employees increases the amount made by the made-up rule $g(x) = \sqrt{x}$. The finishing step depends on how much is made by the employees. If $y$ is the amount made, then $f(y)$ is the number of widgets finished. Suppose for some reason that $f(y) = y^2.$
@@ -806,16 +813,19 @@ This is a useful rule to remember for expressions involving exponentials.
Find the derivative of $\sin(x)\cos(2x)$ at $x=\pi$.
$$
[\sin(x)\cos(2x)]'\big|_{x=\pi} =
(\cos(x)\cos(2x) + \sin(x)(-\sin(2x)\cdot 2))\big|_{x=\pi} =
((-1)(1) + (0)(-0)(2)) = -1.
$$
\begin{align*}
[\sin(x)\cos(2x)]'\big|_{x=\pi} &=(\cos(x)\cos(2x) + \sin(x)(-\sin(2x)\cdot 2))\big|_{x=\pi} \\
& =((-1)(1) + (0)(-0)(2)) = -1.
\end{align*}
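The value $-1$ can be checked numerically. The following is a minimal sketch, assuming a central difference with a step size of $10^{-6}$; the names `u`, `du`, `δ`, and `approx` are introduced here only for illustration.

```{julia}
# the function being differentiated
u(x) = sin(x) * cos(2x)

# its derivative, as found above with the product and chain rules
du(x) = cos(x) * cos(2x) + sin(x) * (-sin(2x) * 2)

# a central difference approximation at pi, with an assumed step size δ
δ = 1e-6
approx = (u(pi + δ) - u(pi - δ)) / (2δ)

du(pi), approx
```

Both values should be close to $-1$, differing only by round-off and approximation error.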
##### Proof of the Chain Rule
A function is *differentiable* at $a$ if the following limit exists $\lim_{h \rightarrow 0}(f(a+h)-f(a))/h$. Reexpressing this as: $f(a+h) - f(a) - f'(a)h = \epsilon_f(h) h$ where as $h\rightarrow 0$, $\epsilon_f(h) \rightarrow 0$. Then, we have:
A function is *differentiable* at $a$ if the following limit exists $\lim_{h \rightarrow 0}(f(a+h)-f(a))/h$.
This is reexpressed as: $f(a+h) - f(a) - f'(a)h = \epsilon_f(h) h$ where as $h\rightarrow 0$, $\epsilon_f(h) \rightarrow 0$.
With that in mind, we have:
$$
@@ -834,11 +844,12 @@ f(g(a) + g'(a)h + \epsilon_g(h)h) - f(g(a)) \\
Rearranging:
$$
f(g(a+h)) - f(g(a)) - f'(g(a)) g'(a) h = f'(g(a))\epsilon_g(h)h + \epsilon_f(h')(h') =
(f'(g(a)) \epsilon_g(h) + \epsilon_f(h') (g'(a) + \epsilon_g(h)))h =
\epsilon(h)h,
$$
\begin{align*}
f(g(a+h)) &- f(g(a)) - f'(g(a)) g'(a) h\\
&= f'(g(a))\epsilon_g(h)h + \epsilon_f(h')(h')\\
&=(f'(g(a)) \epsilon_g(h) + \epsilon_f(h') (g'(a) + \epsilon_g(h)))h \\
&=\epsilon(h)h,
\end{align*}
where $\epsilon(h)$ combines the above terms which go to zero as $h\rightarrow 0$ into one. This is the alternative definition of the derivative, showing $(f\circ g)'(a) = f'(g(a)) g'(a)$ when $g$ is differentiable at $a$ and $f$ is differentiable at $g(a)$.
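The claim that $\epsilon(h) \rightarrow 0$ can be illustrated numerically. This is a sketch under assumed choices: the functions `outer` and `inner`, their derivatives, and the point `a` are hypothetical examples, not taken from the text.

```{julia}
# hypothetical outer and inner functions with known derivatives
outer(y) = sin(y)
douter(y) = cos(y)
inner(x) = x^2
dinner(x) = 2x

a = 1.5
chain = douter(inner(a)) * dinner(a)   # f'(g(a)) g'(a)

# ϵ(h) solves f(g(a+h)) - f(g(a)) - f'(g(a)) g'(a) h = ϵ(h) h
ϵ(h) = (outer(inner(a + h)) - outer(inner(a))) / h - chain

[ϵ(h) for h in (1e-1, 1e-2, 1e-3, 1e-4)]
```

The values should shrink in magnitude as $h$ gets smaller, consistent with the alternative definition of the derivative.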


@@ -45,7 +45,7 @@ As such there is a balancing act:
* if $h$ is too small the round-off errors are problematic,
* if $h$ is too big, the approximation to the limit is not good.
* if $h$ is too big the approximation to the limit is not good.
For the forward difference, values of $h$ around $10^{-8}$ are typically good; for the central difference, values around $10^{-6}$ are typically good.
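The balancing act can be seen by tabulating the error for several values of $h$. This is a minimal sketch, assuming a test function with a known derivative; the names `s`, `pt`, `exact`, `forward`, and `central` are hypothetical and chosen to avoid clashing with names used elsewhere.

```{julia}
s(x) = sin(x)        # hypothetical test function; its derivative is cos(x)
pt = 1.0
exact = cos(pt)

forward(fn, x, h) = (fn(x + h) - fn(x)) / h
central(fn, x, h) = (fn(x + h) - fn(x - h)) / (2h)

for h in (1e-2, 1e-4, 1e-6, 1e-8, 1e-10, 1e-12)
    err_f = abs(forward(s, pt, h) - exact)
    err_c = abs(central(s, pt, h) - exact)
    println("h = ", h, "  forward error = ", err_f, "  central error = ", err_c)
end
```

For this function the errors should first decrease as $h$ shrinks and then grow again once round-off dominates, though the exact values depend on the function and the point.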
@@ -70,7 +70,7 @@ We can compare to the actual with:
```{julia}
@syms x
df = diff(f(x), x)
factual = N(df(c))
factual = convert(Float64, df(c))
abs(factual - fapprox)
```
@@ -136,16 +136,16 @@ The forward derivative is found with:
```{julia}
𝒇(x) = sqrt(1 + sin(cos(x)))
𝒄, 𝒉 = pi/4, 1e-8
fwd = (𝒇(𝒄+𝒉) - 𝒇(𝒄))/𝒉
f(x) = sqrt(1 + sin(cos(x)))
c, h = pi/4, 1e-8
fwd = (f(c+h) - f(c))/h
```
That given by `D` is:
```{julia}
ds_value = D(𝒇)(𝒄)
ds_value = D(f)(c)
ds_value, fwd, ds_value - fwd
```
@@ -153,11 +153,11 @@ Finally, `SymPy` gives an exact value we use to compare:
```{julia}
𝒇𝒑 = diff(𝒇(x), x)
fp = diff(f(x), x)
```
```{julia}
actual = N(𝒇𝒑(PI/4))
actual = convert(Float64, fp(PI/4))
actual - ds_value, actual - fwd
```


@@ -104,7 +104,7 @@ function D(::Val{:+}, ::Val{:nary}, args, var)
end
```
The `args` are always held in a container, so the unary method must pull out the first one. The binary case should read as: apply `D` to each of the two arguments, and then create a quoted expression containing the sum of the results. The dollar signs interpolate into the quoting. (The "primes" are unicode notation achieved through `\prime[tab]` and not operations.) The *nary* case does something similar, only uses splatting to produce the sum.
The `args` are always held in a container, so the unary method must pull out the first one. The binary case should read as: apply `D` to each of the two arguments, and then create a quoted expression containing the sum of the results. The dollar signs interpolate into the quoting. (The "primes" are unicode notation achieved through `\prime[tab]` and not operations.) The *nary* case does something similar, only using splatting to produce the sum.
Subtraction must also be implemented in a similar manner, but not for the *nary* case:
@@ -195,7 +195,15 @@ function D(::Val{:cos}, ::Val{:unary}, args, var)
end
```
The pattern is similar for each. The `$a` factor is needed due to the *chain rule*. The above illustrates the simple pattern necessary to add a derivative rule for a function. More could be, but for this example the above will suffice, as now the system is ready to be put to work.
The pattern is similar for each. The `$a` factor is needed due to the *chain rule*. The above illustrates the simple pattern necessary to add a derivative rule for a function.
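The role of the chain rule factor can be seen in a stripped-down sketch. The function `d` below is hypothetical and much simpler than the `D` of the text: it branches on the operator rather than dispatching on `Val` types, and it handles only a binary product, `sin`, and `cos`.

```{julia}
# A hypothetical, simplified differentiator; `d` is not the `D` defined above
d(ex::Number, var::Symbol) = 0
d(ex::Symbol, var::Symbol) = ex == var ? 1 : 0

function d(ex::Expr, var::Symbol)
    ex.head == :call || error("only function calls are handled in this sketch")
    op, args = ex.args[1], ex.args[2:end]
    if op == :* && length(args) == 2
        a, b = args
        da, db = d(a, var), d(b, var)
        return :($da * $b + $a * $db)      # product rule
    elseif op == :sin && length(args) == 1
        a = args[1]
        da = d(a, var)
        return :(cos($a) * $da)            # the factor da comes from the chain rule
    elseif op == :cos && length(args) == 1
        a = args[1]
        da = d(a, var)
        return :(-sin($a) * $da)           # likewise for cos
    end
    error("no rule for $op in this sketch")
end

d(:(sin(2 * x)), :x)
```

No simplification is attempted, so terms such as `0 * x` remain in the result.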
:::{.callout-note}
Several automatic differentiation packages use a set of rules defined following an interface spelled out in the package `ChainRules.jl`. When multi-dimensional derivatives are used, the chain rule is the only one of the sum, product, quotient, and chain rules that is needed.
:::
More functions could be included, but for this example the above will suffice, as now the system is ready to be put to work.
```{julia}
@@ -223,5 +231,3 @@ D(D(ex₃, :x), :x)
```
The length of the expression should lead to further appreciation for simplification steps taken when doing such a computation by hand.