Merge pull request #111 from jverzani/v0.18

V0.18
john verzani
2023-06-27 18:39:39 -04:00
committed by GitHub
3 changed files with 46 additions and 29 deletions

View File

@@ -660,12 +660,20 @@ $$
 There are $n$ terms, each where one of the $f_i$s have a derivative. Were we to multiply top and bottom by $f_i$, we would get each term looks like: $f \cdot f_i'/f_i$.
-With this, we can proceed. Each term $x-i$ has derivative $1$, so the answer to $f'(x)$, with $f$ as above, is $f'(x) = f(x)/(x-1) + f(x)/(x-2) + f(x)/(x-3) + f(x)/(x-4) + f(x)/(x-5)$, that is:
+With this, we can proceed. Each term $x-i$ has derivative $1$, so the answer to $f'(x)$, with $f$ as above, is
+\begin{align*}
+f'(x) &= f(x)/(x-1) + f(x)/(x-2) + f(x)/(x-3)\\
+&+ f(x)/(x-4) + f(x)/(x-5),
+\end{align*}
-$$
-f'(x) = (x-2)(x-3)(x-4)(x-5) + (x-1)(x-3)(x-4)(x-5) + (x-1)(x-2)(x-4)(x-5) + (x-1)(x-2)(x-3)(x-5) + (x-1)(x-2)(x-3)(x-4).
-$$
+That is
+\begin{align*}
+f'(x) &= (x-2)(x-3)(x-4)(x-5) + (x-1)(x-3)(x-4)(x-5)\\
+&+ (x-1)(x-2)(x-4)(x-5) + (x-1)(x-2)(x-3)(x-5) \\
+&+ (x-1)(x-2)(x-3)(x-4).
+\end{align*}
 ---
@@ -680,8 +688,7 @@ $$
 ### Chain rule
-Finally, the derivative of a composition of functions can be computed using pieces of each function. This gives a rule called the *chain rule*. Before deriving, let's give a slight motivation.
+Finally, the derivative of a composition of functions can be computed using pieces of each function. This gives a rule called the *chain rule*. Before deriving, let's give a slight motivation through an example.
 Consider the output of a factory for some widget. It depends on two steps: an initial manufacturing step and a finishing step. The number of employees is important in how much is initially manufactured. Suppose $x$ is the number of employees and $g(x)$ is the amount initially manufactured. Adding more employees increases the amount made by the made-up rule $g(x) = \sqrt{x}$. The finishing step depends on how much is made by the employees. If $y$ is the amount made, then $f(y)$ is the number of widgets finished. Suppose for some reason that $f(y) = y^2.$
@@ -806,16 +813,19 @@ This is a useful rule to remember for expressions involving exponentials.
 Find the derivative of $\sin(x)\cos(2x)$ at $x=\pi$.
-$$
-[\sin(x)\cos(2x)]'\big|_{x=\pi} =
-(\cos(x)\cos(2x) + \sin(x)(-\sin(2x)\cdot 2))\big|_{x=\pi} =
-((-1)(1) + (0)(-0)(2)) = -1.
-$$
+\begin{align*}
+[\sin(x)\cos(2x)]'\big|_{x=\pi} &=(\cos(x)\cos(2x) + \sin(x)(-\sin(2x)\cdot 2))\big|_{x=\pi} \\
+& =((-1)(1) + (0)(-0)(2)) = -1.
+\end{align*}
 ##### Proof of the Chain Rule
-A function is *differentiable* at $a$ if the following limit exists $\lim_{h \rightarrow 0}(f(a+h)-f(a))/h$. Reexpressing this as: $f(a+h) - f(a) - f'(a)h = \epsilon_f(h) h$ where as $h\rightarrow 0$, $\epsilon_f(h) \rightarrow 0$. Then, we have:
+A function is *differentiable* at $a$ if the following limit exists $\lim_{h \rightarrow 0}(f(a+h)-f(a))/h$.
+This is reexpressed as: $f(a+h) - f(a) - f'(a)h = \epsilon_f(h) h$ where as $h\rightarrow 0$, $\epsilon_f(h) \rightarrow 0$.
+With that in mind, we have:
 $$
@@ -834,11 +844,12 @@ f(g(a) + g'(a)h + \epsilon_g(h)h) - f(g(a)) \\
 Rearranging:
-$$
-f(g(a+h)) - f(g(a)) - f'(g(a)) g'(a) h = f'(g(a))\epsilon_g(h)h + \epsilon_f(h')(h') =
-(f'(g(a)) \epsilon_g(h) + \epsilon_f(h') (g'(a) + \epsilon_g(h)))h =
-\epsilon(h)h,
-$$
+\begin{align*}
+f(g(a+h)) &- f(g(a)) - f'(g(a)) g'(a) h\\
+&= f'(g(a))\epsilon_g(h)h + \epsilon_f(h')(h')\\
+&=(f'(g(a)) \epsilon_g(h) + \epsilon_f(h') (g'(a) + \epsilon_g(h)))h \\
+&=\epsilon(h)h,
+\end{align*}
 where $\epsilon(h)$ combines the above terms which go to zero as $h\rightarrow 0$ into one. This is the alternative definition of the derivative, showing $(f\circ g)'(a) = f'(g(a)) g'(a)$ when $g$ is differentiable at $a$ and $f$ is differentiable at $g(a)$.
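
The alternative definition used in this proof can be checked numerically. The following is a minimal sketch, not part of the commit: the functions `f(y) = sin(y)` and `g(x) = x^2`, the point `a = 1.0`, and the helper name `epsilon` are all assumptions chosen for illustration. It computes the remainder term $\epsilon(h)$ for a few shrinking values of $h$.

```julia
# Hypothetical example functions, not taken from the commit
f(y) = sin(y)
fp(y) = cos(y)      # derivative of f
g(x) = x^2
gp(x) = 2x          # derivative of g

a = 1.0

# ϵ(h) from: f(g(a+h)) - f(g(a)) - f'(g(a)) g'(a) h = ϵ(h) h
epsilon(h) = (f(g(a + h)) - f(g(a)) - fp(g(a)) * gp(a) * h) / h

hs = [1e-1, 1e-2, 1e-3, 1e-4]
errs = [abs(epsilon(h)) for h in hs]

for (h, e) in zip(hs, errs)
    println("h = ", h, "  |ϵ(h)| = ", e)
end
```

If the chain rule holds for this pair, the printed values should shrink toward zero along with $h$.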

View File

@@ -45,7 +45,7 @@ As such there is a balancing act:
 * if $h$ is too small the round-off errors are problematic,
-* if $h$ is too big, the approximation to the limit is not good.
+* if $h$ is too big the approximation to the limit is not good.
 For the forward difference $h$ values around $10^{-8}$ are typically good, for the central difference, values around $10^{-6}$ are typically good.
@@ -70,7 +70,7 @@ We can compare to the actual with:
 ```{julia}
 @syms x
 df = diff(f(x), x)
-factual = N(df(c))
+factual = convert(Float64, df(c))
 abs(factual - fapprox)
 ```
@@ -136,16 +136,16 @@ The forward derivative is found with:
 ```{julia}
-𝒇(x) = sqrt(1 + sin(cos(x)))
-𝒄, 𝒉 = pi/4, 1e-8
-fwd = (𝒇(𝒄+𝒉) - 𝒇(𝒄))/𝒉
+f(x) = sqrt(1 + sin(cos(x)))
+c, h = pi/4, 1e-8
+fwd = (f(c+h) - f(c))/h
 ```
 That given by `D` is:
 ```{julia}
-ds_value = D(𝒇)(𝒄)
+ds_value = D(f)(c)
 ds_value, fwd, ds_value - fwd
 ```
@@ -153,11 +153,11 @@ Finally, `SymPy` gives an exact value we use to compare:
 ```{julia}
-𝒇𝒑 = diff(𝒇(x), x)
+fp = diff(f(x), x)
 ```
 ```{julia}
-actual = N(𝒇𝒑(PI/4))
+actual = convert(Float64, fp(PI/4))
 actual - ds_value, actual - fwd
 ```
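
The comparison these hunks touch can be reproduced without `SymPy` or `D`. The sketch below is an illustration only and is not code from the commit: the helper names `forward`, `central`, and `fprime` are hypothetical, and `fprime` is a derivative worked out by hand with the chain rule. It uses the step sizes the surrounding text suggests, $10^{-8}$ for the forward difference and $10^{-6}$ for the central difference.

```julia
f(x) = sqrt(1 + sin(cos(x)))

# Hand-computed derivative (chain rule applied twice); an assumption of this sketch
fprime(x) = -sin(x) * cos(cos(x)) / (2 * sqrt(1 + sin(cos(x))))

# Hypothetical helper names for the two difference quotients
forward(f, c, h) = (f(c + h) - f(c)) / h
central(f, c, h) = (f(c + h) - f(c - h)) / (2h)

c = pi/4
fwd_err = abs(forward(f, c, 1e-8) - fprime(c))
ctr_err = abs(central(f, c, 1e-6) - fprime(c))

println("forward difference error: ", fwd_err)
println("central difference error: ", ctr_err)
```

Both errors should be small, with the central difference typically the more accurate of the two at these step sizes.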

View File

@@ -104,7 +104,7 @@ function D(::Val{:+}, ::Val{:nary}, args, var)
 end
 ```
-The `args` are always held in a container, so the unary method must pull out the first one. The binary case should read as: apply `D` to each of the two arguments, and then create a quoted expression containing the sum of the results. The dollar signs interpolate into the quoting. (The "primes" are unicode notation achieved through `\prime[tab]` and not operations.) The *nary* case does something similar, only uses splatting to produce the sum.
+The `args` are always held in a container, so the unary method must pull out the first one. The binary case should read as: apply `D` to each of the two arguments, and then create a quoted expression containing the sum of the results. The dollar signs interpolate into the quoting. (The "primes" are unicode notation achieved through `\prime[tab]` and not operations.) The *nary* case does something similar, only using splatting to produce the sum.
 Subtraction must also be implemented in a similar manner, but not for the *nary* case:
@@ -195,7 +195,15 @@ function D(::Val{:cos}, ::Val{:unary}, args, var)
 end
 ```
-The pattern is similar for each. The `$a` factor is needed due to the *chain rule*. The above illustrates the simple pattern necessary to add a derivative rule for a function. More could be, but for this example the above will suffice, as now the system is ready to be put to work.
+The pattern is similar for each. The `$a` factor is needed due to the *chain rule*. The above illustrates the simple pattern necessary to add a derivative rule for a function.
+:::{.callout-note}
+Several automatic differentiation packages use a set of rules defined following an interface spelled out in the package `ChainRules.jl`. Leveraging multi-dimensional derivatives, the chain rule is the only rule needed of the sum, product, quotient and chain rules.
+:::
+More functions could be included, but for this example the above will suffice, as now the system is ready to be put to work.
 ```{julia}
@@ -223,5 +231,3 @@ D(D(ex₃, :x), :x)
 ```
 The length of the expression should lead to further appreciation for simplification steps taken when doing such a computation by hand.
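
For readers without the full file at hand, the dispatch-on-`Val` pattern these hunks describe can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the code from the changed file: it drops the arity argument (`Val{:unary}`, `Val{:nary}`) that the real methods take, handles only `+`, two-factor `*`, `sin`, and `cos`, and assumes every `Expr` it sees is a function call.

```julia
# Simplified sketch; the method layout here is an assumption, not the file's code
D(ex::Number, var) = 0
D(ex::Symbol, var) = ex == var ? 1 : 0
# Assumes ex.head == :call, so args[1] is the function name
D(ex::Expr, var) = D(Val(ex.args[1]), ex.args[2:end], var)

# Sum rule: differentiate each argument, splat the results into a new sum
D(::Val{:+}, args, var) = Expr(:call, :+, (D(a, var) for a in args)...)

# Product rule, two factors only
function D(::Val{:*}, args, var)
    a, b = args
    da, db = D(a, var), D(b, var)
    :($da * $b + $a * $db)
end

# sin and cos: the interpolated inner derivative is the chain-rule factor
function D(::Val{:sin}, args, var)
    a = first(args)
    da = D(a, var)
    :(cos($a) * $da)
end

function D(::Val{:cos}, args, var)
    a = first(args)
    da = D(a, var)
    :(-sin($a) * $da)
end

ex = :(sin(x * x) + cos(x))
dex = D(ex, :x)
println(dex)

# Wrap the derivative expression in a function of x to evaluate it at a point
dfun = eval(:(x -> $dex))
value = Base.invokelatest(dfun, 1.0)
println(value)
```

No simplification is attempted, so even this small input produces factors such as `1 * x + x * 1`, which is in line with the remark above about the length of unsimplified expressions.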