diff --git a/quarto/derivatives/derivatives.qmd b/quarto/derivatives/derivatives.qmd
index 8c10e94..ca67693 100644
--- a/quarto/derivatives/derivatives.qmd
+++ b/quarto/derivatives/derivatives.qmd
@@ -660,12 +660,20 @@ $$
 There are $n$ terms, each where one of the $f_i$s have a derivative. Were we to multiply top and bottom by $f_i$, we would get each term looks like: $f \cdot f_i'/f_i$.
-With this, we can proceed. Each term $x-i$ has derivative $1$, so the answer to $f'(x)$, with $f$ as above, is $f'(x) = f(x)/(x-1) + f(x)/(x-2) + f(x)/(x-3) + f(x)/(x-4) + f(x)/(x-5)$, that is:
+With this, we can proceed. Each term $x-i$ has derivative $1$, so the answer to $f'(x)$, with $f$ as above, is
+\begin{align*}
+f'(x) &= f(x)/(x-1) + f(x)/(x-2) + f(x)/(x-3)\\
+ &+ f(x)/(x-4) + f(x)/(x-5),
+\end{align*}
-$$
-f'(x) = (x-2)(x-3)(x-4)(x-5) + (x-1)(x-3)(x-4)(x-5) + (x-1)(x-2)(x-4)(x-5) + (x-1)(x-2)(x-3)(x-5) + (x-1)(x-2)(x-3)(x-4).
-$$
+That is
+
+\begin{align*}
+f'(x) &= (x-2)(x-3)(x-4)(x-5) + (x-1)(x-3)(x-4)(x-5)\\
+ &+ (x-1)(x-2)(x-4)(x-5) + (x-1)(x-2)(x-3)(x-5) \\
+ &+ (x-1)(x-2)(x-3)(x-4).
+\end{align*}
 ---
@@ -680,8 +688,7 @@ $$
 ### Chain rule
-Finally, the derivative of a composition of functions can be computed using pieces of each function. This gives a rule called the *chain rule*. Before deriving, let's give a slight motivation.
-
+Finally, the derivative of a composition of functions can be computed using pieces of each function. This gives a rule called the *chain rule*. Before deriving, let's give a slight motivation through an example.
 Consider the output of a factory for some widget. It depends on two steps: an initial manufacturing step and a finishing step. The number of employees is important in how much is initially manufactured. Suppose $x$ is the number of employees and $g(x)$ is the amount initially manufactured. Adding more employees increases the amount made by the made-up rule $g(x) = \sqrt{x}$.
 The finishing step depends on how much is made by the employees. If $y$ is the amount made, then $f(y)$ is the number of widgets finished. Suppose for some reason that $f(y) = y^2.$
@@ -806,16 +813,19 @@ This is a useful rule to remember for expressions involving exponentials.
 Find the derivative of $\sin(x)\cos(2x)$ at $x=\pi$.
-$$
-[\sin(x)\cos(2x)]'\big|_{x=\pi} =
-(\cos(x)\cos(2x) + \sin(x)(-\sin(2x)\cdot 2))\big|_{x=\pi} =
-((-1)(1) + (0)(-0)(2)) = -1.
-$$
+\begin{align*}
+[\sin(x)\cos(2x)]'\big|_{x=\pi} &=(\cos(x)\cos(2x) + \sin(x)(-\sin(2x)\cdot 2))\big|_{x=\pi} \\
+& =((-1)(1) + (0)(-0)(2)) = -1.
+\end{align*}
 ##### Proof of the Chain Rule
-A function is *differentiable* at $a$ if the following limit exists $\lim_{h \rightarrow 0}(f(a+h)-f(a))/h$. Reexpressing this as: $f(a+h) - f(a) - f'(a)h = \epsilon_f(h) h$ where as $h\rightarrow 0$, $\epsilon_f(h) \rightarrow 0$. Then, we have:
+A function is *differentiable* at $a$ if the following limit exists $\lim_{h \rightarrow 0}(f(a+h)-f(a))/h$.
+This is reexpressed as: $f(a+h) - f(a) - f'(a)h = \epsilon_f(h) h$ where as $h\rightarrow 0$, $\epsilon_f(h) \rightarrow 0$.
+
+
+With that in mind, we have:
 $$
@@ -834,11 +844,12 @@ f(g(a) + g'(a)h + \epsilon_g(h)h) - f(g(a)) \\
 Rearranging:
-$$
-f(g(a+h)) - f(g(a)) - f'(g(a)) g'(a) h = f'(g(a))\epsilon_g(h)h + \epsilon_f(h')(h') =
-(f'(g(a)) \epsilon_g(h) + \epsilon_f(h') (g'(a) + \epsilon_g(h)))h =
-\epsilon(h)h,
-$$
+\begin{align*}
+f(g(a+h)) &- f(g(a)) - f'(g(a)) g'(a) h\\
+&= f'(g(a))\epsilon_g(h)h + \epsilon_f(h')(h')\\
+&=(f'(g(a)) \epsilon_g(h) + \epsilon_f(h') (g'(a) + \epsilon_g(h)))h \\
+&=\epsilon(h)h,
+\end{align*}
 where $\epsilon(h)$ combines the above terms which go to zero as $h\rightarrow 0$ into one. This is the alternative definition of the derivative, showing $(f\circ g)'(a) = f'(g(a)) g'(a)$ when $g$ is differentiable at $a$ and $f$ is differentiable at $g(a)$.
diff --git a/quarto/derivatives/numeric_derivatives.qmd b/quarto/derivatives/numeric_derivatives.qmd
index a496516..6e8b76d 100644
--- a/quarto/derivatives/numeric_derivatives.qmd
+++ b/quarto/derivatives/numeric_derivatives.qmd
@@ -45,7 +45,7 @@ As such there is a balancing act:
 * if $h$ is too small the round-off errors are problematic,
- * if $h$ is too big, the approximation to the limit is not good.
+ * if $h$ is too big the approximation to the limit is not good.
 For the forward difference $h$ values around $10^{-8}$ are typically good, for the central difference, values around $10^{-6}$ are typically good.
@@ -70,7 +70,7 @@ We can compare to the actual with:
 ```{julia}
 @syms x
 df = diff(f(x), x)
-factual = N(df(c))
+factual = convert(Float64, df(c))
 abs(factual - fapprox)
 ```
@@ -136,16 +136,16 @@ The forward derivative is found with:
 ```{julia}
-𝒇(x) = sqrt(1 + sin(cos(x)))
-𝒄, 𝒉 = pi/4, 1e-8
-fwd = (𝒇(𝒄+𝒉) - 𝒇(𝒄))/𝒉
+f(x) = sqrt(1 + sin(cos(x)))
+c, h = pi/4, 1e-8
+fwd = (f(c+h) - f(c))/h
 ```
 That given by `D` is:
 ```{julia}
-ds_value = D(𝒇)(𝒄)
+ds_value = D(f)(c)
 ds_value, fwd, ds_value - fwd
 ```
@@ -153,11 +153,11 @@ Finally, `SymPy` gives an exact value we use to compare:
 ```{julia}
-𝒇𝒑 = diff(𝒇(x), x)
+fp = diff(f(x), x)
 ```
 ```{julia}
-actual = N(𝒇𝒑(PI/4))
+actual = convert(Float64, fp(PI/4))
 actual - ds_value, actual - fwd
 ```
diff --git a/quarto/derivatives/symbolic_derivatives.qmd b/quarto/derivatives/symbolic_derivatives.qmd
index c3b1149..44bf119 100644
--- a/quarto/derivatives/symbolic_derivatives.qmd
+++ b/quarto/derivatives/symbolic_derivatives.qmd
@@ -104,7 +104,7 @@ function D(::Val{:+}, ::Val{:nary}, args, var)
 end
 ```
-The `args` are always held in a container, so the unary method must pull out the first one. The binary case should read as: apply `D` to each of the two arguments, and then create a quoted expression containing the sum of the results. The dollar signs interpolate into the quoting. (The "primes" are unicode notation achieved through `\prime[tab]` and not operations.) The *nary* case does something similar, only uses splatting to produce the sum.
+The `args` are always held in a container, so the unary method must pull out the first one. The binary case should read as: apply `D` to each of the two arguments, and then create a quoted expression containing the sum of the results. The dollar signs interpolate into the quoting. (The "primes" are unicode notation achieved through `\prime[tab]` and not operations.) The *nary* case does something similar, only using splatting to produce the sum.
 Subtraction must also be implemented in a similar manner, but not for the *nary* case:
@@ -195,7 +195,15 @@ function D(::Val{:cos}, ::Val{:unary}, args, var)
 end
 ```
-The pattern is similar for each. The `$a′` factor is needed due to the *chain rule*. The above illustrates the simple pattern necessary to add a derivative rule for a function. More could be, but for this example the above will suffice, as now the system is ready to be put to work.
+The pattern is similar for each. The `$a′` factor is needed due to the *chain rule*. The above illustrates the simple pattern necessary to add a derivative rule for a function.
+
+:::{.callout-note}
+Several automatic differentiation packages use a set of rules defined following an interface spelled out in the package `ChainRules.jl`. Leveraging multi-dimensional derivatives, the chain rule is the only rule needed of the sum, product, quotient and chain rules.
+
+:::
+
+
+More functions could be included, but for this example the above will suffice, as now the system is ready to be put to work.
 ```{julia}
@@ -223,5 +231,3 @@ D(D(ex₃, :x), :x)
 ```
 The length of the expression should lead to further appreciation for simplification steps taken when doing such a computation by hand.
-
-