align fix; theorem style; condition number
@@ -321,10 +321,15 @@ As well, see this part of a [Wikipedia](http://en.wikipedia.org/wiki/Polar_coord
Imagine we have $a < b$ and a partition $a=t_0 < t_1 < \cdots < t_n = b$. Let $\phi_i = (1/2)(t_{i-1} + t_{i})$ be the midpoint. Then the wedge of radius $r(\phi_i)$ with angle between $t_{i-1}$ and $t_i$ will have area $\pi r(\phi_i)^2 (t_i-t_{i-1}) / (2\pi) = (1/2) r(\phi_i)^2(t_i-t_{i-1})$, the ratio $(t_i-t_{i-1}) / (2\pi)$ being the fraction of the full angle of a circle. Summing the area of these wedges over the partition gives a Riemann sum approximation for the integral $(1/2)\int_a^b r(\theta)^2 d\theta$. The limit of this sum defines the area in polar coordinates.

::: {.callout-note icon=false}
## Area of polar regions

Let $R$ denote the region bounded by the curve $r(\theta)$ and the rays $\theta=a$ and $\theta=b$ with $b-a \leq 2\pi$. Then the area of $R$ is given by:

$$
A = \frac{1}{2}\int_a^b r(\theta)^2 d\theta.
$$
:::
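As a numeric sanity check of the area formula, the midpoint Riemann sum above can be computed directly. This is an illustrative Python sketch (not the book's Julia); the radius-2 circle and cardioid test cases are our own choices:

```python
from math import pi, cos

def polar_area(r, a, b, n=10_000):
    # midpoint Riemann sum for (1/2) * integral of r(theta)^2 over [a, b]
    dt = (b - a) / n
    return 0.5 * sum(r(a + (i + 0.5) * dt) ** 2 for i in range(n)) * dt

area = polar_area(lambda t: 2.0, 0, 2 * pi)         # circle of radius 2: area 4*pi
card = polar_area(lambda t: 1 + cos(t), 0, 2 * pi)  # cardioid r = 1 + cos(theta): area 3*pi/2
```

The midpoint rule converges very quickly for smooth periodic integrands, so even modest `n` gives high accuracy.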
@@ -412,18 +417,19 @@ The answer is the difference:
The length of the arc traced by a polar graph can also be expressed using an integral. Again, we partition the interval $[a,b]$ and consider the wedge from $(r(t_{i-1}), t_{i-1})$ to $(r(t_i), t_i)$. The arc this wedge approximates will have its length approximated by the line segment connecting the two points. Expressing the points in Cartesian coordinates and simplifying gives the distance squared as:

$$
\begin{align*}
d_i^2 &= (r(t_i) \cos(t_i) - r(t_{i-1})\cos(t_{i-1}))^2 + (r(t_i) \sin(t_i) - r(t_{i-1})\sin(t_{i-1}))^2\\
&= r(t_i)^2 - 2r(t_i)r(t_{i-1}) \cos(t_i - t_{i-1}) + r(t_{i-1})^2 \\
&\approx r(t_i)^2 - 2r(t_i)r(t_{i-1}) (1 - \frac{(t_i - t_{i-1})^2}{2})+ r(t_{i-1})^2 \quad(\text{as } \cos(x) \approx 1 - x^2/2)\\
&= (r(t_i) - r(t_{i-1}))^2 + r(t_i)r(t_{i-1}) (t_i - t_{i-1})^2.
\end{align*}
$$

As was done with arc length, we multiply $d_i$ by $(t_i - t_{i-1})/(t_i - t_{i-1})$ and move the bottom factor under the square root:

@@ -431,13 +437,19 @@ d_i

$$
\begin{align*}
d_i
&= d_i \frac{t_i - t_{i-1}}{t_i - t_{i-1}} \\
&= \sqrt{\frac{(r(t_i) - r(t_{i-1}))^2}{(t_i - t_{i-1})^2} +
\frac{r(t_i)r(t_{i-1}) (t_i - t_{i-1})^2}{(t_i - t_{i-1})^2}} \cdot (t_i - t_{i-1})\\
&= \sqrt{(r'(\xi_i))^2 + r(t_i)r(t_{i-1})} \cdot (t_i - t_{i-1}).\quad(\text{by the mean value theorem})
\end{align*}
$$
Summing the approximations to the $d_i$ looks like a Riemann sum approximation to the integral $\int_a^b \sqrt{r'(\theta)^2 + r(\theta)^2} d\theta$ (with the extension to the Riemann sum formula needed to derive the arc length for a parameterized curve). That is:

::: {.callout-note icon=false}
## Arc length of a polar curve

The arc length of the curve described in polar coordinates by $r(\theta)$ for $a \leq \theta \leq b$ is given by:

$$
\int_a^b \sqrt{r'(\theta)^2 + r(\theta)^2} d\theta.
$$
:::
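A quick numeric check of the arc length formula, again as an illustrative Python sketch rather than the book's Julia; the radius-2 circle is our own test case:

```python
from math import sqrt, pi

def polar_arclength(r, rp, a, b, n=10_000):
    # midpoint rule for the integral of sqrt(r'(theta)^2 + r(theta)^2) over [a, b]
    dt = (b - a) / n
    s = 0.0
    for i in range(n):
        t = a + (i + 0.5) * dt
        s += sqrt(rp(t) ** 2 + r(t) ** 2) * dt
    return s

# circle of radius 2: r(theta) = 2, r'(theta) = 0; the length is the circumference 4*pi
length = polar_arclength(lambda t: 2.0, lambda t: 0.0, 0, 2 * pi)
```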
@@ -36,12 +36,13 @@ nothing
Consider a function $f: R^n \rightarrow R$. It has multiple arguments for its input ($x_1, x_2, \dots, x_n$) and only one, *scalar*, value for an output. Some simple examples might be:

$$
\begin{align*}
f(x,y) &= x^2 + y^2\\
g(x,y) &= x \cdot y\\
h(x,y) &= \sin(x) \cdot \sin(y)
\end{align*}
$$

For an example from real life, consider the Elevation Point Query Service of the [USGS](https://nationalmap.gov/epqs/), which returns the elevation in international feet or meters for a specific latitude/longitude within the United States. The longitude can be associated to an $x$ coordinate, the latitude to a $y$ coordinate, and the elevation to a $z$ coordinate, and as long as the region is small enough, the $x$-$y$ coordinates can be thought to lie on a plane. (A flat earth assumption.)
@@ -631,23 +632,25 @@ Before answering this, we discuss *directional* derivatives along the simplified
If we compose $f \circ \vec\gamma_x$, we can visualize this as a curve on the surface of $f$ that moves in the $x$-$y$ plane along the line $y=c$. The derivative of this curve will satisfy:

$$
\begin{align*}
(f \circ \vec\gamma_x)'(x) &=
\lim_{t \rightarrow x} \frac{(f\circ\vec\gamma_x)(t) - (f\circ\vec\gamma_x)(x)}{t-x}\\
&= \lim_{t\rightarrow x} \frac{f(t, c) - f(x,c)}{t-x}\\
&= \lim_{h \rightarrow 0} \frac{f(x+h, c) - f(x, c)}{h}.
\end{align*}
$$

The last expression shows this to be the derivative of the function that holds the $y$ value fixed but lets the $x$ value vary. It is the rate of change in the $x$ direction. There is special notation for this:

$$
\begin{align*}
\frac{\partial f(x,y)}{\partial x} &=
\lim_{h \rightarrow 0} \frac{f(x+h, y) - f(x, y)}{h},\quad\text{and analogously}\\
\frac{\partial f(x,y)}{\partial y} &=
\lim_{h \rightarrow 0} \frac{f(x, y+h) - f(x, y)}{h}.
\end{align*}
$$

These are called the *partial* derivatives of $f$. The symbol $\partial$, read as "partial", is reminiscent of "$d$", but indicates the derivative is only in a given direction. Other notations exist for this:
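The limit definitions can be approximated numerically with central difference quotients. A minimal illustrative sketch in Python (not the book's Julia), using the simple test function $\sin(x)\cdot\sin(y)$:

```python
from math import sin, cos

def partial_x(f, x, y, h=1e-6):
    # central difference in the x direction, with y held fixed
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def partial_y(f, x, y, h=1e-6):
    # central difference in the y direction, with x held fixed
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

f = lambda x, y: sin(x) * sin(y)
fx = partial_x(f, 1.0, 2.0)   # should approximate cos(1)*sin(2)
fy = partial_y(f, 1.0, 2.0)   # should approximate sin(1)*cos(2)
```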
@@ -685,11 +688,12 @@ Let $f(x,y) = x^2 - 2xy$, then to compute the partials, we just treat the other
Then

$$
\begin{align*}
\frac{\partial (x^2 - 2xy)}{\partial x} &= 2x - 2y\\
\frac{\partial (x^2 - 2xy)}{\partial y} &= 0 - 2x = -2x.
\end{align*}
$$

Combining gives $\nabla{f} = \langle 2x -2y, -2x \rangle$.
@@ -697,12 +701,13 @@ Combining, gives $\nabla{f} = \langle 2x -2y, -2x \rangle$.
If $g(x,y,z) = \sin(x) + z\cos(y)$, then

$$
\begin{align*}
\frac{\partial g }{\partial x} &= \cos(x) + 0 = \cos(x),\\
\frac{\partial g }{\partial y} &= 0 + z(-\sin(y)) = -z\sin(y),\\
\frac{\partial g }{\partial z} &= 0 + \cos(y) = \cos(y).
\end{align*}
$$

Combining gives $\nabla{g} = \langle \cos(x), -z\sin(y), \cos(y) \rangle$.
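Both gradients can be double-checked numerically. A hedged Python sketch (the evaluation point is an arbitrary choice of ours) comparing central-difference estimates against the hand computations $\nabla f = \langle 2x-2y, -2x\rangle$ and $\nabla g = \langle \cos x, -z\sin y, \cos y\rangle$:

```python
from math import sin, cos

def grad(f, pt, h=1e-6):
    # numeric gradient via central differences, one coordinate at a time
    g = []
    for i in range(len(pt)):
        up = list(pt); up[i] += h
        dn = list(pt); dn[i] -= h
        g.append((f(*up) - f(*dn)) / (2 * h))
    return g

x, y, z = 1.0, 2.0, 3.0
gf = grad(lambda x, y: x**2 - 2*x*y, (x, y))
gg = grad(lambda x, y, z: sin(x) + z * cos(y), (x, y, z))
```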
@@ -938,12 +943,17 @@ where $\epsilon(h) \rightarrow 0$ as $h \rightarrow 0$.
It is this characterization of differentiability that is generalized to define when a scalar function is *differentiable*.

::: {.callout-note icon=false}
## Differentiable

Let $f$ be a scalar function. Then $f$ is [differentiable](https://tinyurl.com/qj8qcbb) at a point $C$ **if** the first order partial derivatives exist at $C$ **and** for $\vec{h}$ going to $\vec{0}$:

$$
\|f(C + \vec{h}) - f(C) - \nabla{f}(C) \cdot \vec{h}\| = \mathcal{o}(\|\vec{h}\|),
$$

where $\mathcal{o}(\|\vec{h}\|)$ means that dividing the left hand side by $\|\vec{h}\|$ and taking a limit as $\vec{h}\rightarrow 0$ the limit will be $0$.
:::
@@ -962,8 +972,12 @@ Later we will see how Taylor's theorem generalizes for scalar functions and inte
In finding a partial derivative, we restricted the surface along a curve in the $x$-$y$ plane, in this case the curve $\vec{\gamma}(t)=\langle t, c\rangle$. In general if we have a curve in the $x$-$y$ plane, $\vec{\gamma}(t)$, we can compose the scalar function $f$ with $\vec{\gamma}$ to create a univariate function. If the functions are "smooth" then this composed function should have a derivative, and some version of a "chain rule" should provide a means to compute the derivative in terms of the "derivative" of $f$ (the gradient) and the derivative of $\vec{\gamma}$ ($\vec{\gamma}'$).

::: {.callout-note icon=false}
## Chain rule

Suppose $f$ is *differentiable* at $C$, and $\vec{\gamma}(t)$ is differentiable at $c$ with $\vec{\gamma}(c) = C$. Then $f\circ\vec{\gamma}$ is differentiable at $c$ with derivative $\nabla f(\vec{\gamma}(c)) \cdot \vec{\gamma}'(c)$.
:::
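The chain rule can be verified numerically for a concrete pair. Here is an illustrative Python sketch using $f(x,y)=x^2-2xy$, whose gradient is $\langle 2x-2y, -2x\rangle$, composed with a hypothetical curve $\vec{\gamma}(t)=\langle \cos t, \sin t\rangle$ (both choices are ours):

```python
from math import sin, cos

f      = lambda x, y: x**2 - 2*x*y
gamma  = lambda t: (cos(t), sin(t))     # hypothetical curve in the x-y plane
gammap = lambda t: (-sin(t), cos(t))    # its derivative

def composed(t):
    x, y = gamma(t)
    return f(x, y)

t0 = 0.7
x0, y0 = gamma(t0)
# numeric (f . gamma)'(t0) versus grad f(gamma(t0)) . gamma'(t0)
lhs = (composed(t0 + 1e-6) - composed(t0 - 1e-6)) / 2e-6
rhs = (2*x0 - 2*y0) * gammap(t0)[0] + (-2*x0) * gammap(t0)[1]
```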
This is similar to the chain rule for univariate functions, $(f\circ g)'(u) = f'(g(u)) g'(u)$ or $df/dx = df/du \cdot du/dx$. However, when written out in components there are more terms. For example, for $n=2$ we have, with $\vec{\gamma} = \langle x(t), y(t) \rangle$:
@@ -1217,7 +1231,10 @@ Let $f(x,y) = \sin(x+2y)$ and $\vec{v} = \langle 2, 1\rangle$. The directional d
$$
\nabla{f}\cdot \frac{\vec{v}}{\|\vec{v}\|} =
\langle \cos(x + 2y), 2\cos(x + 2y)\rangle \cdot
\frac{\langle 2, 1 \rangle}{\sqrt{5}} =
\frac{4}{\sqrt{5}} \cos(x + 2y).
$$
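The closed form $\frac{4}{\sqrt{5}}\cos(x+2y)$ can be checked against a direct difference quotient along the unit vector $\vec{v}/\|\vec{v}\|$. An illustrative Python sketch; the evaluation point is an arbitrary choice of ours:

```python
from math import sin, cos, sqrt

f = lambda x, y: sin(x + 2*y)
v = (2.0, 1.0)
n = sqrt(v[0]**2 + v[1]**2)
u = (v[0] / n, v[1] / n)               # unit vector in the direction of v

x0, y0, h = 0.5, 0.25, 1e-6
# central difference of f along the direction u
numeric = (f(x0 + h*u[0], y0 + h*u[1]) - f(x0 - h*u[0], y0 - h*u[1])) / (2*h)
formula = 4 / sqrt(5) * cos(x0 + 2*y0)
```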
##### Example
@@ -1408,17 +1425,18 @@ Let $f(x,y) = x^2 + y^2$ be a scalar function. We have if $G(r, \theta) = \langl
Were this computed through the chain rule, we have:

$$
\begin{align*}
\nabla G_1 &= \langle \frac{\partial r\cos(\theta)}{\partial r}, \frac{\partial r\cos(\theta)}{\partial \theta} \rangle=
\langle \cos(\theta), -r \sin(\theta) \rangle,\\
\nabla G_2 &= \langle \frac{\partial r\sin(\theta)}{\partial r}, \frac{\partial r\sin(\theta)}{\partial \theta} \rangle=
\langle \sin(\theta), r \cos(\theta) \rangle.
\end{align*}
$$
We have $\partial f/\partial x = 2x$ and $\partial f/\partial y = 2y$, which at $G$ are $2r\cos(\theta)$ and $2r\sin(\theta)$, so by the chain rule, we should have

@@ -1430,6 +1448,7 @@ We have $\partial f/\partial x = 2x$ and $\partial f/\partial y = 2y$, which at

$$
\begin{align*}
\frac{\partial (f\circ G)}{\partial r} &=
\frac{\partial{f}}{\partial{x}}\frac{\partial G_1}{\partial r} +
\frac{\partial{f}}{\partial{y}}\frac{\partial G_2}{\partial r} =
2r\cos(\theta)\cos(\theta) + 2r\sin(\theta)\sin(\theta) = 2r,\\
\frac{\partial (f\circ G)}{\partial \theta} &=
\frac{\partial f}{\partial x}\frac{\partial G_1}{\partial \theta} +
\frac{\partial f}{\partial y}\frac{\partial G_2}{\partial \theta} =
2r\cos(\theta)(-r\sin(\theta)) + 2r\sin(\theta)(r\cos(\theta)) = 0.
\end{align*}
$$
## Higher order partial derivatives
@@ -1467,9 +1486,11 @@ In `SymPy` the variable to differentiate by is taken from left to right, so `dif

We see that `diff(ex, x, y)` and `diff(ex, y, x)` are identical. This is not a coincidence, as by [Schwarz's Theorem](https://tinyurl.com/y7sfw9sx) (also known as Clairaut's theorem) this will always be the case under typical assumptions:

::: {.callout-note icon=false}
## Theorem on mixed partials

If the mixed partials $\partial^2 f/\partial x \partial y$ and $\partial^2 f/\partial y \partial x$ exist and are continuous, then they are equal.
:::
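The equality of mixed partials can be illustrated numerically by nesting central differences in either order. A Python sketch (the function $\sin(x)\sin(y)$, whose mixed partial is $\cos(x)\cos(y)$, and the evaluation point are our own choices):

```python
from math import sin, cos

f = lambda x, y: sin(x) * sin(y)

def d_dx(g, x, y, h=1e-4):
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def d_dy(g, x, y, h=1e-4):
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

# differentiate in x first, then y -- and in the opposite order
fxy = d_dy(lambda x, y: d_dx(f, x, y), 0.5, 1.2)
fyx = d_dx(lambda x, y: d_dy(f, x, y), 0.5, 1.2)
```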
For higher order mixed partials, something similar to Schwarz's theorem still holds. Say $f:R^n \rightarrow R$ is $C^k$ if $f$ is continuous and all partial derivatives of order $j \leq k$ are continuous. If $f$ is $C^k$, and $k=k_1+k_2+\cdots+k_n$ ($k_i \geq 0$) then
@@ -341,11 +341,12 @@ The level curve $f(x,y)=0$ and the level curve $g(x,y)=0$ may intersect. Solving
To elaborate, consider two linear equations written in a general form:

$$
\begin{align*}
ax + by &= u\\
cx + dy &= v
\end{align*}
$$
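For a $2\times 2$ system like this, Cramer's rule gives an explicit solution in terms of the coefficients. A small dependency-free Python sketch (the book itself works in Julia; the sample coefficients are ours):

```python
def solve2x2(a, b, c, d, u, v):
    # solve ax + by = u, cx + dy = v by Cramer's rule; the determinant must be nonzero
    det = a*d - b*c
    if det == 0:
        raise ValueError("the system is singular")
    return ((u*d - b*v) / det, (a*v - u*c) / det)

x, y = solve2x2(1, 2, 3, 4, 5, 6)   # x + 2y = 5, 3x + 4y = 6
```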
A method to solve this by hand would be to solve for $y$ from one equation, substitute this expression into the second equation, and then solve for $x$. From there, $y$ can be found. A more advanced method expresses the problem in a matrix formulation of the form $Mx=b$ and solves that equation. This form of solving is implemented in `Julia` through the "backslash" operator. Here is the general solution:
@@ -422,21 +423,23 @@ We look to find the intersection point near $(1,1)$ using Newton's method
We have by linearization:

$$
\begin{align*}
f(x,y) &\approx f(x_n, y_n) + \frac{\partial f}{\partial x}\Delta x + \frac{\partial f}{\partial y}\Delta y \\
g(x,y) &\approx g(x_n, y_n) + \frac{\partial g}{\partial x}\Delta x + \frac{\partial g}{\partial y}\Delta y,
\end{align*}
$$

where $\Delta x = x- x_n$ and $\Delta y = y-y_n$. Setting $f(x,y)=0$ and $g(x,y)=0$ leaves these two linear equations in $\Delta x$ and $\Delta y$:

$$
\begin{align*}
\frac{\partial f}{\partial x} \Delta x + \frac{\partial f}{\partial y} \Delta y &= -f(x_n, y_n)\\
\frac{\partial g}{\partial x} \Delta x + \frac{\partial g}{\partial y} \Delta y &= -g(x_n, y_n).
\end{align*}
$$

One step of Newton's method defines $(x_{n+1}, y_{n+1})$ to be the values $(x,y)$ that make the linearized functions about $(x_n, y_n)$ both equal to $\vec{0}$.
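The iteration can be sketched in a few lines of Python: each step solves the two linear equations above for $(\Delta x, \Delta y)$ and updates the point. The pair of level curves below is a hypothetical stand-in (the document's actual $f$ and $g$ are not shown in this hunk), chosen so the curves intersect at $(1,1)$:

```python
def newton2(f, g, x, y, steps=20, h=1e-7):
    # Newton's method for f(x,y) = 0, g(x,y) = 0 with a finite-difference Jacobian
    for _ in range(steps):
        fx = (f(x+h, y) - f(x-h, y)) / (2*h); fy = (f(x, y+h) - f(x, y-h)) / (2*h)
        gx = (g(x+h, y) - g(x-h, y)) / (2*h); gy = (g(x, y+h) - g(x, y-h)) / (2*h)
        det = fx*gy - fy*gx
        # solve J * (dx, dy) = -(f, g) by Cramer's rule
        dx = (-f(x, y)*gy + fy*g(x, y)) / det
        dy = (-fx*g(x, y) + f(x, y)*gx) / det
        x, y = x + dx, y + dy
    return x, y

# hypothetical level curves intersecting at (1, 1)
f = lambda x, y: x**2 + y**2 - 2
g = lambda x, y: y - x**2
xn, yn = newton2(f, g, 1.2, 0.8)
```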
@@ -679,14 +682,18 @@ An *absolute* maximum over $U$, should it exist, would be $f(\vec{a})$ if there
The difference is the same as the one-dimensional case: local is a statement about nearby points only, absolute a statement about all the points in the specified set.

::: {.callout-note icon=false}
## The [Extreme Value Theorem](https://tinyurl.com/yyhgxu8y)

Let $f:R^n \rightarrow R$ be continuous and defined on a *closed and bounded* set $V$. Then $f$ has a minimum value $m$ and maximum value $M$ over $V$ and there exist points $\vec{a}$ and $\vec{b}$ with $m = f(\vec{a})$ and $M = f(\vec{b})$.
:::

::: {.callout-note icon=false}
## [Fermat](https://tinyurl.com/nfgz8fz)'s theorem on critical points

Let $f:R^n \rightarrow R$ be a continuous function defined on an *open* set $U$. If $x \in U$ is a point where $f$ has a local extremum *and* $f$ is differentiable, then the gradient of $f$ at $x$ is $\vec{0}$.
:::

Call a point in the domain of $f$ where the function is differentiable and the gradient is zero a *stationary point* and a point in the domain where the function is either not differentiable or is a stationary point a *critical point*. By Fermat's theorem, local extrema can only happen at critical points.
@@ -735,16 +742,16 @@ To identify these through formulas, and not graphically, we could try and use th
The generalization of the *second* derivative test is more concrete though. Recall, the second derivative test is about the concavity of the function at the critical point. When the concavity can be determined as non-zero, the test is conclusive; when the concavity is zero, the test is not conclusive. Similarly here:

::: {.callout-note icon=false}
## The [second](https://en.wikipedia.org/wiki/Second_partial_derivative_test) Partial Derivative Test for $f:R^2 \rightarrow R$.

Assume the first and second partial derivatives of $f$ are defined and continuous; let $\vec{a}$ be a critical point of $f$; let $H$ be the Hessian matrix, $[f_{xx}\quad f_{xy};f_{xy}\quad f_{yy}]$, and $d = \det(H) = f_{xx} f_{yy} - f_{xy}^2$ the determinant of the Hessian matrix. Then:

* The function $f$ has a local minimum at $\vec{a}$ if $f_{xx} > 0$ *and* $d>0$,
* The function $f$ has a local maximum at $\vec{a}$ if $f_{xx} < 0$ *and* $d>0$,
* The function $f$ has a saddle point at $\vec{a}$ if $d < 0$,
* Nothing can be said if $d=0$.
:::
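The test is mechanical enough to sketch in code: estimate the three second partials, form $d$, and branch on the signs. An illustrative Python version with finite differences; the three test functions (with an obvious minimum, saddle, and maximum at the origin) are our own:

```python
def classify(f, x, y, h=1e-4):
    # second partial derivative test with finite-difference second partials
    fxx = (f(x+h, y) - 2*f(x, y) + f(x-h, y)) / h**2
    fyy = (f(x, y+h) - 2*f(x, y) + f(x, y-h)) / h**2
    fxy = (f(x+h, y+h) - f(x+h, y-h) - f(x-h, y+h) + f(x-h, y-h)) / (4*h*h)
    d = fxx*fyy - fxy**2
    if d > 0:
        return "minimum" if fxx > 0 else "maximum"
    if d < 0:
        return "saddle"
    return "inconclusive"

r1 = classify(lambda x, y: x**2 + y**2, 0, 0)
r2 = classify(lambda x, y: x**2 - y**2, 0, 0)
r3 = classify(lambda x, y: -(x**2) - y**2, 0, 0)
```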
---
@@ -1069,11 +1076,12 @@ $$
Another might be the vertical squared distance to the line:

$$
\begin{align*}
d2(\alpha, \beta) &= (y_1 - l(x_1))^2 + (y_2 - l(x_2))^2 + (y_3 - l(x_3))^2 \\
&= (y_1 - (\alpha + \beta x_1))^2 + (y_2 - (\alpha + \beta x_2))^2 + (y_3 - (\alpha + \beta x_3))^2
\end{align*}
$$
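Minimizing $d2$ over $\alpha$ and $\beta$ leads, by setting the two partial derivatives to zero, to the classic normal equations. A hedged Python sketch with hypothetical data points of our own choosing:

```python
# hypothetical data; the normal equations come from setting grad d2 = 0:
#   n*alpha + beta*sum(x)   = sum(y)
#   alpha*sum(x) + beta*sum(x^2) = sum(x*y)
xs, ys = [1.0, 2.0, 3.0], [1.0, 3.0, 4.0]
n = len(xs)
sx, sy = sum(xs), sum(ys)
sxx = sum(x * x for x in xs)
sxy = sum(x * y for x, y in zip(xs, ys))

det = n * sxx - sx * sx
alpha = (sy * sxx - sx * sxy) / det   # intercept
beta = (n * sxy - sx * sy) / det      # slope
```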
Another might be the *shortest* distance to the line:
@@ -1407,18 +1415,22 @@ contour!(xs, ys, f, levels = [.7, .85, 1, 1.15, 1.3])
We can still identify the tangent and normal directions. What is different about this point is that local movement on the constraint curve is also local movement on the contour line of $f$, so $f$ doesn't increase or decrease here, as it would if this point were an extremum along the constraint. The key to seeing this is that the contour lines of $f$ are *tangent* to the constraint. The respective gradients are *orthogonal* to their tangent lines, and in dimension $2$, this implies they are parallel to each other.

::: {.callout-note icon=false}
## The method of Lagrange multipliers

To optimize $f(x,y)$ subject to a constraint $g(x,y) = k$ we solve for all *simultaneous* solutions to

$$
\begin{align*}
\nabla{f}(x,y) &= \lambda \nabla{g}(x,y), \text{and}\\
g(x,y) &= k.
\end{align*}
$$

These *possible* points are evaluated to see if they are maxima or minima.
:::

The method will not work if $\nabla{g} = \vec{0}$ or if $f$ and $g$ are not differentiable.
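A worked illustration of the two conditions, using a hypothetical problem of our own: optimize $f(x,y)=xy$ subject to $g(x,y)=x+y=4$. Solving $\nabla f = \lambda\nabla g$ gives $\langle y, x\rangle = \lambda\langle 1,1\rangle$, so $x=y=\lambda$, and the constraint then gives the candidate $(2,2)$ with $\lambda = 2$:

```python
f = lambda x, y: x * y
x, y, lam = 2.0, 2.0, 2.0

res1 = y - lam * 1.0        # first component of grad f - lambda * grad g
res2 = x - lam * 1.0        # second component
res3 = (x + y) - 4.0        # constraint g(x, y) - k

# sampling nearby points on the constraint x + y = 4 suggests (2, 2) is a maximum
nearby = [f(2 + t, 2 - t) for t in (-0.1, -0.01, 0.01, 0.1)]
```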
@@ -1472,12 +1484,13 @@ $$
Then we have

$$
\begin{align*}
\frac{\partial L}{\partial{x}} &= \frac{\partial{f}}{\partial{x}} - \lambda \frac{\partial{g}}{\partial{x}}\\
\frac{\partial L}{\partial{y}} &= \frac{\partial{f}}{\partial{y}} - \lambda \frac{\partial{g}}{\partial{y}}\\
\frac{\partial L}{\partial{\lambda}} &= 0 + (g(x,y) - k).
\end{align*}
$$

But if the Lagrange condition holds, each term is $0$, so Lagrange's method can be seen as solving for points where $\nabla{L} = \vec{0}$. The optimization problem in two variables with a constraint becomes a problem of finding and classifying zeros of a function with *three* variables.
@@ -1556,13 +1569,14 @@ The starting point is a *perturbation*: $\hat{y}(x) = y(x) + \epsilon_1 \eta_1(x
With this notation, and fixing $y$ we can re-express the equations in terms of $\epsilon_1$ and $\epsilon_2$:

$$
\begin{align*}
F(\epsilon_1, \epsilon_2) &= \int f(x, \hat{y}, \hat{y}') dx =
\int f(x, y + \epsilon_1 \eta_1 + \epsilon_2 \eta_2, y' + \epsilon_1 \eta_1' + \epsilon_2 \eta_2') dx,\\
G(\epsilon_1, \epsilon_2) &= \int g(x, \hat{y}, \hat{y}') dx =
\int g(x, y + \epsilon_1 \eta_1 + \epsilon_2 \eta_2, y' + \epsilon_1 \eta_1' + \epsilon_2 \eta_2') dx.
\end{align*}
$$

Then our problem is restated as:
@@ -1590,7 +1604,7 @@ $$
Computing just the first one, we have using the chain rule and assuming interchanging the derivative and integral is possible:

@@ -1598,6 +1612,7 @@ f(x, y + \epsilon_1 \eta_1 + \epsilon_2 \eta_2, y' + \epsilon_1 \eta_1' + \epsil

$$
\begin{align*}
\frac{\partial{F}}{\partial{\epsilon_1}}
&= \int \frac{\partial}{\partial{\epsilon_1}}(
f(x, y + \epsilon_1 \eta_1 + \epsilon_2 \eta_2, y' + \epsilon_1 \eta_1' + \epsilon_2 \eta_2')) dx\\
&= \int \left(\frac{\partial{f}}{\partial{y}} \eta_1 + \frac{\partial{f}}{\partial{y'}} \eta_1'\right) dx\quad\quad(\text{from }\nabla{f} \cdot \langle 0, \eta_1, \eta_1'\rangle)\\
&=\int \eta_1 \left(\frac{\partial{f}}{\partial{y}} - \frac{d}{dx}\frac{\partial{f}}{\partial{y'}}\right) dx.
\end{align*}
$$
The last line follows from integration by parts:
@@ -1664,11 +1679,12 @@ ex2 = Eq(ex1.lhs()^2 - 1, simplify(ex1.rhs()^2) - 1)
Now $y'$ can be integrated using the substitution $y - C = \lambda \cos\theta$ to give: $-\lambda\int\cos\theta d\theta = x + D$, $D$ some constant. That is:

$$
\begin{align*}
x + D &= - \lambda \sin\theta\\
y - C &= \lambda\cos\theta.
\end{align*}
$$

Squaring gives the equation of a circle: $(x +D)^2 + (y-C)^2 = \lambda^2$.
@@ -1680,11 +1696,12 @@ We center and *rescale* the problem so that $x_0 = -1, x_1 = 1$. Then $L > 2$ as
We have $y=0$ at $x=1$ and $-1$ giving:

$$
\begin{align*}
(-1 + D)^2 + (0 - C)^2 &= \lambda^2\\
(+1 + D)^2 + (0 - C)^2 &= \lambda^2.
\end{align*}
$$

Squaring out and solving gives $D=0$, $1 + C^2 = \lambda^2$. That is, an arc of a circle with radius $\sqrt{1+C^2}$ centered at $(0, C)$.
@@ -1776,7 +1793,7 @@ where $R_k(x) = f^{k+1}(\xi)/(k+1)!(x-a)^{k+1}$ for some $\xi$ between $a$ and $
This theorem can be generalized to scalar functions, but the notation can be cumbersome. Following [Folland](https://sites.math.washington.edu/~folland/Math425/taylor2.pdf) we use *multi-index* notation. Suppose $f:R^n \rightarrow R$, and let $\alpha=(\alpha_1, \alpha_2, \dots, \alpha_n)$. Then define the following notation:

@@ -1784,6 +1801,7 @@ This theorem can be generalized to scalar functions, but the notation can be cum

$$
\begin{align*}
|\alpha| &= \alpha_1 + \cdots + \alpha_n, \\
\alpha! &= \alpha_1!\alpha_2!\cdots\alpha_n!, \\
\vec{x}^\alpha &= x_1^{\alpha_1}x_2^{\alpha_2}\cdots x_n^{\alpha_n}, \\
\partial^\alpha f &= \partial_1^{\alpha_1}\partial_2^{\alpha_2}\cdots \partial_n^{\alpha_n} f \\
& = \frac{\partial^{|\alpha|}f}{\partial x_1^{\alpha_1} \partial x_2^{\alpha_2} \cdots \partial x_n^{\alpha_n}}.
\end{align*}
$$
This notation makes many formulas from one dimension carry over to higher dimensions. For example, the multinomial theorem, which generalizes the binomial theorem, says:

@@ -1800,8 +1818,8 @@ $$

$$
(x_1 + x_2 + \cdots + x_n)^k = \sum_{|\alpha|=k} \frac{k!}{\alpha!} \vec{x}^\alpha.
$$
Taylor's theorem then becomes:

::: {.callout-note icon=false}
## Taylor's theorem using multi-index

If $f: R^n \rightarrow R$ is sufficiently smooth ($C^{k+1}$) on an open convex set $S$ about $\vec{a}$ then if $\vec{a}$ and $\vec{a}+\vec{h}$ are in $S$,

@@ -1812,18 +1830,20 @@ $$

where $R_{\vec{a},k} = \sum_{|\alpha|=k+1} \frac{\partial^\alpha f(\vec{a} + c\vec{h})}{\alpha!} \vec{h}^\alpha$ for some $c$ in $(0,1)$.
:::
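A numeric illustration of the theorem's content: the degree-$2$ expansion of $f(x,y)=\sin(x+2y)$ (a test function and point of our own choosing) should match $f$ near the expansion point with error on the order of $\|\vec{h}\|^3$:

```python
from math import sin, cos

f = lambda x, y: sin(x + 2*y)
x0, y0 = 0.3, 0.4
s, c = sin(x0 + 2*y0), cos(x0 + 2*y0)

def taylor2(dx, dy):
    # f + f_x dx + f_y dy + f_xx dx^2/2 + 2 f_xy dx dy/2 + f_yy dy^2/2,
    # using f_x = c, f_y = 2c, f_xx = -s, f_xy = -2s, f_yy = -4s
    return (s + c*dx + 2*c*dy
            - s*dx**2/2 - 2*s*dx*dy - 4*s*dy**2/2)

dx, dy = 0.01, -0.02
err = abs(f(x0 + dx, y0 + dy) - taylor2(dx, dy))   # should be O(h^3), i.e. very small
```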
##### Example
The elegant notation masks what can be complicated expressions. Consider the simple case $f:R^2 \rightarrow R$ and $k=2$. Then this says:

$$
\begin{align*}
f(x + dx, y+dy) &= f(x, y) + \frac{\partial f}{\partial x} dx + \frac{\partial f}{\partial y} dy \\
&+ \frac{\partial^2 f}{\partial x^2} \frac{dx^2}{2} + 2\frac{\partial^2 f}{\partial x\partial y} \frac{dx dy}{2}\\
&+ \frac{\partial^2 f}{\partial y^2} \frac{dy^2}{2} + R_{\langle x, y \rangle, k}(\langle dx, dy \rangle).
\end{align*}
$$

Using $\nabla$ and $H$ for the Hessian and $\vec{x} = \langle x, y \rangle$ and $d\vec{x} = \langle dx, dy \rangle$, this can be expressed as:
@@ -191,11 +191,12 @@ surface(unzip(Phi.(thetas, phis'))...)
The partial derivatives of each component, $\partial{\Phi}/\partial{\theta}$ and $\partial{\Phi}/\partial{\phi}$, can be computed directly:

$$
\begin{align*}
\partial{\Phi}/\partial{\theta} &= \langle -\sin(\phi)\sin(\theta), \sin(\phi)\cos(\theta),0 \rangle,\\
\partial{\Phi}/\partial{\phi} &= \langle \cos(\phi)\cos(\theta), \cos(\phi)\sin(\theta), -\sin(\phi) \rangle.
\end{align*}
$$

Using `SymPy`, we can compute these through:
@@ -359,7 +360,7 @@ where $\epsilon(h) \rightarrow \vec{0}$ as $h \rightarrow \vec{0}$.
We have, using this for *both* $F$ and $G$:

@@ -367,18 +368,20 @@ F(G(a) + (dG_a \cdot \vec{h} + \epsilon_G \vec{h})) - F(G(a))\\

$$
\begin{align*}
F(G(a + \vec{h})) - F(G(a)) &=
F(G(a) + (dG_a \cdot \vec{h} + \epsilon_G \vec{h})) - F(G(a))\\
&= F(G(a)) + dF_{G(a)} \cdot (dG_a \cdot \vec{h} + \epsilon_G \vec{h})\\
&+ \quad\epsilon_F (dG_a \cdot \vec{h} + \epsilon_G \vec{h}) - F(G(a))\\
&= dF_{G(a)} \cdot (dG_a \cdot \vec{h}) + dF_{G(a)} \cdot (\epsilon_G \vec{h}) + \epsilon_F (dG_a \cdot \vec{h}) + (\epsilon_F \cdot \epsilon_G\vec{h})
\end{align*}
$$
The last line uses the linearity of $dF$ to isolate $dF_{G(a)} \cdot (dG_a \cdot \vec{h})$. Factoring out $\vec{h}$ and taking norms gives:

$$
\begin{align*}
\frac{\| F(G(a+\vec{h})) - F(G(a)) - dF_{G(a)}dG_a \cdot \vec{h} \|}{\| \vec{h} \|} &=
\frac{\| dF_{G(a)}\cdot(\epsilon_G\vec{h}) + \epsilon_F (dG_a\cdot \vec{h}) + (\epsilon_F\cdot\epsilon_G\vec{h}) \|}{\| \vec{h} \|} \\
&\leq \| dF_{G(a)}\cdot\epsilon_G + \epsilon_F (dG_a) + \epsilon_F\cdot\epsilon_G \|\frac{\|\vec{h}\|}{\| \vec{h} \|}\\
&\rightarrow 0.
\end{align*}
$$
### Examples
@@ -660,7 +663,7 @@ det(A1), 1/det(A2)
The technique of *implicit differentiation* is a useful one, as it allows derivatives of more complicated expressions to be found. The main idea, expressed here with three variables, is that if an equation may be viewed as $F(x,y,z) = c$, $c$ a constant, then $z=\phi(x,y)$ may be viewed as a function of $x$ and $y$. Hence, we can use the chain rule to find $\partial z / \partial x$ and $\partial z /\partial y$. Let $G(x,y) = \langle x, y, \phi(x,y) \rangle$ and then differentiate $(F \circ G)(x,y) = c$:

@@ -670,6 +673,7 @@ The technique of *implicit differentiation* is a useful one, as it allows deriva

$$
\begin{align*}
0 &= dF_{G(x,y)} \circ dG_{\langle x, y\rangle}\\
&= [\frac{\partial F}{\partial x}\quad \frac{\partial F}{\partial y}\quad \frac{\partial F}{\partial z}](G(x,y)) \cdot
\begin{bmatrix}
1 & 0\\
0 & 1\\
\frac{\partial \phi}{\partial x} & \frac{\partial \phi}{\partial y}
\end{bmatrix}.
\end{align*}
$$
Solving yields
@@ -685,14 +689,17 @@ Where the right hand side of each is evaluated at $G(x,y)$.
When can it be reasonably assumed that such a function $z= \phi(x,y)$ exists?

::: {.callout-note icon=false}
## The [Implicit Function Theorem](https://en.wikipedia.org/wiki/Implicit_function_theorem) (slightly abridged)

Let $F:R^{n+m} \rightarrow R^m$ be a continuously differentiable function and let $R^{n+m}$ have (compactly defined) coordinates $\langle \vec{x}, \vec{y} \rangle$. Fix a point $\langle \vec{a}, \vec{b} \rangle$ with $F(\vec{a}, \vec{b}) = \vec{0}$. Let $J_{F, \vec{y}}(\vec{a}, \vec{b})$ be the Jacobian restricted to *just* the $y$ variables. ($J$ is $m \times m$.) If this matrix has non-zero determinant (it is invertible), then there exists an open set $U$ containing $\vec{a}$ and a *unique* continuously differentiable function $G: U \subset R^n \rightarrow R^m$ such that $G(\vec{a}) = \vec{b}$ and $F(\vec{x}, G(\vec{x})) = 0$ for $\vec x$ in $U$. Moreover, the partial derivatives of $G$ are given by the matrix product:

$$
\frac{\partial G}{\partial x_j}(\vec{x}) = - [J_{F, \vec{y}}(\vec{x}, G(\vec{x}))]^{-1} \left[\frac{\partial F}{\partial x_j}(\vec{x}, G(\vec{x}))\right].
$$
:::
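The $n=m=1$ flavor of this formula reduces to $\partial z/\partial x = -F_x/F_z$. A numeric sanity check in Python, using the unit sphere $F(x,y,z)=x^2+y^2+z^2-1=0$ (a stand-in example of ours), where $z = \phi(x,y) = \sqrt{1-x^2-y^2}$ near the upper hemisphere:

```python
from math import sqrt

phi = lambda x, y: sqrt(1 - x**2 - y**2)

x0, y0 = 0.3, 0.4
z0 = phi(x0, y0)
h = 1e-6
# numeric partial of phi in x versus -F_x/F_z = -2x/(2z) = -x/z
numeric = (phi(x0 + h, y0) - phi(x0 - h, y0)) / (2 * h)
formula = -x0 / z0
```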
---
@@ -800,7 +800,7 @@ Vector-valued functions do not have multiplication or division defined for them,
|
||||
For the dot product, the combination $\vec{f}(t) \cdot \vec{g}(t)$ is a univariate function of $t$, so a derivative is well defined. Can it be represented in terms of the vector-valued functions? In terms of the component functions, we have this calculation, specific to $n=2$ but one that generalizes:

$$
\begin{align*}
\frac{d}{dt}(\vec{f}(t) \cdot \vec{g}(t)) &=
\frac{d}{dt}(f_1(t) g_1(t) + f_2(t) g_2(t))\\
&= \frac{d}{dt}\left(f_1(t) g_1(t)\right) + \frac{d}{dt}\left(f_2(t) g_2(t)\right)\\
&= f_1'(t) g_1(t) + f_2'(t) g_2(t) + f_1(t) g_1'(t) + f_2(t) g_2'(t)\\
&= \vec{f}'(t)\cdot \vec{g}(t) + \vec{f}(t) \cdot \vec{g}'(t).
\end{align*}
$$

This suggests that a product-rule formula applies to dot products.
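The rule can be spot-checked numerically with a central difference (the functions below are arbitrary illustrative choices):

```julia
# Numeric check of (f⃗ ⋅ g⃗)' = f⃗' ⋅ g⃗ + f⃗ ⋅ g⃗' at a single point.
using LinearAlgebra

f(t)  = [cos(t), sin(t)];  fp(t) = [-sin(t), cos(t)]
g(t)  = [t, t^2];          gp(t) = [1.0, 2t]

t, h = 0.5, 1e-6
lhs = (dot(f(t + h), g(t + h)) - dot(f(t - h), g(t - h))) / (2h)
rhs = dot(fp(t), g(t)) + dot(f(t), gp(t))
@assert isapprox(lhs, rhs; atol=1e-6)
```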
In summary, these two derivative formulas hold for vector-valued functions $R \rightarrow R^n$:

$$
\begin{align*}
(\vec{u} \cdot \vec{v})' &= \vec{u}' \cdot \vec{v} + \vec{u} \cdot \vec{v}',\\
(\vec{u} \times \vec{v})' &= \vec{u}' \times \vec{v} + \vec{u} \times \vec{v}'.
\end{align*}
$$

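The cross-product version can be checked the same way at a point (again with arbitrary illustrative functions):

```julia
# Numeric check of (u⃗ × v⃗)' = u⃗' × v⃗ + u⃗ × v⃗' at a single point.
using LinearAlgebra

u(t)  = [t, t^2, t^3];       up(t) = [1.0, 2t, 3t^2]
v(t)  = [cos(t), sin(t), t]; vp(t) = [-sin(t), cos(t), 1.0]

t, h = 1.2, 1e-6
lhs = (cross(u(t + h), v(t + h)) - cross(u(t - h), v(t - h))) / (2h)
rhs = cross(up(t), v(t)) + cross(u(t), vp(t))
@assert isapprox(lhs, rhs; atol=1e-5)
```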
##### Application. Circular motion and the tangent vector.

Combining, Newton states $\vec{a} = -(GM/r^2) \hat{x}$.

Now to show the first law. Consider $\vec{x} \times \vec{v}$. It is constant, as:

$$
\begin{align*}
(\vec{x} \times \vec{v})' &= \vec{x}' \times \vec{v} + \vec{x} \times \vec{v}'\\
&= \vec{v} \times \vec{v} + \vec{x} \times \vec{a}.
\end{align*}
$$

Both terms are $\vec{0}$, as $\vec{a}$ is parallel to $\vec{x}$ by the above, and clearly $\vec{v}$ is parallel to itself.
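The vanishing can be verified numerically at any state (made-up values below): with a central acceleration $\vec{a} = -(GM/r^3)\vec{x}$, the expression $\vec{v}\times\vec{v} + \vec{x}\times\vec{a}$ is $\vec{0}$ up to rounding.

```julia
# (x⃗ × v⃗)' = v⃗ × v⃗ + x⃗ × a⃗ vanishes for a central force, as a⃗ ∥ x⃗.
using LinearAlgebra

GM = 1.0
x = [1.0, 0.5, 0.0]            # illustrative position
v = [-0.3, 0.8, 0.0]           # illustrative velocity
a = -GM / norm(x)^3 * x        # central acceleration, parallel to x⃗

deriv = cross(v, v) + cross(x, a)
@assert norm(deriv) < 1e-12
```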

This says $\vec{x} \times \vec{v} = \vec{c}$ is a constant vector, meaning the motion stays in the plane orthogonal to $\vec{c}$.

Now, by differentiating $\vec{x} = r \hat{x}$ we have:

$$
\begin{align*}
\vec{v} &= \vec{x}'\\
&= (r\hat{x})'\\
&= r' \hat{x} + r \hat{x}',
\end{align*}
$$

and so

$$
\begin{align*}
\vec{c} &= \vec{x} \times \vec{v}\\
&= (r\hat{x}) \times (r'\hat{x} + r \hat{x}')\\
&= r^2 (\hat{x} \times \hat{x}').
\end{align*}
$$

From this, we can compute $\vec{a} \times \vec{c}$:

$$
\begin{align*}
\vec{a} \times \vec{c} &= (-\frac{GM}{r^2})\hat{x} \times r^2(\hat{x} \times \hat{x}')\\
&= -GM \hat{x} \times (\hat{x} \times \hat{x}') \\
&= GM (\hat{x} \times \hat{x}')\times \hat{x}.
\end{align*}
$$

The last line follows from the anti-commutativity of the cross product.

But the triple cross product can be simplified through the identity $(\vec{u}\times\vec{v})\times\vec{w} = (\vec{u}\cdot\vec{w})\vec{v} - (\vec{v}\cdot\vec{w})\vec{u}$. So the above becomes:

$$
\begin{align*}
\vec{a} \times \vec{c} &= GM ((\hat{x}\cdot\hat{x})\hat{x}' - (\hat{x} \cdot \hat{x}')\hat{x})\\
&= GM (1 \hat{x}' - 0 \hat{x}).
\end{align*}
$$
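The triple cross product identity used above can be spot-checked with arbitrary vectors:

```julia
# Check (u⃗ × v⃗) × w⃗ = (u⃗⋅w⃗) v⃗ - (v⃗⋅w⃗) u⃗ for one set of vectors.
using LinearAlgebra

u = [1.0, 2.0, 3.0]
v = [-1.0, 0.5, 2.0]
w = [0.2, -0.4, 1.0]

lhs = cross(cross(u, v), w)
rhs = dot(u, w) * v - dot(v, w) * u
@assert isapprox(lhs, rhs; atol=1e-12)
```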
Now, since $\vec{c}$ is constant, we have:

$$
\begin{align*}
(\vec{v} \times \vec{c})' &= \vec{a} \times \vec{c}\\
&= GM \hat{x}'\\
&= (GM\hat{x})'.
\end{align*}
$$

The two sides have the same derivative, hence differ by a constant vector, $\vec{d}$:

$$
\vec{v} \times \vec{c} = GM\hat{x} + \vec{d}.
$$

As $\vec{x}$ and $\vec{v}\times\vec{c}$ lie in the same plane (orthogonal to $\vec{c}$), so does $\vec{d}$; let $\theta$ denote the angle between $\hat{x}$ and $\vec{d}$.
Now

$$
\begin{align*}
c^2 &= \|\vec{c}\|^2 \\
&= \vec{c} \cdot \vec{c}\\
&= \vec{c} \cdot (\vec{x} \times \vec{v})\\
&= \vec{x} \cdot (\vec{v} \times \vec{c})\\
&= (r\hat{x}) \cdot (GM\hat{x} + \vec{d})\\
&= GMr + r \hat{x} \cdot \vec{d}\\
&= GMr + rd \cos(\theta).
\end{align*}
$$

Solving, this gives the first law. That is, the radial distance is in the form of an ellipse:

$$
r = \frac{c^2}{GM + d\cos(\theta)} = \frac{c^2/(GM)}{1 + (d/(GM))\cos(\theta)}.
$$
In [Arc length](../integrals/arc_length.html) there is a discussion of how to find the arc length of a parameterized curve in $2$ dimensions. The general case is discussed by [Destafano](https://randomproofs.files.wordpress.com/2010/11/arc_length.pdf) who shows:
::: {.callout-note icon=false}
## Arc-length

If a curve $C$ is parameterized by a smooth function $\vec{r}(t)$ over an interval $I$, then the arc length of $C$ is:

$$
\int_I \| \vec{r}'(t) \| dt.
$$
:::
If we associate $\vec{r}'(t)$ with the velocity, then this is the integral of the speed (the magnitude of the velocity).
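For example, the helix $\vec{r}(t) = \langle \cos(t), \sin(t), t\rangle$ has constant speed $\|\vec{r}'(t)\| = \sqrt{2}$, so its length over $[0, 2\pi]$ should be $2\sqrt{2}\pi$; a sketch assuming the `QuadGK` package for the quadrature:

```julia
# Arc length of the helix over [0, 2π] by integrating the speed.
using LinearAlgebra, QuadGK

rp(t) = [-sin(t), cos(t), 1.0]          # r⃗'(t)
len, err = quadgk(t -> norm(rp(t)), 0, 2pi)
@assert isapprox(len, 2 * sqrt(2) * pi; atol=1e-8)
```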

As before, but further: if $\kappa$ is the curvature and $\tau$ the torsion, these relationships express the derivatives with respect to $s$ in terms of the components in the frame:

$$
\begin{align*}
\hat{T}'(s) &= &\kappa \hat{N}(s) &\\
\hat{N}'(s) &= -\kappa \hat{T}(s) & &+ \tau \hat{B}(s)\\
\hat{B}'(s) &= &-\tau \hat{N}(s) &
\end{align*}
$$
These are the [Frenet-Serret](https://en.wikipedia.org/wiki/Frenet%E2%80%93Serret_formulas) formulas.
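The first formula can be checked numerically for the standard helix $\vec{r}(t) = \langle\cos(t),\sin(t),t\rangle$ (an example introduced here for illustration), for which $\kappa = 1/2$ and the speed $ds/dt = \sqrt{2}$ is constant:

```julia
# Check T̂'(s) = κ N̂(s) for the helix, using a central difference and
# the chain rule dT̂/ds = (dT̂/dt)/(ds/dt).
using LinearAlgebra

T(t) = [-sin(t), cos(t), 1.0] / sqrt(2)   # unit tangent
t, h = 0.7, 1e-6
dTds = (T(t + h) - T(t - h)) / (2h) / sqrt(2)

κ = norm(dTds)                            # curvature
N = dTds / κ                              # principal unit normal
@assert isapprox(κ, 1/2; atol=1e-5)
@assert isapprox(N, [-cos(t), -sin(t), 0.0]; atol=1e-4)
```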
Levi and Tabachnikov prove in their Proposition 2.4:

$$
\begin{align*}
\kappa(u) &= \frac{d\alpha(u)}{du} + \frac{\sin(\alpha(u))}{a},\\
|\frac{dv}{du}| &= |\cos(\alpha)|, \quad \text{and}\\
k &= \frac{\tan(\alpha)}{a}.
\end{align*}
$$
The first equation relates the steering angle with the curvature. If the steering angle is not changed ($d\alpha/du=0$) then the curvature is constant and the motion is circular. It will be greater for larger angles (up to $\pi/2$). As the curvature is the reciprocal of the radius, this means the radius of the circular trajectory will be smaller. For the same constant steering angle, the curvature will be smaller for longer wheelbases, meaning the circular trajectory will have a larger radius. For cars, which have similar dynamics, this means longer wheelbase cars will take more room to make a U-turn.

The last equation relates the curvature of the back wheel track to the steering angle.

To derive the first one, we have previously noted that when a curve is parameterized by arc length, the curvature is more directly computed: it is the magnitude of the derivative of the tangent vector. The tangent vector has unit length when parameterized by arc length, which implies its derivative is orthogonal to it. If $\vec{r}(t)$ is a parameterization by arc length, then the curvature formula simplifies as:

$$
\begin{align*}
\kappa(s) &= \frac{\| \vec{r}'(s) \times \vec{r}''(s) \|}{\|\vec{r}'(s)\|^3} \\
&= \frac{\| \vec{r}'(s) \times \vec{r}''(s) \|}{1} \\
&= \| \vec{r}'(s) \| \| \vec{r}''(s) \| \sin(\theta) \\
&= 1 \cdot \| \vec{r}''(s) \| \cdot 1 = \| \vec{r}''(s) \|.
\end{align*}
$$

So in the above, the curvature is $\kappa = \| \vec{F}''(u) \|$ and $k = \|\vec{B}''(v)\|$.
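A quick numeric check with the unit circle parameterized by arc length, $\vec{r}(s) = \langle\cos(s),\sin(s)\rangle$, whose curvature is $1$ everywhere:

```julia
# κ(s) = ‖r⃗''(s)‖ for a unit-speed parameterization; the second
# derivative is approximated by a central second difference.
using LinearAlgebra

r(s) = [cos(s), sin(s)]
s, h = 1.1, 1e-4
rpp = (r(s + h) - 2 * r(s) + r(s - h)) / h^2
@assert isapprox(norm(rpp), 1.0; atol=1e-4)
```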
It must be that the tangent line of $\vec{B}$ is parallel to $\vec{U} \cos(\alpha) + \vec{V} \sin(\alpha)$. To utilize this, we differentiate $\vec{B}$ using the facts that $\vec{U}' = -\kappa \vec{V}$ and $\vec{V}' = \kappa \vec{U}$. These come from $\vec{U} = \vec{F}'$, whose derivative in $u$ has magnitude equal to the curvature, $\kappa$, and direction orthogonal to $\vec{U}$.

$$
\begin{align*}
\vec{B}'(u) &= \vec{F}'(u)
-a \vec{U}' \cos(\alpha) -a \vec{U} (-\sin(\alpha)) \alpha'
-a \vec{V}' \sin(\alpha) - a \vec{V} \cos(\alpha) \alpha' \\
&= \vec{U} + a \kappa \vec{V} \cos(\alpha) + a \vec{U} \sin(\alpha) \alpha' -
a \kappa \vec{U} \sin(\alpha) - a \vec{V} \cos(\alpha) \alpha' \\
&= \vec{U}
+ a(\alpha' - \kappa) \sin(\alpha) \vec{U}
- a(\alpha' - \kappa) \cos(\alpha)\vec{V}.
\end{align*}
$$

Extend the $2$-dimensional vectors to $3$ dimensions, by adding a zero $z$ component, then:

$$
\begin{align*}
\vec{0} &= (\vec{U}
+ a(\alpha' - \kappa) \sin(\alpha) \vec{U}
- a(\alpha' - \kappa) \cos(\alpha)\vec{V}) \times (\vec{U} \cos(\alpha) + \vec{V} \sin(\alpha)) \\
&= \sin(\alpha) \vec{U} \times \vec{V} +
a(\alpha'-\kappa) \sin^2(\alpha) \vec{U} \times \vec{V} -
a(\alpha'-\kappa) \cos(\alpha)\vec{V} \times \vec{U} \cos(\alpha) \\
&= (\sin(\alpha) +
a(\alpha'-\kappa) \sin^2(\alpha) +
a(\alpha'-\kappa) \cos^2(\alpha)) \vec{U} \times \vec{V} \\
&= (\sin(\alpha) + a (\alpha' - \kappa)) \vec{U} \times \vec{V}.
\end{align*}
$$

The terms $\vec{U} \times\vec{U}$ and $\vec{V}\times\vec{V}$ are $\vec{0}$, by properties of the cross product. As $\vec{U}\times\vec{V}$ is non-zero, the scalar factor must be $0$, or
As for the second equation, from the expression for $\vec{B}'(u)$, after setting $a(\alpha'-\kappa) = -\sin(\alpha)$:

$$
\begin{align*}
\|\vec{B}'(u)\|^2
&= \| (1 -\sin(\alpha)\sin(\alpha)) \vec{U} +\sin(\alpha)\cos(\alpha) \vec{V} \|^2\\
&= \| \cos^2(\alpha) \vec{U} + \sin(\alpha)\cos(\alpha) \vec{V} \|^2\\
&= \cos^4(\alpha) + \sin^2(\alpha)\cos^2(\alpha)\\
&= \cos^2(\alpha)(\cos^2(\alpha) + \sin^2(\alpha))\\
&= \cos^2(\alpha).
\end{align*}
$$

From this, $\|\vec{B}'(u)\| = |\cos(\alpha)|$. But $1 = \|d\vec{B}/dv\| = \|d\vec{B}/du \| \cdot |du/dv|$, and so $|dv/du|=|\cos(\alpha)|$ follows.

Consider a parameterization of a curve by arc-length, $\vec\gamma(s) = \langle u(s), v(s) \rangle$.
Consider two nearby points $t$ and $t+\epsilon$ and the intersection of $l_t$ and $l_{t+\epsilon}$. That is, we need points $a$ and $b$ with: $l_t(a) = l_{t+\epsilon}(b)$. Setting the components equal, this is:

$$
\begin{align*}
u(t) - av'(t) &= u(t+\epsilon) - bv'(t+\epsilon) \\
v(t) + au'(t) &= v(t+\epsilon) + bu'(t+\epsilon).
\end{align*}
$$

This is a system of two linear equations in the two unknowns ($a$ and $b$), which can be solved for the value of `a`.
Letting $\epsilon \rightarrow 0$ we get an expression for $a$ that will describe the evolute at time $t$ in terms of the function $\gamma$. Looking at the expression above, we can see that dividing the *numerator* by $\epsilon$ and taking a limit will yield $u'(t)^2 + v'(t)^2$. If the *denominator* has a limit after dividing by $\epsilon$, then we can find the description sought. Pursuing this leads to:

$$
\begin{align*}
\frac{u'(t) v'(t+\epsilon) - v'(t) u'(t+\epsilon)}{\epsilon}
&= \frac{u'(t) v'(t+\epsilon) -u'(t)v'(t) + u'(t)v'(t)- v'(t) u'(t+\epsilon)}{\epsilon} \\
&= \frac{u'(t)(v'(t+\epsilon) -v'(t))}{\epsilon} + \frac{(u'(t)- u'(t+\epsilon))v'(t)}{\epsilon},
\end{align*}
$$

which in the limit will give $u'(t)v''(t) - u''(t) v'(t)$. All told, in the limit as $\epsilon \rightarrow 0$ we get

$$
\begin{align*}
a &= \frac{u'(t)^2 + v'(t)^2}{u'(t)v''(t) - v'(t) u''(t)} \\
&= 1/(\|\vec\gamma'\|\kappa) \\
&= 1/(\|\hat{T}\|\kappa) \\
&= 1/\kappa,
\end{align*}
$$

with $\kappa$ being the curvature of the planar curve. That is, the evolute of $\vec\gamma$ is described by:

$$
\vec\beta(t) = \vec\gamma(t) + \frac{1}{\kappa(t)}\hat{N}(t).
$$

We computed the above illustration using $3$ dimensions (hence the use of `[1:2]...`), as the curvature formula is easier to express. Recall, the curvature also appears in the [Frenet-Serret](https://en.wikipedia.org/wiki/Frenet%E2%80%93Serret_formulas) formulas: $d\hat{T}/ds = \kappa \hat{N}$ and $d\hat{N}/ds = -\kappa \hat{T}+ \tau \hat{B}$. For a planar curve, as under consideration, the torsion $\tau$ is $0$. This allows the computation of $\vec\beta(s)'$:

$$
\begin{align*}
\vec{\beta}' &= \frac{d(\vec\gamma + (1/ \kappa) \hat{N})}{ds}\\
&= \hat{T} + (-\frac{\kappa '}{\kappa ^2}\hat{N} + \frac{1}{\kappa} \hat{N}')\\
&= \hat{T} - \frac{\kappa '}{\kappa ^2}\hat{N} + \frac{1}{\kappa} (-\kappa \hat{T})\\
&= - \frac{\kappa '}{\kappa ^2}\hat{N}.
\end{align*}
$$

We see $\vec\beta'$ is zero (the curve is non-regular) when $\kappa'(s) = 0$. The curvature changes from increasing to decreasing, or vice versa at each of the $4$ crossings of the major and minor axes - there are $4$ non-regular points, and we see $4$ cusps in the evolute.
If $\vec\gamma(s)$ is parameterized by arc length, then this simplifies quite a bit, as the unit tangent is just $\vec\gamma'(s)$ and the remaining arc length just $(s-a)$:

$$
\begin{align*}
\vec\beta_a(s) &= \vec\gamma(s) - \vec\gamma'(s) (s-a) \\
&=\vec\gamma(s) - \hat{T}_{\vec\gamma}(s)(s-a). \quad (a \text{ is the arc-length parameter})
\end{align*}
$$
With this characterization, several properties follow.
In the following we show that:

$$
\begin{align*}
\kappa_{\vec\beta_a}(s) &= 1/(s-a),\\
\hat{N}_{\vec\beta_a}(s) &= \hat{T}_{\vec\beta_a}'(s)/\|\hat{T}_{\vec\beta_a}'(s)\| = -\hat{T}_{\vec\gamma}(s).
\end{align*}
$$

The first shows in a different way that when $s=a$ the curve is not regular, as the curvature fails to exist. In the above figure, when the involute touches $\vec\gamma$, there will be a cusp.

With these two identifications and using $\vec\gamma'(s) = \hat{T}_{\vec\gamma}(s)$, the evolute simplifies to

$$
\begin{align*}
\vec\beta_a(s) + \frac{1}{\kappa_{\vec\beta_a}(s)}\hat{N}_{\vec\beta_a}(s)
&=
\vec\gamma(s) + \hat{T}_{\vec\gamma}(s)(s-a) + \frac{1}{1/(s-a)} (-\hat{T}_{\vec\gamma}(s)) \\
&= \vec\gamma(s).
\end{align*}
$$

That is, the evolute of an involute of $\vec\gamma(s)$ is $\vec\gamma(s)$.
We have:

$$
\begin{align*}
\vec\beta_a(s) &= \vec\gamma(s) - \vec\gamma'(s)(s-a)\\
\vec\beta_a'(s) &= -\kappa_{\vec\gamma}(s)(s-a)\hat{N}_{\vec\gamma}(s)\\
\vec\beta_a''(s) &= (-\kappa_{\vec\gamma}(s)(s-a))' \hat{N}_{\vec\gamma}(s) + (-\kappa_{\vec\gamma}(s)(s-a))(-\kappa_{\vec\gamma}(s)\hat{T}_{\vec\gamma}(s)),
\end{align*}
$$

the last line by the Frenet-Serret formulas for *planar* curves, which show $\hat{T}'(s) = \kappa(s) \hat{N}(s)$ and $\hat{N}'(s) = -\kappa(s)\hat{T}(s)$.
To compute the curvature of $\vec\beta_a$, we need to compute both:

$$
\begin{align*}
\| \vec\beta_a' \|^3 &= |\kappa_{\vec\gamma}(s)^3 (s-a)^3|\\
\| \vec\beta_a' \times \vec\beta_a'' \| &= |\kappa_{\vec\gamma}(s)^3 (s-a)^2|,
\end{align*}
$$

the last line using both $\hat{N}\times\hat{N} = \vec{0}$ and $\|\hat{N}\times\hat{T}\| = 1$. The curvature then is $\kappa_{\vec\beta_a}(s) = 1/(s-a)$.

The evolute comes from the formula $\vec\gamma(t) + (1/\kappa(t)) \hat{N}(t)$. For hand computation, this formula can be explicitly given by two components $\langle X(t), Y(t) \rangle$ through:

$$
\begin{align*}
r(t) &= x'(t)^2 + y'(t)^2\\
k(t) &= x'(t)y''(t) - x''(t) y'(t)\\
X(t) &= x(t) - y'(t) r(t)/k(t)\\
Y(t) &= y(t) + x'(t) r(t)/k(t)
\end{align*}
$$

Let $\vec\gamma(t) = \langle t, t^2 \rangle = \langle x(t), y(t)\rangle$ be a parameterization of a parabola.
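Carrying out the component formulas for this parameterization ($x' = 1$, $y' = 2t$, $x'' = 0$, $y'' = 2$, so $r(t) = 1 + 4t^2$ and $k(t) = 2$) recovers the known evolute $\langle -4t^3, 3t^2 + 1/2\rangle$; a quick check:

```julia
# Evolute of the parabola ⟨t, t²⟩ from the component formulas.
X(t) = t   - 2t * (1 + 4t^2) / 2    # x(t) - y'(t) r(t)/k(t)
Y(t) = t^2 +      (1 + 4t^2) / 2    # y(t) + x'(t) r(t)/k(t)

t = 1.5
@assert X(t) ≈ -4t^3                # simplifies to X(t) = -4t³
@assert Y(t) ≈ 3t^2 + 1/2           # simplifies to Y(t) = 3t² + 1/2
```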
The left hand sides are in the form of a dot product, in this case $\langle a,b \rangle \cdot \langle x, y\rangle$ and $\langle a,b,c \rangle \cdot \langle x, y, z\rangle$ respectively. When there is a system of equations, something like:

$$
\begin{align*}
3x &+ 4y &- 5z &= 10\\
3x &- 5y &+ 7z &= 11\\
-3x &+ 6y &+ 9z &= 12,
\end{align*}
$$

then we might think of $3$ vectors $\langle 3,4,-5\rangle$, $\langle 3,-5,7\rangle$, and $\langle -3,6,9\rangle$ being dotted with $\langle x,y,z\rangle$. Mathematically, matrices and their associated algebra are used to represent this. In this example, the system of equations above would be represented by a matrix and two vectors:
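As a sketch of that representation in Julia, the rows of the matrix hold the coefficient vectors and `\` solves the system:

```julia
# The system above as A*x⃗ = b⃗; each row of A is a coefficient vector
# dotted with ⟨x, y, z⟩.
A = [ 3  4 -5
      3 -5  7
     -3  6  9]
b = [10, 11, 12]

xyz = A \ b                 # solve the linear system
@assert A * xyz ≈ b         # the solution satisfies every equation
```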