em dash; sentence case

jverzani
2025-07-27 15:26:00 -04:00
parent c3b221cd29
commit 33c6e62d68
59 changed files with 385 additions and 243 deletions


@@ -184,7 +184,7 @@ plot(exp(-1/2)*exp(x^2/2), x0, 2)
plot!(xs, ys)
```
Not bad. We wouldn't expect this to be exact - due to the concavity of the solution, each step is an underestimate. However, we see it is an okay approximation and would likely be better with a smaller $h$. A topic we pursue in just a bit.
Not bad. We wouldn't expect this to be exact---due to the concavity of the solution, each step is an underestimate. However, we see it is an okay approximation and would likely be better with a smaller $h$. A topic we pursue in just a bit.
Rather than type in the above command each time, we wrap it all up in a function. The inputs are $n$, $a=x_0$, $b=x_n$, $y_0$, and, most importantly, $F$. The output is massaged into a function through a call to `linterp`, rather than returned as two vectors. The `linterp` function[^Interpolations] we define below just finds a function that linearly interpolates between the points and is `NaN` outside of the range of the $x$ values:
@@ -263,7 +263,7 @@ Each step introduces an error. The error in one step is known as the *local trun
The total error, or more commonly, *global truncation error*, is the error between the actual answer and the approximate answer at the end of the process. It reflects an accumulation of these local errors. This error is *bounded* by a constant times $h$. Since it gets smaller as $h$ gets smaller in direct proportion, the Euler method is called *first order*.
Other, somewhat more complicated, methods have global truncation errors that involve higher powers of $h$ - that is for the same size $h$, the error is smaller. In analogy is the fact that Riemann sums have error that depends on $h$, whereas other methods of approximating the integral have smaller errors. For example, Simpson's rule had error related to $h^4$. So, the Euler method may not be employed if there is concern about total resources (time, computer, ...), it is important for theoretical purposes in a manner similar to the role of the Riemann integral.
Other, somewhat more complicated, methods have global truncation errors that involve higher powers of $h$---that is, for the same size $h$, the error is smaller. An analogy is the fact that Riemann sums have error that depends on $h$, whereas other methods of approximating the integral have smaller errors; for example, Simpson's rule has error related to $h^4$. So, while the Euler method may not be employed if there is concern about total resources (time, computer, ...), it is important for theoretical purposes in a manner similar to the role of the Riemann integral.
In the examples, we will see that for many problems the simple Euler method is satisfactory, but not always so. The task of numerically solving differential equations is not one-size-fits-all. In the following, a few different modifications to the basic Euler method are presented, but this just scratches the surface of the topic.
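As a concrete sketch of the stepping scheme described above (the function name and the `F(y, x)` argument order are assumptions for illustration, not the text's `linterp`-returning version):

```julia
# Euler's method for y' = F(y, x), y(a) = y0, using n steps of size h = (b - a)/n.
function euler_sketch(F, a, b, y0, n)
    h = (b - a) / n
    xs = range(a, b, length = n + 1)
    ys = zeros(n + 1)
    ys[1] = y0
    for i in 1:n
        ys[i + 1] = ys[i] + h * F(ys[i], xs[i])  # one first-order Euler step
    end
    xs, ys
end

# y' = y with y(0) = 1 has solution e^x; by convexity each step underestimates.
xs, ys = euler_sketch((y, x) -> y, 0, 1, 1.0, 1000)
```

Halving $h$ roughly halves the global error, consistent with the method being first order.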
@@ -648,7 +648,7 @@ plot(euler2(x0, xn, y0, yp0, 360), 0, 4T)
plot!(x -> pi/4*cos(sqrt(g/l)*x), 0, 4T)
```
Even now, we still see that something seems amiss, though the issue is not as dramatic as before. The oscillatory nature of the pendulum is seen, but in the Euler solution, the amplitude grows, which would necessarily mean energy is being put into the system. A familiar instance of a pendulum would be a child on a swing. Without pumping the legs - putting energy in the system - the height of the swing's arc will not grow. Though we now have oscillatory motion, this growth indicates the solution is still not quite right. The issue is likely due to each step mildly overcorrecting and resulting in an overall growth. One of the questions pursues this a bit further.
Even now, we still see that something seems amiss, though the issue is not as dramatic as before. The oscillatory nature of the pendulum is seen, but in the Euler solution, the amplitude grows, which would necessarily mean energy is being put into the system. A familiar instance of a pendulum would be a child on a swing. Without pumping the legs---putting energy in the system---the height of the swing's arc will not grow. Though we now have oscillatory motion, this growth indicates the solution is still not quite right. The issue is likely due to each step mildly overcorrecting and resulting in an overall growth. One of the questions pursues this a bit further.
## Questions
@@ -794,7 +794,7 @@ Modify the `euler2` function to implement the Euler-Cromer method. What do you s
#| hold: true
#| echo: false
choices = [
"The same as before - the amplitude grows",
"The same as before---the amplitude grows",
"The solution is identical to that of the approximation found by linearization of the sine term",
"The solution has a constant amplitude, but its period is slightly *shorter* than that of the approximate solution found by linearization",
"The solution has a constant amplitude, but its period is slightly *longer* than that of the approximate solution found by linearization"]


@@ -149,7 +149,7 @@ $$
U'(t) = -r U(t), \quad U(0) = U_0.
$$
This shows that the rate of change of $U$ depends on $U$. Large positive values indicate a negative rate of change - a push back towards the origin, and large negative values of $U$ indicate a positive rate of change - again, a push back towards the origin. We shouldn't be surprised to either see a steady decay towards the origin, or oscillations about the origin.
This shows that the rate of change of $U$ depends on $U$. Large positive values indicate a negative rate of change---a push back towards the origin, and large negative values of $U$ indicate a positive rate of change---again, a push back towards the origin. We shouldn't be surprised to either see a steady decay towards the origin, or oscillations about the origin.
What will we find? This equation is different from the previous two equations, as the function $U$ appears on both sides. However, we can rearrange to get:
@@ -177,7 +177,7 @@ $$
In words, the initial difference in temperature between the object and the environment exponentially decays to $0$.
That is, as $t > 0$ goes to $\infty$, the right hand will go to $0$ for $r > 0$, so $T(t) \rightarrow T_a$ - the temperature of the object will reach the ambient temperature. The rate of this is largest when the difference between $T(t)$ and $T_a$ is largest, so when objects are cooling the statement "hotter things cool faster" is appropriate.
That is, as $t$ goes to $\infty$, the right-hand side will go to $0$ for $r > 0$, so $T(t) \rightarrow T_a$---the temperature of the object will reach the ambient temperature. This rate is largest when the difference between $T(t)$ and $T_a$ is largest, so when objects are cooling the statement "hotter things cool faster" is appropriate.
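With the values used just below ($T_0=200$, $T_a=72$, $r=1/2$), this limiting behavior is easy to check numerically (a sketch):

```julia
# Newton's law of cooling solution: the difference T(t) - Ta decays exponentially.
T(t; T0 = 200, Ta = 72, r = 1/2) = Ta + (T0 - Ta) * exp(-r * t)
```

For large $t$, `T(t)` is essentially the ambient temperature `Ta`.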
A graph of the solution for $T_0=200$ and $T_a=72$ and $r=1/2$ is made as follows. We've added a few line segments from the defining formula, and see that they are indeed tangent to the solution found for the differential equation.
@@ -403,7 +403,7 @@ To finish, we call `dsolve` to find a solution (if possible):
out = dsolve(eqn)
```
This answer - to a first-order equation - has one free constant, `C₁`, which can be solved for from an initial condition. We can see that when $a > 0$, as $x$ goes to positive infinity the solution goes to $1$, and when $x$ goes to negative infinity, the solution goes to $0$ and otherwise is trapped in between, as expected.
This answer---to a first-order equation---has one free constant, `C₁`, which can be solved for from an initial condition. We can see that when $a > 0$, as $x$ goes to positive infinity the solution goes to $1$, and when $x$ goes to negative infinity, the solution goes to $0$ and otherwise is trapped in between, as expected.
The limits are confirmed by investigating the limits of the right-hand side:
@@ -618,6 +618,7 @@ nothing
```
![The cables of an unloaded suspension bridge have a different shape than a loaded suspension bridge. As seen, the cables in this [figure](https://www.brownstoner.com/brooklyn-life/verrazano-narrows-bridge-anniversary-historic-photos/) would be modeled by a catenary.](./figures/verrazano-narrows-bridge-anniversary-historic-photos-2.jpeg)
---
@@ -641,7 +642,7 @@ $$
x''(t) = 0, \quad y''(t) = -g.
$$
That is, the $x$ position - where no forces act - has $0$ acceleration, and the $y$ position - where the force of gravity acts - has constant acceleration, $-g$, where $g=9.8m/s^2$ is the gravitational constant. These equations can be solved to give:
That is, the $x$ position---where no forces act---has $0$ acceleration, and the $y$ position---where the force of gravity acts---has constant acceleration, $-g$, where $g=9.8m/s^2$ is the gravitational constant. These equations can be solved to give:
$$
@@ -957,7 +958,7 @@ radioq(choices, answ)
##### Question
The example with projectile motion in a medium has a parameter $\gamma$ modeling the effect of air resistance. If `y` is the answer - as would be the case if the example were copy-and-pasted in - what can be said about `limit(y, gamma=>0)`?
The example with projectile motion in a medium has a parameter $\gamma$ modeling the effect of air resistance. If `y` is the answer---as would be the case if the example were copy-and-pasted in---what can be said about `limit(y, gamma=>0)`?
```{julia}
@@ -966,7 +967,7 @@ The example with projectile motion in a medium has a parameter $\gamma$ modeling
choices = [
"The limit is a quadratic polynomial in `x`, mirroring the first part of that example.",
"The limit does not exist, but the limit to `oo` gives a quadratic polynomial in `x`, mirroring the first part of that example.",
"The limit does not exist -- there is a singularity -- as seen by setting `gamma=0`."
"The limit does not exist---there is a singularity---as seen by setting `gamma=0`."
]
answ = 1
radioq(choices, answ)


@@ -25,14 +25,16 @@ book:
page-footer: "Copyright 2022-25, John Verzani"
chapters:
- index.qmd
- part: basics.qmd
chapters:
- basics/calculator.qmd
- basics/variables.qmd
- basics/numbers_types.qmd
- basics/logical_expressions.qmd
- basics/vectors.qmd
- basics/ranges.qmd
- part: precalc.qmd
chapters:
- precalc/calculator.qmd
- precalc/variables.qmd
- precalc/numbers_types.qmd
- precalc/logical_expressions.qmd
- precalc/vectors.qmd
- precalc/ranges.qmd
- precalc/functions.qmd
- precalc/plotting.qmd
- precalc/transformations.qmd

quarto/basics.qmd (new file)

@@ -0,0 +1,3 @@
# Mathematical basics
This chapter introduces some mathematical basics and their counterparts within the `Julia` programming language.


@@ -0,0 +1,15 @@
[deps]
CalculusWithJulia = "a2e0e22d-7d4c-5312-9169-8b992201a882"
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
LaTeXStrings = "b964fa9f-0449-5b57-a5c2-d3ea65f4040f"
Logging = "56ddb016-857b-54e1-b83d-db4d58db5568"
Measures = "442fdcdd-2543-5da2-b0f3-8c86c306513e"
Mustache = "ffc61752-8dc7-55ee-8c37-f3e9cdd09e70"
PlotlyBase = "a03496cd-edff-5a9b-9e67-9cda94a718b5"
PlotlyKaleido = "f2990250-8cf9-495f-b13a-cce12b45703c"
Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80"
Primes = "27ebfcd6-29c5-5fa9-bf4b-fb8fc14df3ae"
QuizQuestions = "612c44de-1021-4a21-84fb-7261cf5eb2d4"
SymPy = "24249f21-da20-56a4-8eb1-6a02cf4ae2e6"
Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c"
TextWrap = "b718987f-49a8-5099-9789-dcd902bef87d"


@@ -0,0 +1 @@
../basics.qmd


@@ -244,7 +244,7 @@ With the Google Calculator, typing `1 + 2 x 3 =` will give the value $7$, but *i
In `Julia`, the entire expression is typed in before being evaluated, so the usual conventions of mathematics related to the order of operations may be used. These are colloquially summarized by the acronym [PEMDAS](http://en.wikipedia.org/wiki/Order_of_operations).
> **PEMDAS**. This acronym stands for Parentheses, Exponents, Multiplication, Division, Addition, Subtraction. The order indicates which operation has higher precedence, or should happen first. This isn't exactly the case, as "M" and "D" have the same precedence, as do "A" and "S". In the case of two operations with equal precedence, *associativity* is used to decide which to do. For the operations `-`, `/` the associativity is left to right, as in the left one is done first, then the right. However, `^` has right associativity, so `4^3^2` is `4^(3^2)` and not `(4^3)^2` (Be warned that some calculators - and spread sheets, such as Excel - will treat this expression with left associativity). But, `+` and `*` don't have associativity, so `1+2+3` can be `(1+2)+3` or `1+(2+3)`.
> **PEMDAS**. This acronym stands for Parentheses, Exponents, Multiplication, Division, Addition, Subtraction. The order indicates which operation has higher precedence, or should happen first. This isn't exactly the case, as "M" and "D" have the same precedence, as do "A" and "S". In the case of two operations with equal precedence, *associativity* is used to decide which to do. For the operations `-`, `/` the associativity is left to right, as in the left one is done first, then the right. However, `^` has right associativity, so `4^3^2` is `4^(3^2)` and not `(4^3)^2` (Be warned that some calculators---and spreadsheets, such as Excel---will treat this expression with left associativity). But, `+` and `*` don't have associativity, so `1+2+3` can be `(1+2)+3` or `1+(2+3)`.
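The associativity claims in this note can be checked directly:

```julia
# `^` associates right to left; `-` and `/` associate left to right.
@assert 4^3^2 == 4^(3^2) == 262144
@assert (4^3)^2 == 4096                  # what left associativity would produce
@assert 2 - 3 - 4 == (2 - 3) - 4 == -5
@assert 24 / 4 / 2 == (24 / 4) / 2 == 3.0
```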


quarto/basics/make_pdf.jl (new file)

@@ -0,0 +1,16 @@
module Make
# makefile for generating typst pdfs
# per directory usage
dir = "basics"
files = ("calculator",
"variables",
"numbers_types",
"logical_expressions",
"vectors",
"ranges",
)
include("../_make_pdf.jl")
main()
end


@@ -1,4 +1,4 @@
# Ranges and Sets
# Ranges and sets
{{< include ../_common_code.qmd >}}


@@ -231,7 +231,7 @@ The distinction between ``=`` versus `=` is important and one area where common
## Context
The binding of a value to a variable name happens within some context. When a variable is assigned or referenced, the scope of the variable -- the region of code where it is accessible -- is taken into consideration.
The binding of a value to a variable name happens within some context. When a variable is assigned or referenced, the scope of the variable---the region of code where it is accessible---is taken into consideration.
For our simple illustrations, we are assigning values, as though they were typed at the command line. This stores the binding in the `Main` module. `Julia` looks for variables in this module when it encounters an expression and the value is substituted. Other uses, such as when variables are defined within a function, involve different contexts which may not be visible within the `Main` module.


@@ -434,7 +434,7 @@ Tuples are fixed-length containers where there is no expectation or enforcement
While a vector is formed by placing comma-separated values within a `[]` pair (e.g., `[1,2,3]`), a tuple is formed by placing comma-separated values within a `()` pair. A tuple of length $1$ uses a convention of a trailing comma to distinguish it from a parenthesized expression (e.g. `(1,)` is a tuple, `(1)` is just the value `1`).
Vectors and tuples can appear at the same time: a vector of tuples--each of length $n$--can be used in plotting to specify points.
Vectors and tuples can appear at the same time: a vector of tuples---each of length $n$---can be used in plotting to specify points.
:::{.callout-note}
## Well, actually...
@@ -471,7 +471,7 @@ The values in a named tuple can be accessed using the "dot" notation:
nt.x1
```
Alternatively, the index notation--using a *symbol* for the name--can be used:
Alternatively, the index notation---using a *symbol* for the name---can be used:
```{julia}
nt[:x1]
@@ -608,7 +608,7 @@ There will be an error with `only` should the container not have just one elemen
### Mutating values
Vectors and matrices can have their elements changed or mutated; tuples can not. The process is similar to assignment--using an equals sign--but the left hand side has indexing notation to reference which values within the container are to be updated.
Vectors and matrices can have their elements changed or mutated; tuples cannot. The process is similar to assignment---using an equals sign---but the left-hand side has indexing notation to reference which values within the container are to be updated.
To change the last element of `v` to `0` we have:
@@ -653,7 +653,7 @@ Arrays, and hence vectors and matrices have an element type given by `eltype` (t
eltype(v), eltype(t), eltype(m)
```
(The element type of the tuple is `Int64`, but this is only because of this particular tuple. Tuples are typically heterogeneous containers--not homogeneous like vectors--and do not expect to have a common type. The `NTuple` type is for tuples with elements of the same type.)
(The element type of the tuple is `Int64`, but this is only because of this particular tuple. Tuples are typically heterogeneous containers---not homogeneous like vectors---and do not expect to have a common type. The `NTuple` type is for tuples with elements of the same type.)
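A small illustration of this distinction (the variable names are ours):

```julia
v = [1, 2, 3]     # a homogeneous vector
t = (1, 2, 3)     # a tuple that happens to be homogeneous
@assert eltype(v) == Int
@assert t isa NTuple{3, Int}             # same-type tuples match NTuple
@assert !((1, 2.0) isa NTuple{2, Int})   # a mixed tuple does not
```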
### Modifying the length of a container
@@ -665,7 +665,7 @@ Two key methods for queues are `push!` and `pop!`. We `push!` elements onto the
push!(v, 5)
```
The output is expected -- `5` was added to the end of `v`. What might not be expected is the underlying `v` is changed without assignment. (Actually `mutated`, the underlying container assigned to the symbol `v` is extended, not replaced.)
The output is expected---`5` was added to the end of `v`. What might not be expected is that the underlying `v` is changed without assignment. (Actually *mutated*: the underlying container assigned to the symbol `v` is extended, not replaced.)
:::{.callout-note}
## Trailing exclamation point convention
@@ -699,7 +699,7 @@ end
tot
```
The `for` loop construct is central in many programming languages; in `Julia` for loops are very performant and very flexible, however, they are more verbose than needed. (In the above example we had to initialize an accumulator and then write three lines for the loop, whereas `sum(v)` would do the same--and in this case more flexibly, with just a single call.) Alternatives are usually leveraged--we mention a few.
The `for` loop construct is central in many programming languages; in `Julia`, `for` loops are very performant and very flexible; however, they are more verbose than needed. (In the above example we had to initialize an accumulator and then write three lines for the loop, whereas `sum(v)` would do the same---and in this case more flexibly---with just a single call.) Alternatives are usually leveraged---we mention a few.
Iterating over a vector can be done by *value*, as above, or by *index*. For the latter the `eachindex` method creates an iterable for the indices of the container. For rectangular objects, like matrices, there are also many uses for `eachrow` and `eachcol`, though not in these notes.
@@ -847,7 +847,7 @@ The `map` function can also be used in combination with `reduce`, a reduction. R
sum(map(sin, xs))
```
This has a performance drawback--there are two passes through the container, one to apply `sin` another to add.
This has a performance drawback---there are two passes through the container, one to apply `sin` another to add.
The `mapreduce` function combines the map and reduce operations in one pass. It takes a third argument to reduce by in the second position. This is a *binary* operator. So this combination will map `sin` over `xs` and then add the results up:
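For instance, with the reducing operator `+` in the second position:

```julia
xs = [0.1, 0.2, 0.3]
two_pass = sum(map(sin, xs))       # two passes: apply sin, then add
one_pass = mapreduce(sin, +, xs)   # one pass over the container
@assert one_pass ≈ two_pass
```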
@@ -924,7 +924,7 @@ The latter using *splatting* to iterate over each value in `xs` and pass it to `
### Predicate functions
A few reductions work with *predicate* functions--those that return `true` or `false`. Let's use `iseven` as an example, which tests if a number is even.
A few reductions work with *predicate* functions---those that return `true` or `false`. Let's use `iseven` as an example, which tests if a number is even.
We can check if *all* the elements of a container are even or if *any* of the elements of a container are even with `all` and `any`:
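For example:

```julia
vs = [2, 4, 6, 7]
@assert all(iseven, [2, 4, 6])   # every element is even
@assert !all(iseven, vs)         # the 7 is not even
@assert any(iseven, vs)          # but at least one element is
```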


@@ -1,4 +1,4 @@
# Curve Sketching
# Curve sketching
{{< include ../_common_code.qmd >}}


@@ -1,4 +1,4 @@
# Implicit Differentiation
# Implicit differentiation
{{< include ../_common_code.qmd >}}


@@ -1,4 +1,4 @@
# L'Hospital's Rule
# L'Hospital's rule
{{< include ../_common_code.qmd >}}


@@ -1,4 +1,4 @@
# The mean value theorem for differentiable functions.
# The mean value theorem for differentiable functions
{{< include ../_common_code.qmd >}}


@@ -368,7 +368,7 @@ $$
So we should consider `f(x_n)` an *approximate zero* when it is on the scale of $f'(\alpha) \cdot \alpha \delta$. That $\alpha$ factor means we consider a *relative* tolerance for `f`.
> For checking if $f(x_n) \approx 0$ both a relative and absolute error should be used--the relative error involving the size of $x_n$.
> For checking if $f(x_n) \approx 0$ both a relative and absolute error should be used---the relative error involving the size of $x_n$.
A good condition to check if `f(x_n)` is small is
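One hypothetical form of such a mixed test (the name and default tolerances here are illustrative, not the text's):

```julia
# Combine an absolute floor with a relative term scaled by the size of xn.
is_approx_zero(fx, xn; atol = 1e-14, rtol = 1e-8) = abs(fx) <= max(atol, rtol * abs(xn))
```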


@@ -10,15 +10,6 @@ This section uses the `TermInterface` add-on package.
using TermInterface
```
```{julia}
#| echo: false
const frontmatter = (
title = "Symbolic derivatives",
description = "Calculus with Julia: Symbolic derivatives",
tags = ["CalculusWithJulia", "derivatives", "symbolic derivatives"],
);
```
---


@@ -1,4 +1,4 @@
# Taylor Polynomials and other Approximating Polynomials
# Taylor polynomials and other approximating polynomials
{{< include ../_common_code.qmd >}}
@@ -104,7 +104,7 @@ $$
tl(x) = f(c) + f'(c) \cdot(x - c).
$$
The key is the term multiplying $(x-c)$ -- for the secant line this is an approximation to the related term for the tangent line. That is, the secant line approximates the tangent line, which is the linear function that best approximates the function at the point $(c, f(c))$.
The key is the term multiplying $(x-c)$---for the secant line this is an approximation to the related term for the tangent line. That is, the secant line approximates the tangent line, which is the linear function that best approximates the function at the point $(c, f(c))$.
This is quantified by the *mean value theorem* which states under our assumptions on $f(x)$ that there exists some $\xi$ between $x$ and $c$ for which:
@@ -194,7 +194,7 @@ function divided_differences(f, x, xs...)
end
```
In the following--even though it is *type piracy*--by adding a `getindex` method, we enable the `[]` notation of Newton to work with symbolic functions, like `u()` defined below, which is used in place of $f$:
In the following---even though it is *type piracy*---by adding a `getindex` method, we enable the `[]` notation of Newton to work with symbolic functions, like `u()` defined below, which is used in place of $f$:
```{julia}
@@ -215,7 +215,7 @@ Now, let's look at:
ex₂ = u[c, c+h, c+2h]
```
If multiply by $2$ and simplify, a discrete approximation for the second derivative--the second order forward [difference equation](http://tinyurl.com/n4235xy)--is seen:
If we multiply by $2$ and simplify, a discrete approximation for the second derivative---the second-order forward [difference equation](http://tinyurl.com/n4235xy)---is seen:
```{julia}
simplify(2ex₂)
@@ -794,7 +794,7 @@ This is re-expressed as $2s + s \cdot p$ with $p$ given by:
```{julia}
cancel((a_b - 2s)/s)
p = cancel((a_b - 2s)/s)
```
Now, $2s = m - s\cdot m$, so the above can be reworked to be $\log(1+m) = m - s\cdot(m-p)$.
@@ -807,7 +807,7 @@ How big can the error be between this *approximations* and $\log(1+m)$? The expr
```{julia}
Max = (v/(2+v))(v => sqrt(2) - 1)
Max = (x/(2+x))(x => sqrt(2) - 1)
```
The error term is like $2/19 \cdot \xi^{19}$ which is largest at this value of $M$. Large is relative - it is really small:


@@ -1,4 +1,4 @@
# Matrix Calculus
# Matrix calculus
This section illustrates a more general setting for taking derivatives, one that unifies the different expositions taken prior.
@@ -74,7 +74,7 @@ Additionally, many other set of objects form vector spaces. Certain families of
Let's take differentiable functions as an example. These form a vector space as the derivative of a linear combination of differentiable functions is defined through the simplest derivative rule: $[af(x) + bg(x)]' = a[f(x)]' + b[g(x)]'$. If $f$ and $g$ are differentiable, then so is $af(x)+bg(x)$.
A finite vector space is described by a *basis* -- a minimal set of vectors needed to describe the space, after consideration of linear combinations. For some typical vector spaces, this is the set of special vectors with $1$ as one of the entries, and $0$ otherwise.
A finite vector space is described by a *basis*---a minimal set of vectors needed to describe the space, after consideration of linear combinations. For some typical vector spaces, this is the set of special vectors with $1$ as one of the entries, and $0$ otherwise.
A key fact about a basis for a finite vector space is every vector in the vector space can be expressed *uniquely* as a linear combination of the basis vectors. The set of numbers used in the linear combination, along with an order to the basis, means an element in a finite vector space can be associated with a unique coordinate vector.
@@ -88,7 +88,7 @@ Vectors and matrices have properties that are generalizations of the real number
* Viewing a vector as a matrix is possible. The association chosen here is common and is through a *column* vector.
* The *transpose* of a matrix comes by permuting the rows and columns. The transpose of a column vector is a row vector, so $v\cdot w = v^T w$, where we use a superscript $T$ for the transpose. The transpose of a product, is the product of the transposes -- reversed: $(AB)^T = B^T A^T$; the tranpose of a transpose is an identity operation: $(A^T)^T = A$; the inverse of a transpose is the tranpose of the inverse: $(A^{-1})^T = (A^T)^{-1}$.
* The *transpose* of a matrix comes by permuting the rows and columns. The transpose of a column vector is a row vector, so $v\cdot w = v^T w$, where we use a superscript $T$ for the transpose. The transpose of a product is the product of the transposes---reversed: $(AB)^T = B^T A^T$; the transpose of a transpose is an identity operation: $(A^T)^T = A$; the inverse of a transpose is the transpose of the inverse: $(A^{-1})^T = (A^T)^{-1}$.
* Matrices for which $A = A^T$ are called symmetric.
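These identities can be spot-checked numerically (a quick sketch, not from the text):

```julia
using LinearAlgebra
A = [1.0 2.0; 3.0 4.0]
B = [0.0 1.0; 1.0 1.0]
@assert (A * B)' ≈ B' * A'    # transpose of a product, reversed
@assert (A')' == A            # transpose of a transpose
@assert inv(A)' ≈ inv(A')     # inverse of a transpose
```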
@@ -231,7 +231,7 @@ Various differentiation rules are still available such as the sum, product, and
### Sum and product rules for the derivative
Using the differential notation -- which implicitly ignores higher order terms as they vanish in a limit -- the sum and product rules can be derived.
Using the differential notation---which implicitly ignores higher order terms as they vanish in a limit---the sum and product rules can be derived.
For the sum rule, let $f(x) = g(x) + h(x)$. Then
@@ -377,7 +377,7 @@ Multiplying left to right (the first) is called reverse mode; multiplying right
The reason comes down to the shape of the matrices. To see, we need to know that matrix multiplication of an $m \times q$ matrix times a $q \times n$ matrix takes an order of $mqn$ operations.
When $m=1$, the derviative is a product of matrices of size $n\times j$, $j\times k$, and $k \times 1$ yielding a matrix of size $n \times 1$ matching the function dimension.
When $m=1$, the derivative is a product of matrices of size $n\times j$, $j\times k$, and $k \times 1$ yielding a matrix of size $n \times 1$ matching the function dimension.
The operations involved in multiplying from left to right can be quantified. The first operation takes $njk$ operations, leaving an $n\times k$ matrix; the next multiplication then takes another $nk \cdot 1$ operations, or $njk + nk$ together.
@@ -435,7 +435,7 @@ That is $f'(A)$ is the operator $f'(A)[\delta A] = A \delta A + \delta A A$. (Th
Alternatively, we can identify $A$ through its
components, as a vector in $R^{n^2}$ and then leverage the Jacobian.
One such identification is vectorization -- consecutively stacking the
One such identification is vectorization---consecutively stacking the
column vectors into a single vector. In `Julia` the `vec` function does this
operation:
@@ -444,7 +444,7 @@ operation:
vec(A)
```
The stacking by column follows how `Julia` stores matrices and how `Julia` references a matrices entries by linear index:
The stacking by column follows how `Julia` stores matrices and how `Julia` references entries in a matrix by linear index:
```{julia}
vec(A) == [A[i] for i in eachindex(A)]
@@ -562,7 +562,7 @@ all(l == r for (l, r) ∈ zip(L, R))
----
Now to use this relationship to recognize $df = A dA + dA A$ with the Jacobian computed from $\text{vec}{f(a)}$.
Now to use this relationship to recognize $df = A dA + dA A$ with the Jacobian computed from $\text{vec}(f(a))$.
We have $\text{vec}(A dA + dA A) = \text{vec}(A dA) + \text{vec}(dA A)$, by obvious linearity of $\text{vec}$. Now, inserting an identity matrix, $I$, which is symmetric, in a useful spot, we have:
@@ -683,7 +683,7 @@ det(I + dA) - det(I)
## The adjoint method
The chain rule brings about a series of products. The adjoint method illustrated below, shows how to approach the computation of the series in a direction that minimizes the computational cost, illustrating why reverse mode is preferred to forward mode when a scalar function of several variables is considered.
The chain rule brings about a series of products. The adjoint method, illustrated by @BrightEdelmanJohnson and summarized below, shows how to approach the computation of the series in a direction that minimizes the computational cost, illustrating why reverse mode is preferred to forward mode when a scalar function of several variables is considered.
@BrightEdelmanJohnson consider the derivative of
@@ -778,9 +778,9 @@ Here $v$ can be solved for by taking adjoints (as before). Let $A = \partial h/\
## Second derivatives, Hessian
@CarlssonNikitinTroedssonWendt
We reference a theorem presented by [Carlsson, Nikitin, Troedsson, and Wendt](https://arxiv.org/pdf/2502.03070v1) for exposition with some modification
We reference a theorem presented by @CarlssonNikitinTroedssonWendt for exposition, with some modification:
::: {.callout-note appearance="minimal"}
Theorem 1. Let $f:X \rightarrow Y$, where $X,Y$ are finite dimensional *inner product* spaces with elements in $R$. Suppose $f$ is smooth (a certain number of derivatives). Then for each $x$ in $X$ there exists a unique linear operator, $f'(x)$, and a unique *bilinear* *symmetric* operator $f'': X \oplus X \rightarrow Y$ such that
@@ -804,7 +804,7 @@ $$
\begin{align*}
f(x + dx) &= f(x) +
\frac{\partial f}{\partial x_1} dx_1 + \frac{\partial f}{\partial x_2} dx_2\\
&+ \frac{1}{2}\left(
&{+} \frac{1}{2}\left(
\frac{\partial^2 f}{\partial x_1^2}dx_1^2 +
\frac{\partial^2 f}{\partial x_1 \partial x_2}dx_1dx_2 +
\frac{\partial^2 f}{\partial x_2^2}dx_2^2
@@ -832,7 +832,7 @@ $$
$H$ being the *Hessian* with entries $H_{ij} = \frac{\partial^2 f}{\partial x_i \partial x_j}$.
This formula -- $f(x+dx)-f(x) \approx f'(x)dx + dx^T H dx$ -- is valid for any $n$, showing $n=2$ was just for ease of notation when expressing in the coordinates and not as matrices.
This formula---$f(x+dx)-f(x) \approx f'(x)dx + dx^T H dx$---is valid for any $n$, showing $n=2$ was just for ease of notation when expressing in the coordinates and not as matrices.
By uniqueness, we have under these assumptions that the Hessian is *symmetric* and the expression $dx^T H dx$ is a *bilinear* form, which we can identify as $f''(x)[dx,dx]$.
@@ -909,24 +909,23 @@ $$
&= \left(
\text{det}(A) + \text{det}(A)\text{tr}(A^{-1}dA')
\right)
\text{tr}((A^{-1} - A^{-1}dA' A^{-1})dA) - \text{det}(A) \text{tr}(A^{-1}dA) \\
\text{tr}((A^{-1} - A^{-1}dA' A^{-1})dA)\\
&\quad{-} \text{det}(A) \text{tr}(A^{-1}dA) \\
&=
\text{det}(A) \text{tr}(A^{-1}dA)\\
&+ \text{det}(A)\text{tr}(A^{-1}dA')\text{tr}(A^{-1}dA) \\
&- \text{det}(A)\text{tr}(A^{-1}dA' A^{-1}dA)\\
&- \text{det}(A)\text{tr}(A^{-1}dA')\text{tr}(A^{-1}dA' A^{-1}dA)\\
&- \text{det}(A) \text{tr}(A^{-1}dA) \\
\textcolor{blue}{\text{det}(A) \text{tr}(A^{-1}dA)}\\
&\quad{+} \text{det}(A)\text{tr}(A^{-1}dA')\text{tr}(A^{-1}dA) \\
&\quad{-} \text{det}(A)\text{tr}(A^{-1}dA' A^{-1}dA)\\
&\quad{-} \textcolor{red}{\text{det}(A)\text{tr}(A^{-1}dA')\text{tr}(A^{-1}dA' A^{-1}dA)}\\
&\quad{-} \textcolor{blue}{\text{det}(A) \text{tr}(A^{-1}dA)} \\
&= \text{det}(A)\text{tr}(A^{-1}dA')\text{tr}(A^{-1}dA) - \text{det}(A)\text{tr}(A^{-1}dA' A^{-1}dA)\\
&+ \text{third order term}
&\quad{+} \textcolor{red}{\text{third order term}}
\end{align*}
$$
So, after dropping the third-order term, we see:
$$
\begin{align*}
f''(A)[dA,dA']
&= \text{det}(A)\text{tr}(A^{-1}dA')\text{tr}(A^{-1}dA)\\
&\quad - \text{det}(A)\text{tr}(A^{-1}dA' A^{-1}dA).
\end{align*}
= \text{det}(A)\text{tr}(A^{-1}dA')\text{tr}(A^{-1}dA) -
\text{det}(A)\text{tr}(A^{-1}dA' A^{-1}dA).
$$
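A numerical spot check of the first-order formula $f'(A)[dA] = \det(A)\,\text{tr}(A^{-1}dA)$ used in this derivation; the matrices below are arbitrary choices:

```{julia}
using LinearAlgebra

A  = [2.0 1.0; 0.5 3.0]          # an arbitrary invertible matrix
dA = [0.1 -0.2; 0.3 0.4]         # an arbitrary direction
h  = 1e-6

fd      = (det(A + h*dA) - det(A)) / h   # directional derivative, numerically
formula = det(A) * tr(A \ dA)            # det(A)·tr(A⁻¹dA)
abs(fd - formula)                        # small, up to finite-difference error
```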

View File

@@ -1,4 +1,4 @@
# Polar Coordinates and Curves
# Polar coordinates and curves
{{< include ../_common_code.qmd >}}
@@ -226,7 +226,7 @@ The folium has radial part $0$ when $\cos(\theta) = 0$ or $\sin(2\theta) = b/4a$
plot_polar(𝒂0..(pi/2-𝒂0), 𝒓)
```
The second - which is too small to appear in the initial plot without zooming in - with
The second---which is too small to appear in the initial plot without zooming in---with
```{julia}

View File

@@ -388,7 +388,7 @@ For a scalar function, Define a *level curve* as the solutions to the equations
contour(xsₛ, ysₛ, zzsₛ)
```
Were one to walk along one of the contour lines, then there would be no change in elevation. The areas of greatest change in elevation - basically the hills - occur where the different contour lines are closest. In this particular area, there is a river that runs from the upper right through to the lower left and this is flanked by hills.
Were one to walk along one of the contour lines, then there would be no change in elevation. The areas of greatest change in elevation---basically the hills---occur where the different contour lines are closest. In this particular area, there is a river that runs from the upper right through to the lower left and this is flanked by hills.
The $c$ values for the levels drawn may be specified through the `levels` argument:
@@ -636,7 +636,7 @@ This says, informally, for any scale about $L$ there is a "ball" about $C$ (not
In the univariate case, it can be useful to characterize a limit at $x=c$ as existing if *both* the left and right limits exist and the two are equal. Generalizing to $R^m$ leads to the intuitive idea that a limit exists when every continuous "path" approaching $C$ in the $x$-$y$ plane yields a limit, and all these limits agree. Let $\gamma$ describe the path, with $\lim_{s \rightarrow t}\gamma(s) = C$. Then $f \circ \gamma$ will be a univariate function. If $f$ has a limit, $L$, then this composition will also have the limit $L$ as $s \rightarrow t$. Conversely, if for *every* path this composition has the *same* limit, then $f$ will have a limit.
The "two path corollary" is a trick to show a limit does not exist - just find two paths where there is a limit, but they differ, then a limit does not exist in general.
The "two path corollary" is a trick to show a limit does not exist---just find two paths where there is a limit, but they differ, then a limit does not exist in general.
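A minimal sketch of the two path corollary using the standard example $f(x,y) = xy/(x^2+y^2)$, which has no limit at the origin:

```{julia}
f(x, y) = x * y / (x^2 + y^2)

# approach the origin along two different paths:
f(1e-8, 0.0)     # along the x-axis the limit is 0
f(1e-8, 1e-8)    # along y = x the limit is 1/2
```

Two paths, two different limiting values, so no limit exists at $(0,0)$.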
### Continuity of scalar functions
@@ -997,7 +997,7 @@ The figure suggests a potential geometric relationship between the gradient and
We see here how the gradient of $f$, $\nabla{f} = \langle f_{x_1}, f_{x_2}, \dots, f_{x_n} \rangle$, plays a similar role as the derivative does for univariate functions.
First, we consider the role of the derivative for univariate functions. The main characterization - the derivative is the slope of the line that best approximates the function at a point - is quantified by Taylor's theorem. For a function $f$ with a continuous second derivative:
First, we consider the role of the derivative for univariate functions. The main characterization---the derivative is the slope of the line that best approximates the function at a point---is quantified by Taylor's theorem. For a function $f$ with a continuous second derivative:
$$
@@ -1174,7 +1174,7 @@ atand(mean(slopes))
This seems about right for a generally uphill trail section, as this one is.
In the above example, the data is given in terms of a sample, not a functional representation. Suppose instead, the surface was generated by `f` and the path - in the $x$-$y$ plane - by $\gamma$. Then we could estimate the maximum and average steepness by a process like this:
In the above example, the data is given in terms of a sample, not a functional representation. Suppose instead, the surface was generated by `f` and the path---in the $x$-$y$ plane---by $\gamma$. Then we could estimate the maximum and average steepness by a process like this:
```{julia}

View File

@@ -918,7 +918,7 @@ zs = fₗ.(xs, ys)
scatter3d!(xs, ys, zs)
```
A contour plot also shows that some - and only one - extrema happens on the interior:
A contour plot also shows that some---and only one---extrema happens on the interior:
```{julia}
@@ -967,10 +967,10 @@ We confirm this by looking at the Hessian and noting $H_{11} > 0$:
Hₛ = subs.(hessian(exₛ, [x,y]), x=>xstarₛ[x], y=>xstarₛ[y])
```
As it occurs at $(\bar{x}, \bar{y})$ where $\bar{x} = (x_1 + x_2 + x_3)/3$ and $\bar{y} = (y_1+y_2+y_3)/3$ - the averages of the three values - the critical point is an interior point of the triangle.
As it occurs at $(\bar{x}, \bar{y})$ where $\bar{x} = (x_1 + x_2 + x_3)/3$ and $\bar{y} = (y_1+y_2+y_3)/3$---the averages of the three values---the critical point is an interior point of the triangle.
As mentioned by Strang, the real problem is to minimize $d_1 + d_2 + d_3$. A direct approach with `SymPy` - just replacing `d2` above with the square root fails. Consider instead the gradient of $d_1$, say. To avoid square roots, this is taken implicitly from $d_1^2$:
As mentioned by Strang, the real problem is to minimize $d_1 + d_2 + d_3$. A direct approach with `SymPy`---just replacing `d2` above with the square root---fails. Consider instead the gradient of $d_1$, say. To avoid square roots, this is taken implicitly from $d_1^2$:
$$
@@ -1016,7 +1016,7 @@ psₛₗ = [a*u for (a,u) in zip(asₛ₁, usₛ)]
plot!(polygon(psₛₗ)...)
```
Let's see where the minimum distance point is by constructing a plot. The minimum must be on the boundary, as the only point where the gradient vanishes is the origin, not in the triangle. The plot of the triangle has a contour plot of the distance function, so we see clearly that the minimum happens at the point `[0.5, -0.866025]`. On this plot, we drew the gradient at some points along the boundary. The gradient points in the direction of greatest increase - away from the minimum. That the gradient vectors have a non-zero projection onto the edges of the triangle in a direction pointing away from the point indicates that the function `d` would increase if moved along the boundary in that direction, as indeed it does.
Let's see where the minimum distance point is by constructing a plot. The minimum must be on the boundary, as the only point where the gradient vanishes is the origin, not in the triangle. The plot of the triangle has a contour plot of the distance function, so we see clearly that the minimum happens at the point `[0.5, -0.866025]`. On this plot, we drew the gradient at some points along the boundary. The gradient points in the direction of greatest increase---away from the minimum. That the gradient vectors have a non-zero projection onto the edges of the triangle in a direction pointing away from the point indicates that the function `d` would increase if moved along the boundary in that direction, as indeed it does.
```{julia}
@@ -1064,7 +1064,7 @@ The smallest value is when $t=0$ or $t=1$, so at one of the points, as `li` is d
##### Example: least squares
We know that two points determine a line. What happens when there are more than two points? This is common in statistics where a bivariate data set (pairs of points $(x,y)$) are summarized through a linear model $\mu_{y|x} = \alpha + \beta x$, That is the average value for $y$ given a particular $x$ value is given through the equation of a line. The data is used to identify what the slope and intercept are for this line. We consider a simple case - $3$ points. The case of $n \geq 3$ being similar.
We know that two points determine a line. What happens when there are more than two points? This is common in statistics where a bivariate data set (pairs of points $(x,y)$) is summarized through a linear model $\mu_{y|x} = \alpha + \beta x$. That is, the average value for $y$ given a particular $x$ value is given through the equation of a line. The data is used to identify what the slope and intercept are for this line. We consider a simple case---$3$ points---the case of $n \geq 3$ being similar.
We have a line $l(x) = \alpha + \beta x$ and three points $(x_1, y_1)$, $(x_2, y_2)$, and $(x_3, y_3)$. Unless these three points *happen* to be collinear, they can't possibly all lie on the same line. So to *approximate* a relationship by a line requires some inexactness. One measure of inexactness is the *vertical* distance to the line:
@@ -1118,7 +1118,7 @@ As found, the formulas aren't pretty. If $x_1 + x_2 + x_3 = 0$ they simplify. Fo
subs(outₗₛ[β], sum(xₗₛ) => 0)
```
Let $\vec{x} = \langle x_1, x_2, x_3 \rangle$ and $\vec{y} = \langle y_1, y_2, y_3 \rangle$ this is simply $(\vec{x} \cdot \vec{y})/(\vec{x}\cdot \vec{x})$, a formula that will generalize to $n > 3$. The assumption is not a restriction - it comes about by subtracting the mean, $\bar{x} = (x_1 + x_2 + x_3)/3$, from each $x$ term (and similarly subtract $\bar{y}$ from each $y$ term). A process called "centering."
Let $\vec{x} = \langle x_1, x_2, x_3 \rangle$ and $\vec{y} = \langle y_1, y_2, y_3 \rangle$; then this is simply $(\vec{x} \cdot \vec{y})/(\vec{x}\cdot \vec{x})$, a formula that will generalize to $n > 3$. The assumption is not a restriction---it comes about by subtracting the mean, $\bar{x} = (x_1 + x_2 + x_3)/3$, from each $x$ term (and similarly subtracting $\bar{y}$ from each $y$ term), a process called "centering."
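A sketch of this centering computation with made-up data (the three points are arbitrary):

```{julia}
xs = [1.0, 2.0, 4.0]               # arbitrary illustrative data
ys = [2.0, 3.0, 7.0]

xbar, ybar = sum(xs)/3, sum(ys)/3
u, v = xs .- xbar, ys .- ybar      # centered data; sum(u) ≈ 0

β = sum(u .* v) / sum(u .* u)      # slope: (u⃗⋅v⃗)/(u⃗⋅u⃗)
α = ybar - β * xbar                # intercept recovered from the means
```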
With this observation, the formulas can be re-expressed through:
@@ -1587,7 +1587,7 @@ $$
G(\epsilon_1, \epsilon_2) = L.
$$
Now, Lagrange's method can be employed. This will be fruitful - even though we know the answer - it being $\epsilon_1 = \epsilon_2 = 0$!
Now, Lagrange's method can be employed. This will be fruitful---even though we know the answer---it being $\epsilon_1 = \epsilon_2 = 0$!
Forging ahead, we compute $\nabla{F}$ and $\lambda \nabla{G}$ and set $\epsilon_1 = \epsilon_2 = 0$ where the two are equal. This will lead to a description of $y$ in terms of $y'$.

View File

@@ -111,7 +111,7 @@ Plot of a vector field from $R^2 \rightarrow R^2$ illustrated by drawing curves
To the plot, we added the partial derivatives with respect to $r$ (in red) and with respect to $\theta$ (in blue). These are found with the soon-to-be discussed Jacobian. From the graph, you can see that these vectors are tangent vectors to the drawn curves.
The curves form a non-rectangular grid. Were the cells exactly parallelograms, the area would be computed taking into account the length of the vectors and the angle between them -- the same values that come out of a cross product.
The curves form a non-rectangular grid. Were the cells exactly parallelograms, the area would be computed taking into account the length of the vectors and the angle between them---the same values that come out of a cross product.
## Parametrically defined surfaces
@@ -323,7 +323,7 @@ plt = plot_axes()
We are using the vector of tuples interface (representing points) to specify the curve to draw.
Now we add on some curves for fixed $t$ and then fixed $\theta$ utilizing the fact that `project` returns a tuple of $x$--$y$ values to display.
Now we add on some curves for fixed $t$ and then fixed $\theta$ utilizing the fact that `project` returns a tuple of $x$-$y$ values to display.
```{julia}
for t in range(t₀, tₙ, 20)
@@ -1225,9 +1225,9 @@ q = interpolate(vcat(basic_conditions, new_conds))
plot_q_level_curve(q;layout=(1,2))
```
For this shape, if $b$ increases away from $b_0$, the secant line connecting $(a_0,0)$ and $(b, f(b)$ will have a negative slope, but there are no points nearby $x=c_0$ where the derivative has a tangent line with negative slope, so the continuous function is only on the left side of $b_0$. Mathematically, as $f$ is increasing $c_0$ -- as $f'''(c_0) = 3 > 0$ -- and $f$ is decreasing at $f(b_0)$ -- as $f'(b_0) = -1 < 0$, the signs alone suggest the scenario. The contour plot reveals, not one, but two one-sided functions of $b$ giving $c$.
For this shape, if $b$ increases away from $b_0$, the secant line connecting $(a_0,0)$ and $(b, f(b))$ will have a negative slope, but there are no points near $x=c_0$ where the derivative has a tangent line with negative slope, so the continuous function is only on the left side of $b_0$. Mathematically, as $f$ is increasing at $c_0$---as $f'''(c_0) = 3 > 0$---and $f$ is decreasing at $b_0$---as $f'(b_0) = -1 < 0$---the signs alone suggest the scenario. The contour plot reveals, not one, but two one-sided functions of $b$ giving $c$.
----
---
Now to characterize all possibilities.
@@ -1291,7 +1291,7 @@ $$
Then $F(c, b) = g_1(b) - g_2(c)$.
By construction, $g_2(c_0) = 0$ and $g_2^{(k)}(c_0) = f^{(k+1)}(c_0)$.
Adjusting $f$ to have a vanishing second -- but not third -- derivative at $c_0$ means $g_2$ will satisfy the assumptions of the lemma assuming $f$ has at least four continuous derivatives (as all our example polynomials do).
Adjusting $f$ to have a vanishing second---but not third---derivative at $c_0$ means $g_2$ will satisfy the assumptions of the lemma assuming $f$ has at least four continuous derivatives (as all our example polynomials do).
As for $g_1$, we have by construction $g_1(b_0) = 0$. By differentiation we get a pattern for some constants $c_j = (j+1)(j+2)\cdots k$ with $c_k = 1$.

View File

@@ -981,7 +981,7 @@ $$
\vec{v} \times \vec{c} = GM \hat{x} + \vec{d}.
$$
As $\vec{x}$ and $\vec{v}\times\vec{c}$ lie in the same plane - orthogonal to $\vec{c}$ - so does $\vec{d}$. With a suitable re-orientation, so that $\vec{d}$ is along the $x$ axis, $\vec{c}$ is along the $z$-axis, then we have $\vec{c} = \langle 0,0,c\rangle$ and $\vec{d} = \langle d ,0,0 \rangle$, and $\vec{x} = \langle x, y, 0 \rangle$. Set $\theta$ to be the angle, then $\hat{x} = \langle \cos(\theta), \sin(\theta), 0\rangle$.
As $\vec{x}$ and $\vec{v}\times\vec{c}$ lie in the same plane---orthogonal to $\vec{c}$---so does $\vec{d}$. With a suitable re-orientation, so that $\vec{d}$ is along the $x$ axis, $\vec{c}$ is along the $z$-axis, then we have $\vec{c} = \langle 0,0,c\rangle$ and $\vec{d} = \langle d ,0,0 \rangle$, and $\vec{x} = \langle x, y, 0 \rangle$. Set $\theta$ to be the angle, then $\hat{x} = \langle \cos(\theta), \sin(\theta), 0\rangle$.
Now
@@ -1662,7 +1662,7 @@ $$
The first equation relates the steering angle with the curvature. If the steering angle is not changed ($d\alpha/du=0$) then the curvature is constant and the motion is circular. It will be greater for larger angles (up to $\pi/2$). As the curvature is the reciprocal of the radius, this means the radius of the circular trajectory will be smaller. For the same constant steering angle, the curvature will be smaller for longer wheelbases, meaning the circular trajectory will have a larger radius. For cars, which have similar dynamics, this means longer wheelbase cars will take more room to make a U-turn.
The second equation may be interpreted in ratio of arc lengths. The infinitesimal arc length of the rear wheel is proportional to that of the front wheel only scaled down by $\cos(\alpha)$. When $\alpha=0$ - the bike is moving in a straight line - and the two are the same. At the other extreme - when $\alpha=\pi/2$ - the bike must be pivoting on its rear wheel and the rear wheel has no arc length. This cosine, is related to the speed of the back wheel relative to the speed of the front wheel, which was used in the initial differential equation.
The second equation may be interpreted as a ratio of arc lengths. The infinitesimal arc length of the rear wheel is proportional to that of the front wheel, only scaled down by $\cos(\alpha)$. When $\alpha=0$---the bike is moving in a straight line---the two are the same. At the other extreme---when $\alpha=\pi/2$---the bike must be pivoting on its rear wheel and the rear wheel has no arc length. This cosine is related to the speed of the back wheel relative to the speed of the front wheel, which was used in the initial differential equation.
The last equation, relates the curvature of the back wheel track to the steering angle of the front wheel. When $\alpha=\pm\pi/2$, the rear-wheel curvature, $k$, is infinite, resulting in a cusp (no circle with non-zero radius will approximate the trajectory). This occurs when the front wheel is steered orthogonal to the direction of motion. As was seen in previous graphs of the trajectories, a cusp can happen for quite regular front wheel trajectories.
@@ -1875,7 +1875,7 @@ $$
$$
We see $\vec\beta'$ is zero (the curve is non-regular) when $\kappa'(s) = 0$. The curvature changes from increasing to decreasing, or vice versa at each of the $4$ crossings of the major and minor axes - there are $4$ non-regular points, and we see $4$ cusps in the evolute.
We see $\vec\beta'$ is zero (the curve is non-regular) when $\kappa'(s) = 0$. The curvature changes from increasing to decreasing, or vice versa, at each of the $4$ crossings of the major and minor axes---there are $4$ non-regular points, and we see $4$ cusps in the evolute.
The curve parameterized by $\vec{r}(t) = 2(1 - \cos(t)) \langle \cos(t), \sin(t)\rangle$ over $[0,2\pi]$ is a cardioid. It is formed by rolling a circle of radius $r$ around another similarly sized circle. The following graphically shows the evolute is a smaller cardioid (one-third the size). For fun, the evolute of the evolute is drawn:

View File

@@ -81,7 +81,7 @@ $$
\| \vec{v} \| = \sqrt{ v_1^2 + v_2^2 + \cdots + v_n^2}.
$$
The definition of a norm leads to a few properties. First, if $c$ is a scalar, $\| c\vec{v} \| = |c| \| \vec{v} \|$ - which says scalar multiplication by $c$ changes the length by $|c|$. (Sometimes, scalar multiplication is described as "scaling by....") The other property is an analog of the triangle inequality, in which for any two vectors $\| \vec{v} + \vec{w} \| \leq \| \vec{v} \| + \| \vec{w} \|$. The right hand side is equal only when the two vectors are parallel.
The definition of a norm leads to a few properties. First, if $c$ is a scalar, $\| c\vec{v} \| = |c| \| \vec{v} \|$---which says scalar multiplication by $c$ changes the length by $|c|$. (Sometimes, scalar multiplication is described as "scaling by....") The other property is an analog of the triangle inequality, in which for any two vectors $\| \vec{v} + \vec{w} \| \leq \| \vec{v} \| + \| \vec{w} \|$. The right hand side is equal only when the two vectors are parallel.
A vector with length $1$ is called a *unit* vector. Dividing a non-zero vector by its norm will yield a unit vector, a consequence of the first property above. Unit vectors are often written with a "hat": $\hat{v}$.
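These properties are easy to verify numerically; a small sketch with arbitrary vectors:

```{julia}
using LinearAlgebra

v = [3.0, 4.0]
w = [1.0, 2.0]

norm(v)                            # 5.0, as √(3² + 4²)
norm(-2v) == 2 * norm(v)           # ‖cv‖ = |c|‖v‖
norm(v + w) <= norm(v) + norm(w)   # triangle inequality
norm(v / norm(v))                  # a unit vector has norm 1
```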
@@ -234,7 +234,7 @@ A simple example might be to add up a sequence of numbers. A direct way might be
x1, x2, x3, x4, x5, x6 = 1, 2, 3, 4, 5, 6
x1 + x2 + x3 + x4 + x5 + x6
```
Someone doesn't need to know `Julia`'s syntax to guess what this computes, save for the idiosyncratic tuple assignment used, which could have been bypassed at the cost of even more typing.
A more efficient means to do this, as each component isn't named, would be to store the data in a container:
@@ -267,7 +267,7 @@ These two functions are *reductions*. There are others, such as `maximum` and `m
reduce(+, xs; init=0) # sum(xs)
```
or
or
```{julia}
reduce(*, xs; init=1) # prod(xs)
@@ -289,9 +289,9 @@ and
foldr(=>, xs)
```
Next, we do a slightly more complicated problem.
Next, we do a slightly more complicated problem.
Recall the distance formula between two points, also called the *norm*. It is written here with the square root on the other side: $d^2 = (x_1-y_1)^2 + (x_0 - y_0)^2$. This computation can be usefully generalized to higher dimensional points (with $n$ components each).
Recall the distance formula between two points, also called the *norm*. It is written here with the square root on the other side: $d^2 = (x_1-y_1)^2 + (x_0 - y_0)^2$. This computation can be usefully generalized to higher dimensional points (with $n$ components each).
This first example shows how the value for $d^2$ can be found using broadcasting and `sum`:
@@ -309,10 +309,10 @@ This formula is a sum after applying an operation to the paired off values. Usin
sum((xi - yi)^2 for (xi, yi) in zip(xs, ys))
```
The `zip` function, used above, produces an iterator over tuples of the paired off values in the two (or more) containers passed to it.
The `zip` function, used above, produces an iterator over tuples of the paired off values in the two (or more) containers passed to it.
This pattern -- where a reduction follows a function's application to the components -- is implemented in `mapreduce`.
This pattern---where a reduction follows a function's application to the components---is implemented in `mapreduce`.
```{julia}
@@ -337,7 +337,7 @@ mapreduce((xi,yi) -> (xi-yi)^2, +, xs, ys)
At times, extracting all but the first or last value can be of interest. For example, a polygon comprised of $n$ points (the vertices), might be stored using a vector for the $x$ and $y$ values with an additional point that mirrors the first. Here are the points:
```{julia}
xs = [1, 3, 4, 2]
xs = [1, 3, 4, 2]
ys = [1, 1, 2, 3]
pts = zip(xs, ys) # recipe for [(x1,y1), (x2,y2), (x3,y3), (x4,y4)]
```
@@ -392,7 +392,7 @@ The `take` method could be used to remove the padded value from the `xs` and `ys
##### Example: Riemann sums
In the computation of a Riemann sum, the interval $[a,b]$ is partitioned using $n+1$ points $a=x_0 < x_1 < \cdots < x_{n-1} < x_n = b$.
In the computation of a Riemann sum, the interval $[a,b]$ is partitioned using $n+1$ points $a=x_0 < x_1 < \cdots < x_{n-1} < x_n = b$.
```{julia}
a, b, n = 0, 1, 4
@@ -414,7 +414,7 @@ sum(f ∘ first, partitions)
```
This uses a few things: like `mapreduce`, `sum` allows a function to
be applied to each element in the `partitions` collection. (Indeed, the default method to compute `sum(xs)` for an arbitrary container resolves to `mapreduce(identity, add_sum, xs)` where `add_sum` is basically `+`.)
be applied to each element in the `partitions` collection. (Indeed, the default method to compute `sum(xs)` for an arbitrary container resolves to `mapreduce(identity, add_sum, xs)` where `add_sum` is basically `+`.)
In this case, the
values come as tuples to the function to apply to each component.
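Putting the pieces together, a left Riemann sum can be sketched as follows (the integrand and interval are chosen for illustration):

```{julia}
f(x) = x^2                           # illustrative integrand
a, b, n = 0, 1, 10_000
xs = range(a, b, length=n + 1)       # the n+1 partition points
h = (b - a) / n                      # equal-width subintervals

left_sum = h * sum(f, xs[1:end-1])   # ≈ ∫₀¹ x² dx = 1/3
```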
@@ -636,7 +636,7 @@ But the associative property does not make sense, as $(\vec{u} \cdot \vec{v}) \c
## Matrices
Algebraically, the dot product of two vectors - pair off by components, multiply these, then add - is a common operation. Take for example, the general equation of a line, or a plane:
Algebraically, the dot product of two vectors---pair off by components, multiply these, then add---is a common operation. Take for example, the general equation of a line, or a plane:
$$
@@ -764,7 +764,7 @@ Vectors are defined similarly. As they are identified with *column* vectors, we
```{julia}
𝒷 = [10, 11, 12] # not 𝒷 = [10 11 12], which would be a row vector.
a = [10, 11, 12] # not a = [10 11 12], which would be a row vector.
```
In `Julia`, entries in a matrix (or a vector) are stored in a container with a type wide enough accommodate each entry. In this example, the type is SymPy's `Sym` type:
@@ -822,7 +822,7 @@ We can then see how the system of equations is represented with matrices:
```{julia}
M * xs - 𝒷
M * xs - a
```
Here we use `SymPy` to verify the above:
@@ -899,7 +899,7 @@ and
```
:::{.callout-note}
## Note
The adjoint is defined *recursively* in `Julia`. In the `CalculusWithJulia` package, we overload the `'` notation for *functions* to yield a univariate derivative found with automatic differentiation. This can lead to problems: if we have a matrix of functions, `M`, and took the transpose with `M'`, then the entries of `M'` would be the derivatives of the functions in `M` - not the original functions. This is very much likely to not be what is desired. The `CalculusWithJulia` package commits **type piracy** here *and* abuses the generic idea for `'` in Julia. In general type piracy is very much frowned upon, as it can change expected behaviour. It is defined in `CalculusWithJulia`, as that package is intended only to act as a means to ease users into the wider package ecosystem of `Julia`.
The adjoint is defined *recursively* in `Julia`. In the `CalculusWithJulia` package, we overload the `'` notation for *functions* to yield a univariate derivative found with automatic differentiation. This can lead to problems: if we have a matrix of functions, `M`, and took the transpose with `M'`, then the entries of `M'` would be the derivatives of the functions in `M`---not the original functions. This is very much likely to not be what is desired. The `CalculusWithJulia` package commits **type piracy** here *and* abuses the generic idea for `'` in Julia. In general type piracy is very much frowned upon, as it can change expected behaviour. It is defined in `CalculusWithJulia`, as that package is intended only to act as a means to ease users into the wider package ecosystem of `Julia`.
:::
---
@@ -1081,7 +1081,7 @@ norm(u₂ × v₂)
---
This analysis can be extended to the case of 3 vectors, which - when not co-planar - will form a *parallelepiped*.
This analysis can be extended to the case of 3 vectors, which---when not co-planar---will form a *parallelepiped*.
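The volume of such a parallelepiped is the absolute value of the scalar triple product, computable either as a determinant or via the cross and dot products; a quick sketch with arbitrary vectors:

```{julia}
using LinearAlgebra

u = [1.0, 0.0, 0.0]
v = [1.0, 1.0, 0.0]
w = [1.0, 1.0, 1.0]

vol_det    = abs(det([u v w]))       # |det| of the matrix with columns u, v, w
vol_triple = abs(cross(u, v) ⋅ w)    # |(u × v) ⋅ w|
vol_det ≈ vol_triple                 # both give the volume, here 1.0
```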
```{julia}

View File

@@ -77,7 +77,9 @@ These notes may be compiled into a `pdf` file through Quarto. As the result is r
-->
To *contribute* -- say by suggesting additional topics, correcting a
mistake, or fixing a typo -- click the "Edit this page" link and join the list of [contributors](https://github.com/jverzani/CalculusWithJuliaNotes.jl/graphs/contributors). Thanks to all contributors and a *very* special thanks to `@fangliu-tju` for their careful and most-appreciated proofreading.
mistake, or fixing a typo -- click the "Edit this page" link and join the list of [contributors](https://github.com/jverzani/CalculusWithJuliaNotes.jl/graphs/contributors). Thanks to all contributors.
A *very* special thanks goes out to `@fangliu-tju` for their careful and most-appreciated proofreading and error spotting spread over a series of PRs.
## Running Julia

View File

@@ -1,4 +1,4 @@
# The Gradient, Divergence, and Curl
# The gradient, divergence, and curl
{{< include ../_common_code.qmd >}}

View File

@@ -1,4 +1,4 @@
# Line and Surface Integrals
# Line and surface integrals
{{< include ../_common_code.qmd >}}
@@ -340,7 +340,7 @@ W = integrate(F(r(t)) ⋅ T(r(t)), (t, 0, 2PI))
There are technical assumptions about curves and regions that are necessary for some statements to be made:
* Let $C$ be a [Jordan](https://en.wikipedia.org/wiki/Jordan_curve_theorem) curve - a non-self-intersecting continuous loop in the plane. Such a curve divides the plane into two regions, one bounded and one unbounded. The normal to a Jordan curve is assumed to be in the direction of the unbounded part.
* Let $C$ be a [Jordan](https://en.wikipedia.org/wiki/Jordan_curve_theorem) curve---a non-self-intersecting continuous loop in the plane. Such a curve divides the plane into two regions, one bounded and one unbounded. The normal to a Jordan curve is assumed to be in the direction of the unbounded part.
* Further, we will assume that our curves are *piecewise smooth*; that is, composed of finitely many smooth pieces, continuously connected.
* The region enclosed by a closed curve has an *interior*, $D$, which we assume is an *open* set (one for which every point in $D$ has some "ball" about it entirely within $D$ as well.)
* The region $D$ is *connected* meaning between any two points there is a continuous path in $D$ between the two points.
@@ -471,7 +471,7 @@ The flow integral is typically computed for a closed (Jordan) curve, measuring t
:::{.callout-note}
## Note
For a Jordan curve, the positive orientation of the curve is such that the normal direction (proportional to $\hat{T}'$) points away from the bounded interior. For a non-closed path, the choice of parameterization will determine the normal and the integral for flow across a curve is dependent - up to its sign - on this choice.
For a Jordan curve, the positive orientation of the curve is such that the normal direction (proportional to $\hat{T}'$) points away from the bounded interior. For a non-closed path, the choice of parameterization will determine the normal and the integral for flow across a curve is dependent---up to its sign---on this choice.
:::

View File

@@ -1,4 +1,4 @@
# Quick Review of Vector Calculus
# Quick review of vector calculus
{{< include ../_common_code.qmd >}}
@@ -133,7 +133,7 @@ $$
$$
The generalization to $n>2$ is clear - the partial derivative in $x_i$ is the derivative of $f$ when the *other* $x_j$ are held constant.
The generalization to $n>2$ is clear---the partial derivative in $x_i$ is the derivative of $f$ when the *other* $x_j$ are held constant.
This may be viewed as the derivative of the univariate function $(f\circ\vec{r})(t)$ where $\vec{r}(t) = p + t \hat{e}_i$, $\hat{e}_i$ being the unit vector of all $0$s except a $1$ in the $i$th component.
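This slicing view can be sketched numerically, assuming `ForwardDiff` is available (the function `f` is an arbitrary example):

```{julia}
using ForwardDiff

f(v) = v[1]^2 * v[2]                 # an arbitrary f: R² → R
p = [1.0, 2.0]
e1 = [1.0, 0.0]                      # the unit vector ê₁

g(t) = f(p + t * e1)                 # the univariate slice through p
ForwardDiff.derivative(g, 0.0)       # ∂f/∂x₁ at p: here 2x₁x₂ = 4.0
ForwardDiff.gradient(f, p)[1]        # agrees with the partial derivative
```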

View File

@@ -1,4 +1,4 @@
# Green's Theorem, Stokes' Theorem, and the Divergence Theorem
# Green's theorem, Stokes' theorem, and the divergence theorem
{{< include ../_common_code.qmd >}}
@@ -721,13 +721,13 @@ The fluid would flow along the blue (stream) lines. The red lines have equal pot
# https://en.wikipedia.org/wiki/Jiffy_Pop#/media/File:JiffyPop.jpg
imgfile ="figures/jiffy-pop.png"
caption ="""
The Jiffy Pop popcorn design has a top surface that is designed to expand to accommodate the popped popcorn. Viewed as a surface, the surface area grows, but the boundary - where the surface meets the pan - stays the same. This is an example that many different surfaces can have the same bounding curve. Stokes' theorem will relate a surface integral over the surface to a line integral about the bounding curve.
The Jiffy Pop popcorn design has a top surface that is designed to expand to accommodate the popped popcorn. Viewed as a surface, the surface area grows, but the boundary---where the surface meets the pan---stays the same. This is an example that many different surfaces can have the same bounding curve. Stokes' theorem will relate a surface integral over the surface to a line integral about the bounding curve.
"""
# ImageFile(:integral_vector_calculus, imgfile, caption)
nothing
```
![The Jiffy Pop popcorn design has a top surface that is designed to expand to accommodate the popped popcorn. Viewed as a surface, the surface area grows, but the boundary - where the surface meets the pan - stays the same. This is an example that many different surfaces can have the same bounding curve. Stokes' theorem will relate a surface integral over the surface to a line integral about the bounding curve.
![The Jiffy Pop popcorn design has a top surface that is designed to expand to accommodate the popped popcorn. Viewed as a surface, the surface area grows, but the boundary---where the surface meets the pan---stays the same. This is an example that many different surfaces can have the same bounding curve. Stokes' theorem will relate a surface integral over the surface to a line integral about the bounding curve.
](./figures/jiffy-pop.png)
Were the figure of Jiffy Pop popcorn animated, the foil surface would slowly expand due to the pressure of the popping popcorn until the popcorn was ready. However, the boundary would remain the same. Many different surfaces can have the same boundary. Take, for instance, the upper half unit sphere in $R^3$, which has the curve $x^2 + y^2 = 1$ as its boundary curve. This is the same boundary curve as that of the surface of the cone $z = 1 - (x^2 + y^2)$ that lies above the $x$-$y$ plane. It would also be the boundary curve of the surface formed by a Mickey Mouse glove if the collar were scaled and positioned onto the unit circle.
@@ -761,7 +761,7 @@ $$
$$
In terms of our expanding popcorn, the boundary integral - after accounting for cancellations, as in Green's theorem - can be seen as a microscopic sum of boundary integrals each of which is approximated by a term $\nabla\times{F}\cdot\hat{N} \Delta{S}$ which is viewed as a Riemann sum approximation for the the integral of the curl over the surface. The cancellation depends on a proper choice of orientation, but with that we have:
In terms of our expanding popcorn, the boundary integral---after accounting for cancellations, as in Green's theorem---can be seen as a microscopic sum of boundary integrals, each of which is approximated by a term $\nabla\times{F}\cdot\hat{N} \Delta{S}$, which is viewed as a Riemann sum approximation for the integral of the curl over the surface. The cancellation depends on a proper choice of orientation, but with that we have:
::: {.callout-note icon=false}
## Stokes' theorem

View File

@@ -557,7 +557,7 @@ Following (faithfully) [Kantorwitz and Neumann](https://www.researchgate.net/pub
@fig-kantorwitz-neumann is clearly of a concave down function. The asymmetry about the critical point will be seen to be a result of the derivative also being concave down. This asymmetry will be characterized in several different ways in the following, including showing that the arc length from $(a,0)$ to $(c,f(c))$ is longer than that from $(c,f(c))$ to $(b,0)$.
::: {#@fig-kantorwitz-neumann}
::: {#fig-kantorwitz-neumann}
```{julia}

View File

@@ -16,7 +16,7 @@ using Roots
---
![A jigsaw puzzle needs a certain amount of area to complete. For a traditional rectangular puzzle, this area is comprised of the sum of the areas for each piece. Decomposing a total area into the sum of smaller, known, ones--even if only approximate--is the basis of definite integration.](figures/jigsaw.png)
![A jigsaw puzzle needs a certain amount of area to complete. For a traditional rectangular puzzle, this area is comprised of the sum of the areas for each piece. Decomposing a total area into the sum of smaller, known, ones---even if only approximate---is the basis of definite integration.](figures/jigsaw.png)
The question of area has long fascinated human culture. As children, we learn early on the formulas for the areas of some geometric figures: a square is $b^2$, a rectangle $b\cdot h$, a triangle $1/2 \cdot b \cdot h$, and for a circle, $\pi r^2$. The area of a rectangle is often the intuitive basis for illustrating multiplication. The area of a triangle has been known for ages. Even complicated expressions, such as [Heron's](http://tinyurl.com/mqm9z) formula, which relates the area of a triangle to measurements from its perimeter, have been around for 2000 years. The formula for the area of a circle is also quite old. Wikipedia dates it as far back as the [Rhind](http://en.wikipedia.org/wiki/Rhind_Mathematical_Papyrus) papyrus from 1700 BC, with the approximation of $256/81$ for $\pi$.
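Heron's formula, mentioned above, computes the area from the three side lengths via the semi-perimeter; a quick sketch (the `heron` helper and the $3$-$4$-$5$ triangle are illustrative):

```{julia}
# Area of a triangle from side lengths a, b, c (Heron's formula)
function heron(a, b, c)
    s = (a + b + c) / 2                   # semi-perimeter
    sqrt(s * (s - a) * (s - b) * (s - c))
end
heron(3, 4, 5)   # the 3-4-5 right triangle has area 6.0
```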
@@ -1067,7 +1067,7 @@ plot!(zero)
We could add the signed area over $[0,1]$ to the above, but instead see a square of area $1$, a triangle with area $1/2$ and a triangle with signed area $-1$. The total is then $1/2$.
This figure--using equal sized axes--may make the above decomposition more clear:
This figure---using equal sized axes---may make the above decomposition more clear:
```{julia}
#| echo: false

View File

@@ -448,7 +448,7 @@ When doing problems by hand this latter style can often reduce the complications
Consider two overlapping circles, one with smaller radius. How much area is in the larger circle that is not in the smaller? The question came up on the `Julia` [discourse](https://discourse.julialang.org/t/is-there-package-or-method-to-calculate-certain-area-in-julia-symbolically-with-sympy/99751) discussion board. A solution, modified from an answer of `@rocco_sprmnt21`, follows.
Without losing too-much generality, we can consider the smaller circle to have radius $a$, the larger circle to have radius $b$ and centered at $(0,c)$.
We assume some overlap -- $a \ge c-b$, but not too much -- $c-b \ge 0$ or $0 \le c-b \le a$.
We assume some overlap---$a \ge c-b$, but not too much---$c-b \ge 0$ or $0 \le c-b \le a$.
```{julia}
@syms x::real y::real a::positive b::positive c::positive

View File

@@ -1,4 +1,4 @@
# Center of Mass
# Center of mass
{{< include ../_common_code.qmd >}}

View File

@@ -1,4 +1,4 @@
# Improper Integrals
# Improper integrals
{{< include ../_common_code.qmd >}}

View File

@@ -1,4 +1,4 @@
# Integration By Parts
# Integration by parts
{{< include ../_common_code.qmd >}}
@@ -116,7 +116,7 @@ $$
$B$ is similar with the roles of $u$ and $v$ reversed.
----
---
Informally, the integration by parts formula is sometimes seen as $\int udv = uv - \int v du$, as well can be somewhat confusingly written as:
@@ -382,7 +382,7 @@ Recall, just using *either* $x_i$ or $x_{i-1}$ for $c_i$ gives an error that is
This [proof](http://www.math.ucsd.edu/~ebender/20B/77_Trap.pdf) for the error estimate is involved, but is reproduced here, as it nicely integrates many of the theoretical concepts of integration discussed so far.
First, for convenience, we consider the interval $x_i$ to $x_i+h$. The actual answer over this is just $\int_{x_i}^{x_i+h}f(x) dx$. By a $u$-substitution with $u=x-x_i$ this becomes $\int_0^h f(t + x_i) dt$. For analyzing this we integrate once by parts using $u=f(t+x_i)$ and $dv=dt$. But instead of letting $v=t$, we choose to add--as is our prerogative--a constant of integration $A$, so $v=t+A$:
First, for convenience, we consider the interval $x_i$ to $x_i+h$. The actual answer over this is just $\int_{x_i}^{x_i+h}f(x) dx$. By a $u$-substitution with $u=x-x_i$ this becomes $\int_0^h f(t + x_i) dt$. For analyzing this we integrate once by parts using $u=f(t+x_i)$ and $dv=dt$. But instead of letting $v=t$, we choose to add---as is our prerogative---a constant of integration $A$, so $v=t+A$:
$$

View File

@@ -1,4 +1,4 @@
# Partial Fractions
# Partial fractions
{{< include ../_common_code.qmd >}}
@@ -14,7 +14,7 @@ using SymPy
Integration is facilitated when an antiderivative for $f$ can be found, as then definite integrals can be evaluated through the fundamental theorem of calculus.
However, despite differentiation being an algorithmic procedure, integration is not. There are "tricks" to try, such as substitution and integration by parts. These work in some cases--but not all!
However, despite differentiation being an algorithmic procedure, integration is not. There are "tricks" to try, such as substitution and integration by parts. These work in some cases---but not all!
However, there are classes of functions for which algorithms exist. For example, the `SymPy` `integrate` function mostly implements an algorithm that decides if an elementary function has an antiderivative. The [elementary](http://en.wikipedia.org/wiki/Elementary_function) functions include exponentials, their inverses (logarithms), trigonometric functions, their inverses, and powers, including $n$th roots. Not every elementary function will have an antiderivative comprised of (finite) combinations of elementary functions. The typical example is $e^{x^2}$, which has no simple antiderivative, despite its ubiquitousness.

View File

@@ -1,4 +1,4 @@
# Surface Area
# Surface area
{{< include ../_common_code.qmd >}}

View File

@@ -14,7 +14,7 @@ using LaTeXStrings
gr();
```
----
---
In the March 2003 issue of the College Mathematics Journal, Leon M Hall posed 12 questions related to the following figure:
@@ -80,7 +80,7 @@ zs = solve(f(x) ~ nl, x)
q = only(filter(!=(a), zs))
```
----
---
The first question is simply:
@@ -115,7 +115,7 @@ In the remaining examples we don't show the code by default.
:::
----
---
> 1b. The length of the line segment $PQ$
@@ -133,7 +133,7 @@ lseg = sqrt((f(a) - f(q))^2 + (a - q)^2);
```
----
---
> 2a. The horizontal distance between $P$ and $Q$
@@ -151,7 +151,7 @@ plot!([q₀, a₀], [f(a₀), f(a₀)], linewidth=5)
hd = a - q;
```
----
---
> 2b. The area of the parabolic segment
@@ -172,7 +172,7 @@ plot!(xs, ys, fill=(:green, 0.25, 0))
A = simplify(integrate(nl - f(x), (x, q, a)));
```
----
---
> 2c. The volume of the rotated solid formed by revolving the parabolic segment around the vertical line $k$ units to the right of $P$ or to the left of $Q$ where $k > 0$.
@@ -185,7 +185,7 @@ A = simplify(integrate(nl - f(x), (x, q, a)));
V = simplify(integrate(2PI*(nl-f(x))*(a - x + k),(x, q, a)));
```
----
---
> 3. The $y$ coordinate of the centroid of the parabolic segment
@@ -214,7 +214,7 @@ yₘ = integrate( (1//2) * (nl^2 - f(x)^2), (x, q, a)) / A
yₘ = simplify(yₘ);
```
----
---
> 4. The length of the arc of the parabola between $P$ and $Q$
@@ -233,7 +233,7 @@ p
L = integrate(sqrt(1 + fp(x)^2), (x, q, a));
```
----
---
> 5. The $y$ coordinate of the midpoint of the line segment $PQ$
@@ -254,7 +254,7 @@ p
mp = nl(x => (a + q)/2);
```
----
---
> 6. The area of the trapezoid bound by the normal line, the $x$-axis, and the vertical lines through $P$ and $Q$.
@@ -273,7 +273,7 @@ p
trap = 1//2 * (f(q) + f(a)) * (a - q);
```
----
---
> 7. The area bounded by the parabola and the $x$ axis and the vertical lines through $P$ and $Q$
@@ -295,7 +295,7 @@ p
pa = integrate(x^2, (x, q, a));
```
----
---
> 8. The area of the surface formed by revolving the arc of the parabola between $P$ and $Q$ around the vertical line through $P$
@@ -321,7 +321,7 @@ vv(x) = f(a - uu(x))
SA = 2PI * integrate(uu(x) * sqrt(diff(uu(x),x)^2 + diff(vv(x),x)^2), (x, q, a));
```
----
---
> 9. The height of the parabolic segment (i.e. the distance between the normal line and the tangent line to the parabola that is parallel to the normal line)
@@ -350,7 +350,7 @@ segment_height = sqrt((b-b)^2 + (f(b) - nl(x=>b))^2);
```
----
---
> 10. The volume of the solid formed by revolving the parabolic segment around the $x$-axis
@@ -371,7 +371,7 @@ end
Vₓ = integrate(pi * (nl^2 - f(x)^2), (x, q, a));
```
----
---
> 11. The area of the triangle bound by the normal line, the vertical line through $Q$ and the $x$-axis
@@ -392,7 +392,7 @@ plot!([p₀,q₀,q₀,p₀], [0,f(q₀),0,0];
triangle = 1/2 * f(q) * (a - f(a)/(-1/fp(a)) - q);
```
----
---
> 12. The area of the quadrilateral bound by the normal line, the tangent line, the vertical line through $Q$ and the $x$-axis
@@ -417,7 +417,7 @@ x₁,x₂,x₃,x₄ = (a,q,q,tl₀)
y₁, y₂, y₃, y₄ = (f(a), f(q), 0, 0)
quadrilateral = (x₁ - x₂)*(y₁ - y₃)/2 - (x₁ - x₃)*(y₁ - y₂)/2 + (x₁ - x₃)*(y₁ - y₄)/2 - (x₁ - x₄)*(y₁ - y₃)/2;
```
----
---
The answers appear here in sorted order, some given as approximate floating point values:

View File

@@ -315,7 +315,7 @@ find_zero(q, (5, 10))
::: {.callout-note}
### Between need not be near
Later, we will see more efficient algorithms to find a zero *near* a given guess. The bisection method finds a zero *between* two values of a bracketing interval. This interval need not be small. Indeed in many cases it can be infinite. For this particular problem, any interval like `(2,N)` will work as long as `N` is bigger than the zero and small enough that `q(N)` is finite *or* infinite *but* not `NaN`. (Basically, `q` must evaluate to a number with a sign. Here, the value of `q(Inf)` is `NaN` as it evaluates to the indeterminate `Inf - Inf`. But `q` is still not `NaN` for quite large numbers, such as `1e77`, as `x^4` can as big as `1e308` -- technically `floatmax(Float64)` -- and be finite.)
Later, we will see more efficient algorithms to find a zero *near* a given guess. The bisection method finds a zero *between* two values of a bracketing interval. This interval need not be small. Indeed, in many cases it can be infinite. For this particular problem, any interval like `(2,N)` will work as long as `N` is bigger than the zero and small enough that `q(N)` is finite *or* infinite *but* not `NaN`. (Basically, `q` must evaluate to a number with a sign. Here, the value of `q(Inf)` is `NaN` as it evaluates to the indeterminate `Inf - Inf`. But `q` is still not `NaN` for quite large numbers, such as `1e77`, as `x^4` can be as big as `1e308`---technically `floatmax(Float64)`---and be finite.)
:::
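A minimal illustration of the floating-point point being made; the polynomial below is a stand-in with the same large-$x$ behavior, not the `q` of the text:

```{julia}
# Inf - Inf is the indeterminate NaN, but large finite inputs are fine
q(x) = x^4 - x^3
isnan(q(Inf)), isfinite(q(1e77))  # (1e77)^4 == 1e308 < floatmax(Float64)
```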
@@ -840,7 +840,7 @@ plotly()
nothing
```
Figure illustrating absolute and relative minima for a function $f(x)$ over $I=[a,b]$. The leftmost point has a $y$ value, $f(a)$, which is an absolute maximum of $f(x)$ over $I$. The three points highlighted between $a$ and $b$ are all relative extrema. The first one is *also* the absolute minimum over $I$. The endpoint is not considered a relative maximum for technical reasons --- there is no interval around $b$, it being on the boundary of $I$.
Figure illustrating absolute and relative minima for a function $f(x)$ over $I=[a,b]$. The leftmost point has a $y$ value, $f(a)$, which is an absolute maximum of $f(x)$ over $I$. The three points highlighted between $a$ and $b$ are all relative extrema. The first one is *also* the absolute minimum over $I$. The endpoint is not considered a relative maximum for technical reasons---there is no interval around $b$, it being on the boundary of $I$.
:::

View File

@@ -689,7 +689,7 @@ c = 15/11
lim(h, c; n = 16)
```
(Though the graph and table do hint at something a bit odd -- the graph shows a blip, the table doesn't show values in the second column going towards a specific value.)
(Though the graph and table do hint at something a bit odd---the graph shows a blip, the table doesn't show values in the second column going towards a specific value.)
However, the limit in this case is $-\infty$ (or DNE), as there is an asymptote at $c=15/11$. The problem is that the asymptote due to the logarithm is extremely narrow and falls between floating point values to the left and right of $15/11$.

View File

@@ -1,7 +1,3 @@
---
engine: julia
---
# Precalculus Concepts
The mathematical topics in this chapter come from pre-calculus. However, much of the `Julia` usage needed for the rest of the notes is introduced.

View File

@@ -22,7 +22,7 @@ The family of exponential functions is used to model growth and decay. The famil
## Exponential functions
The family of exponential functions is defined by $f(x) = a^x, -\infty< x < \infty$ and $a > 0$. For $0 < a < 1$ these functions decay or decrease, for $a > 1$ the functions grow or increase, and if $a=1$ the function is constantly $1$.
The family of exponential functions is defined by $f(x) = a^x, -\infty< x < \infty$ and $a > 0$. For $0 < a < 1$ these functions decay or decrease, for $a > 1$ these functions grow or increase, and if $a=1$ the function is constantly $1$.
For a given $a$, defining $a^n$ for positive integers is straightforward, as it means multiplying $n$ copies of $a$. From this, for *integer powers*, the key properties of exponents---$a^x \cdot a^y = a^{x+y}$ and $(a^x)^y = a^{x \cdot y}$---are immediate consequences. For example with $x=3$ and $y=2$:
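One way to sketch this check, using an arbitrary base of $5$:

```{julia}
a = 5
a^3 * a^2 == a^(3 + 2)   # true
(a^3)^2 == a^(3 * 2)     # true
```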
@@ -114,7 +114,7 @@ t2, t8 = 72/2, 72/8
exp(r2*t2), exp(r8*t8)
```
So fairly close - after $72/r$ years the amount is $2.05...$ times more than the initial amount.
So fairly close---after $72/r$ years the amount is $2.05...$ times more than the initial amount.
##### Example
@@ -259,7 +259,7 @@ The inverse function will solve for $x$ in the equation $a^x = y$. The answer, f
That is $a^{\log_a(x)} = x$ for $x > 0$ and $\log_a(a^x) = x$ for all $x$.
To see how a logarithm is mathematically defined will have to wait, though the family of functions - one for each $a>0$ - are implemented in `Julia` through the function `log(a,x)`. There are special cases requiring just one argument: `log(x)` will compute the natural log, base $e$ - the inverse of $f(x) = e^x$; `log2(x)` will compute the log base $2$ - the inverse of $f(x) = 2^x$; and `log10(x)` will compute the log base $10$ - the inverse of $f(x)=10^x$. (Also `log1p` computes an accurate value of $\log(1 + p)$ when $p \approx 0$.)
How a logarithm is mathematically defined will have to wait, though the family of functions---one for each $a>0$---is implemented in `Julia` through the function `log(a,x)`. There are special cases requiring just one argument: `log(x)` will compute the natural log, base $e$---the inverse of $f(x) = e^x$; `log2(x)` will compute the log base $2$---the inverse of $f(x) = 2^x$; and `log10(x)` will compute the log base $10$---the inverse of $f(x)=10^x$. (Also `log1p` computes an accurate value of $\log(1 + p)$ when $p \approx 0$.)
To see this in an example, we plot for base $2$ the exponential function $f(x)=2^x$, its inverse, and the logarithm function with base $2$:
@@ -398,7 +398,7 @@ $$
##### Example
Before the ubiquity of electronic calculating devices, the need to compute was still present. Ancient civilizations had abacuses to make addition easier. For multiplication and powers a [slide rule](https://en.wikipedia.org/wiki/Slide_rule) could be used. It is easy to represent addition physically with two straight pieces of wood - just represent a number with a distance and align the two pieces so that the distances are sequentially arranged. To multiply then was as easy: represent the logarithm of a number with a distance then add the logarithms. The sum of the logarithms is the logarithm of the *product* of the original two values. Converting back to a number answers the question. The conversion back and forth is done by simply labeling the wood using a logartithmic scale. The slide rule was [invented](http://tinyurl.com/qytxo3e) soon after Napier's initial publication on the logarithm in 1614.
Before the ubiquity of electronic calculating devices, the need to compute was still present. Ancient civilizations had abacuses to make addition easier. For multiplication and powers a [slide rule](https://en.wikipedia.org/wiki/Slide_rule) could be used. It is easy to represent addition physically with two straight pieces of wood---just represent a number with a distance and align the two pieces so that the distances are sequentially arranged. To multiply was then just as easy: represent the logarithm of a number with a distance, then add the logarithms. The sum of the logarithms is the logarithm of the *product* of the original two values. Converting back to a number answers the question. The conversion back and forth is done by simply labeling the wood using a logarithmic scale. The slide rule was [invented](http://tinyurl.com/qytxo3e) soon after Napier's initial publication on the logarithm in 1614.
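The slide-rule principle can be checked directly: adding logarithms corresponds to multiplying the underlying numbers (the particular values below are arbitrary):

```{julia}
# log turns products into sums; exp converts the sum back to the product
a, b = 3.7, 42.0
log(a) + log(b) ≈ log(a * b)     # true
exp(log(a) + log(b)) ≈ a * b     # true: recovers the product
```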
##### Example

View File

@@ -240,13 +240,13 @@ f(x) = 5/9 * (x - 32)
f(72) ## room temperature
```
will create a function object with a value of `x` determined at a later time - the time the function is called. So the value of `x` defined when the function is created is not important here (as the value of `x` used by `f` is passed in as an argument).
will create a function object with a value of `x` determined at a later time---the time the function is called. So the value of `x` defined when the function is created is not important here (as the value of `x` used by `f` is passed in as an argument).
Within `Julia`, we make note of the distinction between a function object and a function call. In the definition `f(x)=cos(x)`, the variable `f` refers to a function object, whereas the expression `f(pi)` is a function call, resulting in a value. This mirrors the math notation where an $f$ is used when properties of a function are being emphasized (such as $f \circ g$ for composition) and $f(x)$ is used when the values related to the function are being emphasized (such as saying "the plot of the equation $y=f(x)$").
Distinguishing these related but different concepts --- expressions, equations, values from function calls, and function objects --- is important when modeling mathematics on the computer.
Distinguishing these related but different concepts---expressions, equations, values from function calls, and function objects---is important when modeling mathematics on the computer.
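A small sketch of the object/call distinction (the names here are illustrative):

```{julia}
f(x) = cos(x)   # f is a function object
y = f(pi)       # f(pi) is a function call, producing a value
f isa Function  # true; y holds the value ≈ -1.0
```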
::: {#fig-kidney}
@@ -315,7 +315,7 @@ s(x) =
\end{cases}
$$
We learn to read this as: when $x$ is less than $0$, then the answer is $-1$. If $x$ is greater than $0$ the answer is $1.$ Often--but not in this example--there is an "otherwise" case to catch those values of $x$ that are not explicitly mentioned. As there is no such "otherwise" case here, we can see that this function has no definition when $x=0$. This function is often called the "sign" function and is also defined by $\lvert x\rvert/x$. (`Julia`'s `sign` function defines `sign(0)` to be `0`.)
We learn to read this as: when $x$ is less than $0$, then the answer is $-1$. If $x$ is greater than $0$ the answer is $1.$ Often---but not in this example---there is an "otherwise" case to catch those values of $x$ that are not explicitly mentioned. As there is no such "otherwise" case here, we can see that this function has no definition when $x=0$. This function is often called the "sign" function and is also defined by $\lvert x\rvert/x$. (`Julia`'s `sign` function defines `sign(0)` to be `0`.)
How do we create conditional statements in `Julia`? Programming languages generally have "if-then-else" constructs to handle conditional evaluation. In `Julia`, the following code will handle the above condition:
@@ -357,7 +357,7 @@ For example, here is one way to define an absolute value function:
abs_val(x) = x >= 0 ? x : -x
```
The condition is `x >= 0`--or is `x` non-negative? If so, the value `x` is used, otherwise `-x` is used.
The condition is `x >= 0`---or is `x` non-negative? If so, the value `x` is used, otherwise `-x` is used.
Here is a means to implement a function which takes the larger of `x` or `10`:
@@ -672,7 +672,7 @@ During this call, values for `m` and `b` are found from how the function is call
mxplusb(0; m=3, b=2)
```
Keywords are used to mark the parameters whose values are to be changed from the default. Though one can use *positional arguments* for parameters--and there are good reasons to do so--using keyword arguments is a good practice if performance isn't paramount, as their usage is more explicit yet the defaults mean that a minimum amount of typing needs to be done.
Keywords are used to mark the parameters whose values are to be changed from the default. Though one can use *positional arguments* for parameters---and there are good reasons to do so---using keyword arguments is a good practice if performance isn't paramount, as their usage is more explicit yet the defaults mean that a minimum amount of typing needs to be done.
Keyword arguments are widely used with plotting commands, as there are numerous options to adjust, but typically only a handful are adjusted per call. The `Plots` package, whose commands we illustrate throughout these notes starting with the next section, has this in its docs: `Plots.jl` follows two simple rules with data and attributes:
@@ -726,7 +726,7 @@ The style isn't so different from using keyword arguments, save the extra step o
v0, theta
```
The *big* advantage of bundling parameters into a container is consistency the function is always called in an identical manner regardless of the number of parameters (or variables).
The *big* advantage of bundling parameters into a container is consistency---the function is always called in an identical manner regardless of the number of parameters (or variables).
::: {.callout-note}
@@ -749,13 +749,13 @@ Volume(r, h) = pi * r^2 * h # of a cylinder
SurfaceArea(r, h) = pi * r * (r + sqrt(h^2 + r^2)) # of a right circular cone, including the base
```
The right-hand sides may or may not be familiar, but it should be reasonable to believe that if push came to shove, the formulas could be looked up. However, the left-hand sides are subtly different - they have two arguments, not one. In `Julia` it is trivial to define functions with multiple arguments - we just did.
The right-hand sides may or may not be familiar, but it should be reasonable to believe that if push came to shove, the formulas could be looked up. However, the left-hand sides are subtly different---they have two arguments, not one. In `Julia` it is trivial to define functions with multiple arguments---we just did.
Earlier we saw the `log` function can use a second argument to express the base. This function is basically defined by `log(b,x)=log(x)/log(b)`. The `log(x)` value is the natural log, and this definition just uses the change-of-base formula for logarithms.
But not so fast, on the left side is a function with two arguments and on the right side the functions have one argument--yet they share the same name. How does `Julia` know which to use? `Julia` uses the number, order, and *type* of the positional arguments passed to a function to determine which function definition to use. This is technically known as [multiple dispatch](http://en.wikipedia.org/wiki/Multiple_dispatch) or **polymorphism**. As a feature of the language, it can be used to greatly simplify the number of functions the user must learn. The basic idea is that many functions are "generic" in that they have methods which will work differently in different scenarios.
But not so fast: on the left side is a function with two arguments and on the right side the functions have one argument---yet they share the same name. How does `Julia` know which to use? `Julia` uses the number, order, and *type* of the positional arguments passed to a function to determine which function definition to use. This is technically known as [multiple dispatch](http://en.wikipedia.org/wiki/Multiple_dispatch) or **polymorphism**. As a feature of the language, it can be used to greatly simplify the number of functions the user must learn. The basic idea is that many functions are "generic" in that they have methods which will work differently in different scenarios.
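A minimal sketch of dispatch on the number of arguments (the name `loga` is made up so as not to clash with the built-in `log`):

```{julia}
# Two methods share one generic name; Julia picks by the argument count
loga(x) = log(x)              # one argument: natural log
loga(b, x) = log(x) / log(b)  # two arguments: change of base
loga(2, 8)                    # two-argument method: ≈ 3
```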
:::{.callout-warning}
@@ -785,7 +785,7 @@ twotox(x::Real) = (2.0)^x
twotox(x::Complex) = (2.0 + 0.0im)^x
```
This is for illustration purposes -- the latter two are actually already done through `Julia`'s *promotion* mechanism -- but we see that `twotox` will return a rational number when `x` is an integer unlike `Julia` which, when `x` is non-negative will return an integer and will otherwise will error or return a float (when `x` is a numeric literal, like `2^(-3)`).
This is for illustration purposes---the latter two are actually already done through `Julia`'s *promotion* mechanism---but we see that `twotox` will return a rational number when `x` is an integer, unlike `Julia`, which, when `x` is non-negative, will return an integer and otherwise will error or return a float (when `x` is a numeric literal, like `2^(-3)`).
The key to reading the above is the type annotation acts like a gatekeeper allowing in only variables of that type or a subtype of that type.
@@ -811,7 +811,7 @@ Representing the area of a rectangle in terms of two variables is easy, as the f
Area(w, h) = w * h
```
But the other fact about this problem--that the perimeter is $20$--means that height depends on width. For this question, we can see that $P=2w + 2h$ so that--as a function--`height` depends on `w` as follows:
But the other fact about this problem---that the perimeter is $20$---means that height depends on width. For this question, we can see that $P=2w + 2h$ so that---as a function---`height` depends on `w` as follows:
```{julia}
@@ -858,7 +858,7 @@ $$
g(x) = f(x-c)
$$
has an interpretation - the graph of $g$ will be the same as the graph of $f$ shifted to the right by $c$ units. That is $g$ is a transformation of $f$. From one perspective, the act of replacing $x$ with $x-c$ transforms a function into a new function. Mathematically, when we focus on transforming functions, the word [operator](http://en.wikipedia.org/wiki/Operator_%28mathematics%29) is sometimes used. This concept of transforming a function can be viewed as a certain type of function, in an abstract enough way. The relation would be to just pair off the functions $(f,g)$ where $g(x) = f(x-c)$.
has an interpretation---the graph of $g$ will be the same as the graph of $f$ shifted to the right by $c$ units. That is, $g$ is a transformation of $f$. From one perspective, the act of replacing $x$ with $x-c$ transforms a function into a new function. Mathematically, when we focus on transforming functions, the word [operator](http://en.wikipedia.org/wiki/Operator_%28mathematics%29) is sometimes used. This concept of transforming a function can be viewed as a certain type of function, in an abstract enough way. The relation would be to just pair off the functions $(f,g)$ where $g(x) = f(x-c)$.
With `Julia` we can represent such operations. The simplest thing would be to do something like:
@@ -881,7 +881,7 @@ function shift_right(f; c=0)
end
```
That takes some parsing. In the body of the `shift_right` is the definition of a function. But this function has no name-it is *anonymous*. But what it does should be clear--it subtracts $c$ from $x$ and evaluates $f$ at this new value. Since the last expression creates a function, this function is returned by `shift_right`.
That takes some parsing. In the body of `shift_right` is the definition of a function. But this function has no name---it is *anonymous*. But what it does should be clear---it subtracts $c$ from $x$ and evaluates $f$ at this new value. Since the last expression creates a function, this function is returned by `shift_right`.
So we could have done something more complicated like:

View File

@@ -1,4 +1,4 @@
# The Inverse of a Function
# The inverse of a function
{{< include ../_common_code.qmd >}}
@@ -25,7 +25,7 @@ We may conceptualize such a relation in many ways:
* through a description of what $f$ does;
* or through a table of paired values, say.
For the moment, let's consider a function as a rule that takes in a value of $x$ and outputs a value $y$. If a rule is given defining the function, the computation of $y$ is straightforward. A different question is not so easy: for a given value $y$ what value--or *values*--of $x$ (if any) produce an output of $y$? That is, what $x$ value(s) satisfy $f(x)=y$?
For the moment, let's consider a function as a rule that takes in a value of $x$ and outputs a value $y$. If a rule is given defining the function, the computation of $y$ is straightforward. A different question is not so easy: for a given value $y$ what value---or *values*---of $x$ (if any) produce an output of $y$? That is, what $x$ value(s) satisfy $f(x)=y$?
*If* for each $y$ in some set of values there is just one $x$ value, then this operation associates to each value $y$ a single value $x$, so it too is a function. When that is the case we call this an *inverse* function.
@@ -202,7 +202,7 @@ In the section on the [intermediate value theorem](../limits/intermediate_value_
## Functions which are not always invertible
Consider the function $f(x) = x^2$. The graph--a parabola--is clearly not *monotonic*. Hence no inverse function exists. Yet, we can solve equations $y=x^2$ quite easily: $y=\sqrt{x}$ *or* $y=-\sqrt{x}$. We know the square root undoes the squaring, but we need to be a little more careful to say the square root is the inverse of the squaring function.
Consider the function $f(x) = x^2$. The graph---a parabola---is clearly not *monotonic*. Hence no inverse function exists. Yet, we can solve equations $y=x^2$ quite easily: $y=\sqrt{x}$ *or* $y=-\sqrt{x}$. We know the square root undoes the squaring, but we need to be a little more careful to say the square root is the inverse of the squaring function.
The issue is there are generally *two* possible answers. To avoid this, we might choose to only take the *non-negative* answer. To make this all work as above, we restrict the domain of $f(x)$ and now consider the related function $f(x)=x^2, x \geq 0$. This is now a monotonic function, so will have an inverse function. This is clearly $f^{-1}(x) = \sqrt{x}$. (The $\sqrt{x}$ being defined as the principle square root or the unique *non-negative* answer to $u^2-x=0$.)
@@ -287,7 +287,7 @@ plot(xs, ys; color=:blue, label="f",
plot!(ys, xs; color=:red, label="f⁻¹") # the inverse
```
By flipping around the $x$ and $y$ values in the `plot!` command, we produce the graph of the inverse function--when viewed as a function of $x$. We can see that the domain of the inverse function (in red) is clearly different from that of the function (in blue).
By flipping around the $x$ and $y$ values in the `plot!` command, we produce the graph of the inverse function---when viewed as a function of $x$. We can see that the domain of the inverse function (in red) is clearly different from that of the function (in blue).
The inverse function graph can be viewed as a symmetry of the graph of the function. Flipping the graph for $f(x)$ around the line $y=x$ will produce the graph of the inverse function: Here we see for the graph of $f(x) = x^{1/3}$ and its inverse function:

View File

@@ -11,6 +11,8 @@ using CalculusWithJulia
nothing
```
The [`Julia`](http://www.julialang.org) programming language is well suited as a computer accompaniment while learning the concepts of calculus. The following overview covers the language-specific aspects of the pre-calculus part of the [Calculus with Julia](calculuswithjulia.github.io) notes.
@@ -34,35 +36,29 @@ The [https://mybinder.org/](https://mybinder.org/) service in particular allows
[Google colab](https://colab.research.google.com/) offers a free service with more computing power than `binder`, though setup is a bit more fussy. To use `colab` along with these notes, you need to execute a command that downloads `Julia` and installs the `CalculusWithJulia` package and a plotting package. (Modify the `pkg"add ..."` command to add other desired packages; update the julia version as necessary):
```
# Installation cell
%%capture
%%shell
if ! command -v julia > /dev/null 2>&1
then
wget -q 'https://julialang-s3.julialang.org/bin/linux/x64/1.10/julia-1.10.2-linux-x86_64.tar.gz' \
-O /tmp/julia.tar.gz
tar -x -f /tmp/julia.tar.gz -C /usr/local --strip-components 1
rm /tmp/julia.tar.gz
fi
julia -e 'using Pkg; pkg"add IJulia CalculusWithJulia; precompile;"'
julia -e 'using Pkg; Pkg.add(url="https://github.com/mth229/BinderPlots.jl")'
echo 'Now change the runtime type'
```
(`BinderPlots` is a light-weight, barebones plotting package that uses `PlotlyLight` to render graphics with commands mostly following those of the `Plots` package. Though suitable for most examples herein, the `Plots` package could instead be installed.)
> Go to google colab:
[https://colab.research.google.com/](https://colab.research.google.com/)
After this executes (which can take quite some time, as in a few minutes) under the `Runtime` menu select `Change runtime type` and then select `Julia`.
> Click on "Runtime" menu and then "Change Runtime Type"
After that, in a cell execute these commands to load the two installed packages:
> Select Julia as the "Runtime Type" then save
> Copy and paste then run this set of commands
```
using Pkg
Pkg.add("Plots")
Pkg.add("CalculusWithJulia")
using CalculusWithJulia
using BinderPlots
using Plots
```
As mentioned, other packages can be chosen for installation.
This may take 2-3 minutes to load. The `plotly()` backend doesn't work out of the box. Use `gr()` to recover if that command is issued.
@@ -85,7 +81,7 @@ $ julia
(_) | (_) (_) |
_ _ _| |_ __ _ | Type "?" for help, "]?" for Pkg help.
| | | | | | |/ _` | |
| | |_| | | | (_| | | Version 1.11.1 (2024-10-16)
| | |_| | | | (_| | | Version 1.11.6 (2025-07-09)
_/ |\__'_|_|_|\__'_| | Official https://julialang.org/ release
|__/ |
@@ -452,7 +448,7 @@ a = 4
f(3) # now 2 * 4 + 3
```
User-defined functions can have $0$, $1$ or more positional arguments:
User-defined functions can have $0$, $1$, or more positional arguments:
```{julia}
@@ -573,7 +569,7 @@ With `Plots` loaded, we can plot a function by passing the function object by na
```{julia}
plot(sin, 0, 2pi) # plot a function - by name - over an interval [a,b]
plot(sin, 0, 2pi)
```
::: {.callout-note}
@@ -629,7 +625,7 @@ ys = f.(xs)
plot(f, a, b) # recipe for a function
plot(xs, f) # alternate recipe
plot(xs, ys) # plot coordinates as two vectors
plot([(x,f(x)) for x in xs]) # plot a vector o points
plot([(x,f(x)) for x in xs]) # plot a vector of points
```
The choice should depend on convenience.
@@ -637,7 +633,7 @@ The choice should depend on convenience.
## Equations
Notation for `Julia` and math is *similar* for functions - but not for equations. In math, an equation might look like:
Notation for `Julia` and math is *similar* for functions---but not for equations. In math, an equation might look like:
$$
@@ -664,13 +660,15 @@ using SymPy
(A macro rewrites values into other commands before they are interpreted. Macros are prefixed with the `@` sign. In this use, the "macro" `@syms` translates `x a b c` into a command involving `SymPy`'s `symbols` function.)
Symbolic expressions - unlike numeric expressions - are not immediately evaluated, though they may be simplified:
Symbolic expressions---unlike numeric expressions---are not immediately evaluated, though they may be simplified:
```{julia}
p = a*x^2 + b*x + c
```
The above command illustrates that the mathematical operations of `*`, `^`, and `+` work with symbolic objects. This is the case for most mathematical functions as well.
To substitute a value, we can use `Julia`'s `pair` notation (`variable=>value`):
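For example, a small sketch assuming the symbols defined above:

```{julia}
using SymPy
@syms x a b c
p = a*x^2 + b*x + c
p(x => 2)   # substitute 2 for x, yielding 4a + 2b + c
```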

View File

@@ -2,13 +2,7 @@ module Make
# makefile for generating typst pdfs
# per directory usage
dir = "precalc"
files = ("calculator",
"variables",
"numbers_types",
"logical_expressions",
"vectors",
"ranges",
"functions",
files = ("functions",
"plotting",
"transformations",
"inversefunctions",

View File

@@ -1,4 +1,4 @@
# The Graph of a Function
# The graph of a function
{{< include ../_common_code.qmd >}}
@@ -254,10 +254,10 @@ nothing
----
---
Making a graph with `Plots` is easy, but producing a graph that is informative can be a challenge, as the choice of a viewing window can make a big difference in what is seen. For example, trying to make a graph of $f(x) = \tan(x)$, as below, will result in a bit of a mess - the chosen viewing window crosses several places where the function blows up:
Making a graph with `Plots` is easy, but producing a graph that is informative can be a challenge, as the choice of a viewing window can make a big difference in what is seen. For example, trying to make a graph of $f(x) = \tan(x)$, as below, will result in a bit of a mess---the chosen viewing window crosses several places where the function blows up:
```{julia}
@@ -536,16 +536,16 @@ The `Plots` package uses positional arguments for input data and keyword argumen
The `Plots` package provides many such arguments for adjusting a graphic, here we mention just a few:
* `plot(...; title="main title", xlab="x axis label", ylab="y axis label")`: add title and label information to a graphic
* `plot(...; color="green")`: this argument can be used to adjust the color of the drawn figure (color can be a string,`"green"`, or a symbol, `:green`, among other specifications)
* `plot(...; linewidth=5)`: this argument can be used to adjust the width of drawn lines
* `plot(...; linestyle=:dash)`: will change the line style of the plotted lines to dashed lines. Also `:dot`, ...
* `plot(...; title="main title", xlabel="x axis label", ylabel="y axis label")`: add title and label information to a graphic
* `plot(...; label="a label")`: the `label` attribute will show up when a legend is present. Using an empty string, `""`, will suppress adding the layer to the legend.
* `plot(...; legend=false)`: by default, different layers will be indicated with a legend, this will turn off this feature
* `plot(...; xlims=(a,b), ylims=(c,d))`: either or both `xlims` and `ylims` can be used to control the viewing window
* `plot(...; xticks=[xs...], yticks=[ys...])`: either or both `xticks` and `yticks` can be used to specify where the tick marks are to be drawn
* `plot(...; aspect_ratio=:equal)`: will keep $x$ and $y$ axis on same scale so that squares look square.
* `plot(...; framestyle=:origin)`: The default `framestyle` places $x$-$y$ guides on the edges; this specification places them on the $x-y$ plane.
* `plot(...; color="green")`: this argument can be used to adjust the color of the drawn figure (color can be a string,`"green"`, or a symbol, `:green`, among other specifications)
* `plot(...; linewidth=5)`: this argument can be used to adjust the width of drawn lines
* `plot(...; linestyle=:dash)`: will change the line style of the plotted lines to dashed lines. Also `:dot`, ...
For plotting points with `scatter`, or `scatter!` the markers can be adjusted via
@@ -583,7 +583,7 @@ With these assumptions, we have an initial decision to make:
We re-express our equation $y=f(x)= mx+b$ in general form $f(x,y) = 0 = Ax + By + C$. Using the other point on the line $A=-(y_1-y_0)$, $B=(x_1-x_0)$, and $C = -x_1y_0 + x_0 y_1$. In particular, by assumption both $A$ and $B$ are positive.
With this, we have $f(x_0,y_0) = 0$. But moreover, any point with $y>y_0$ will have $f(x_0,y)>0$ and if $y < y_0$ the opposite. That is this equation divides the plane into two pieces depending on whether $f$ is positive, the line is the dividing boundary.
With this, we have $f(x_0,y_0) = 0$. But moreover, any point $(x_0,y)$ with $y>y_0$ will have $f(x_0,y)>0$ and if $y < y_0$ the opposite. That is this equation divides the plane into two pieces depending on whether $f$ is positive---the line is the dividing boundary.
For the algorithm, we start at $(x_0, y_0)$ and ask if the pixel $(x_0 + 1, y_0)$ or $(x_0 + 1, y_0 - 1)$ will be lit, then we continue to the right.
@@ -637,7 +637,7 @@ Two basic objects to graph are points and lines. Add to these polygons.
A point in two-dimensional space has two coordinates, often denoted by $(x,y)$. In `Julia`, the same notation produces a `tuple`. Using square brackets, as in `[x,y]`, produces a vector. Vectors are more commonly used in these notes; as we have seen, there are algebraic operations defined for them. However, tuples have other advantages and are how `Plots` designates a point.
The plot command `plot(xs, ys)` plots the points $(x_1,y_1), \dots, (x_n, y_n)$ and then connects adjacent points with with lines. The command `scatter(xs, ys)` just plots the points.
The plot command `plot(xs, ys)` plots the points $(x_1,y_1), \dots, (x_n, y_n)$ and then connects adjacent points with lines. The command `scatter(xs, ys)` just plots the points.
However, the points might be more naturally specified as coordinate pairs. If tuples are used to pair them off, then `Plots` will plot a vector of tuples as a sequence of points through `plot([(x1,y1), (x2, y2), ..., (xn, yn)])`:
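A small sketch of both styles (the points here are arbitrary):

```{julia}
using Plots
pts = [(0, 0), (1, 1), (2, 0), (3, 1)]
plot(pts)       # connects the points with lines
scatter!(pts)   # layers on the points themselves
```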
@@ -711,7 +711,7 @@ scatter!(f.(θs), g.(θs))
---
As with the plot of a univariate function, there is a convenience interface for these plots - just pass the two functions in:
As with the plot of a univariate function, there is a convenience interface for these plots---just pass the two functions in:
```{julia}
@@ -777,8 +777,9 @@ Playing with the toy makes a few things become clear:
These all apply to parametric plots, as the Etch A Sketch trace is no more than a plot of $(f(t), g(t))$ over some range of values for $t$, where $f$ describes the movement in time of the left knob and $g$ the movement in time of the right.
---
Now, we revisit the last problem in the context of this. We saw in the last problem that the parametric graph was nearly a line - so close the eye can't really tell otherwise. That means that the growth in both $f(t) = t^3$ and $g(t)=t - \sin(t)$ for $t$ around $0$ are in a nearly fixed ratio, as otherwise the graph would have more curve in it.
Now, we revisit the last problem in the context of this. We saw in the last problem that the parametric graph was nearly a line---so close the eye can't really tell otherwise. That means that the growth in both $f(t) = t^3$ and $g(t)=t - \sin(t)$ for $t$ around $0$ are in a nearly fixed ratio, as otherwise the graph would have more curve in it.
##### Example: Spirograph
@@ -1136,14 +1137,3 @@ choices = [
answ = 5
radioq(choices, answ, keep_order=true)
```
---
## Technical note
The slow "time to first plot" in `Julia` is a well-known hiccup that is related to how `Julia` can be so fast. Loading `Plots` and making the first plot are both somewhat time consuming, though the second and subsequent plots are speedy. Why?
`Julia` is an interactive language that attains its speed by compiling functions on the fly using the [llvm](llvm.org) compiler. When `Julia` encounters a new combination of a function method and argument types it will compile and cache a function for subsequent speedy execution. The first plot is slow, as there are many internal functions that get compiled. This has sped up of late, as excessive recompilations have been trimmed down, but still has a way to go. This is different from "precompilation" which also helps trim down time for initial executions. There are also some more technically challenging means to create `Julia` images for faster start up that can be pursued if needed.

View File

@@ -614,7 +614,7 @@ It is easy to create a symbolic expression from a function - just evaluate the f
f(x)
```
This is easy--but can also be confusing. The function object is `f`, the expression is `f(x)`--the function evaluated on a symbolic object. Moreover, as seen, the symbolic expression can be evaluated using the same syntax as a function call:
This is easy---but can also be confusing. The function object is `f`, the expression is `f(x)`---the function evaluated on a symbolic object. Moreover, as seen, the symbolic expression can be evaluated using the same syntax as a function call:
```{julia}

View File

@@ -448,12 +448,12 @@ To get the numeric approximation, we can broadcast:
N.(solveset(p ~ 0, x))
```
(There is no need to call `collect` -- though you can -- as broadcasting over a set falls back to broadcasting over the iteration of the set and in this case returns a vector.)
(There is no need to call `collect`---though you can---as broadcasting over a set falls back to broadcasting over the iteration of the set and in this case returns a vector.)
## Do numeric methods matter when you can just graph?
It may seem that certain practices related to roots of polynomials are unnecessary as we could just graph the equation and look for the roots. This feeling is perhaps motivated by the examples given in textbooks to be worked by hand, which necessarily focus on smallish solutions. But, in general, without some sense of where the roots are, an informative graph itself can be hard to produce. That is, technology doesn't displace thinking--it only supplements it.
It may seem that certain practices related to roots of polynomials are unnecessary as we could just graph the equation and look for the roots. This feeling is perhaps motivated by the examples given in textbooks to be worked by hand, which necessarily focus on smallish solutions. But, in general, without some sense of where the roots are, an informative graph itself can be hard to produce. That is, technology doesn't displace thinking---it only supplements it.
For another example, consider the polynomial $(x-20)^5 - (x-20) + 1$. In this form we might think the roots are near $20$. However, were we presented with this polynomial in expanded form: $x^5 - 100x^4 + 4000x^3 - 80000x^2 + 799999x - 3199979$, we might be tempted to just graph it to find roots. A naive graph might be to plot over $[-10, 10]$:

View File

@@ -76,7 +76,7 @@ Polynomials may be evaluated using function notation, that is:
p(1)
```
This blurs the distinction between a polynomial expression--a formal object consisting of an indeterminate, coefficients, and the operations of addition, subtraction, multiplication, and non-negative integer powers--and a polynomial function.
This blurs the distinction between a polynomial expression---a formal object consisting of an indeterminate, coefficients, and the operations of addition, subtraction, multiplication, and non-negative integer powers---and a polynomial function.
The polynomial variable, in this case `1x`, can be returned by `variable`:

View File

@@ -22,7 +22,7 @@ nothing
---
Thinking of functions as objects themselves that can be manipulated - rather than just blackboxes for evaluation - is a major abstraction of calculus. The main operations to come: the limit *of a function*, the derivative *of a function*, and the integral *of a function* all operate on functions. Hence the idea of an [operator](http://tinyurl.com/n5gp6mf). Here we discuss manipulations of functions from pre-calculus that have proven to be useful abstractions.
Thinking of functions as objects themselves that can be manipulated---rather than just blackboxes for evaluation---is a major abstraction of calculus. The main operations to come: the limit *of a function*, the derivative *of a function*, and the integral *of a function* all operate on functions. Hence the idea of an [operator](http://tinyurl.com/n5gp6mf). Here we discuss manipulations of functions from pre-calculus that have proven to be useful abstractions.
## The algebra of functions
@@ -141,7 +141,7 @@ The real value of composition is to break down more complicated things into a se
### Shifting and scaling graphs
It is very useful to mentally categorize functions within families. The difference between $f(x) = \cos(x)$ and $g(x) = 12\cos(2(x - \pi/4))$ is not that much - both are cosine functions, one is just a simple enough transformation of the other. As such, we expect bounded, oscillatory behaviour with the details of how large and how fast the oscillations are to depend on the specifics of the function. Similarly, both these functions $f(x) = 2^x$ and $g(x)=e^x$ behave like exponential growth, the difference being only in the rate of growth. There are families of functions that are qualitatively similar, but quantitatively different, linked together by a few basic transformations.
It is very useful to mentally categorize functions within families. The difference between $f(x) = \cos(x)$ and $g(x) = 12\cos(2(x - \pi/4))$ is not that much---both are cosine functions, one is just a simple enough transformation of the other. As such, we expect bounded, oscillatory behaviour with the details of how large and how fast the oscillations are to depend on the specifics of the function. Similarly, both these functions $f(x) = 2^x$ and $g(x)=e^x$ behave like exponential growth, the difference being only in the rate of growth. There are families of functions that are qualitatively similar, but quantitatively different, linked together by a few basic transformations.
There is a set of operations of functions, which does not really change the type of function. Rather, it basically moves and stretches how the functions are graphed. We discuss these four main transformations of $f$:
@@ -322,7 +322,7 @@ datetime = 12 + 10/60 + 38/60/60
delta = (newyork(266) - datetime) * 60
```
This is off by a fair amount - almost $8$ minutes. Clearly a trigonometric model, based on the assumption of circular motion of the earth around the sun, is not accurate enough for precise work, but it does help one understand how summer days are longer than winter days and how the length of a day changes fastest at the spring and fall equinoxes.
This is off by a fair amount---almost $8$ minutes. Clearly a trigonometric model, based on the assumption of circular motion of the earth around the sun, is not accurate enough for precise work, but it does help one understand how summer days are longer than winter days and how the length of a day changes fastest at the spring and fall equinoxes.
##### Example: the pipeline operator
@@ -358,7 +358,7 @@ Suppose we have a data set like the following:^[Which comes from the "Palmer Pen
| 48.8 | 18.4 | 3733 | male | Chinstrap |
| 47.5 | 15.0 | 5076 | male | Gentoo |
We might want to plot on an $x-y$ axis flipper length versus bill length but also indicate body size with a large size marker for bigger sizes.
We might want to plot on an $x$-$y$ axis flipper length versus bill length but also indicate body size with a large size marker for bigger sizes.
We could do so by transforming a marker: scaling by size, then shifting it to an `x-y` position; then plotting. Something like this:
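A sketch of the idea with a hypothetical square marker (the helper names `scale` and `shift` and the data values are illustrative, not the notes' exact code):

```{julia}
# a unit "marker": a square centered at the origin, as coordinate pairs
marker = [(-0.5, -0.5), (0.5, -0.5), (0.5, 0.5), (-0.5, 0.5)]

scale(pts, s)    = [(s*x, s*y) for (x, y) in pts]       # scale by size
shift(pts, a, b) = [(x + a, y + b) for (x, y) in pts]   # move to position

# scale by a body-size factor, then shift to an x-y position
placed = shift(scale(marker, 2), 10, 20)
```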
@@ -473,7 +473,7 @@ S(D(f))(15), f(15) - f(0)
That is the accumulation of differences is just the difference of the end values.
These two operations are discrete versions of the two main operations of calculus - the derivative and the integral. This relationship will be known as the "fundamental theorem of calculus."
These two operations are discrete versions of the two main operations of calculus---the derivative and the integral. This relationship will be known as the "fundamental theorem of calculus."
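A sketch of such discrete operators (hypothetical definitions, chosen to be consistent with the identity `S(D(f))(15) == f(15) - f(0)` above):

```{julia}
D(f) = i -> f(i + 1) - f(i)            # difference operator
S(f) = n -> sum(f(i) for i in 0:n-1)   # accumulation operator

f(i) = i^2
S(D(f))(15) == f(15) - f(0)   # the sum telescopes
```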
## Questions

View File

@@ -269,7 +269,7 @@ plot(sin, 0, 4pi)
The graph shows two periods. The wavy aspect of the graph is why this function is used to model periodic motions, such as the amount of sunlight in a day, or the alternating current powering a computer.
From this graph - or considering when the $y$ coordinate is $0$ - we see that the sine function has zeros at any integer multiple of $\pi$, or $k\pi$, $k$ in $\dots,-2,-1, 0, 1, 2, \dots$.
From this graph---or considering when the $y$ coordinate is $0$---we see that the sine function has zeros at any integer multiple of $\pi$, or $k\pi$, $k$ in $\dots,-2,-1, 0, 1, 2, \dots$.
The cosine function is similar, in that it has the same domain and range, but is "out of phase" with the sine curve. A graph of both shows the two are related:
@@ -693,14 +693,144 @@ atan(y, x)
##### Example
A (white) light shining through a [prism](http://tinyurl.com/y8sczg4t) will be deflected depending on the material of the prism and the angles involved (refer to the link for a figure). The relationship can be analyzed by tracing a ray through the figure and utilizing Snell's law. If the prism has index of refraction $n$ then the ray will deflect by an amount $\delta$ that depends on the angle, $\alpha$ of the prism and the initial angle ($\theta_0$) according to:
A (white) light shining through a [dispersive prism](https://en.wikipedia.org/wiki/Dispersive_prism) will be deflected depending on the material of the prism and the angles involved. The relationship can be analyzed by tracing a ray through the figure and utilizing Snell's law, which relates the angle of incidence to the angle of refraction across the boundary between two different media:
$$
\delta = \theta_0 - \alpha + \arcsin(n \sin(\alpha - \arcsin(\frac{1}{n}\sin(\theta_0)))).
n_0 \sin(\theta_0) = n_1 \sin(\theta_1)
$$
If $n=1.5$ (glass), $\alpha = \pi/3$ and $\theta_0=\pi/6$, find the deflection (in radians).
:::{#fig-snells-law-prism}
```{julia}
#| echo: false
p1 = let
gr()
plot(; empty_style..., aspect_ratio=:equal)
n₀,n₁,n₂ = 1,3, 1
θ₀ = pi/7
α = pi/8
θ₁ = asin(n₀/n₁ * sin(θ₀))
θ₁′ = α - θ₁
θ₂′ = asin(n₁/n₂ * sin(θ₁′))
θ₂ = θ₂′ - α
plot!([(-1,0), (1,0)]; line=(:black, 1))
plot!([(0,-1),(0,1)]; line=(:black, 1))
plot!([(0,1), (2tan(α),-1)]; line=(:black, 1))
S = Shape([(0,-1),(2tan(α),-1),(0,1)])
plot!(S, fill=(:gray80, 0.25), line=nothing)
xx = tan(α)/ (1 + tan(α)*tan(θ₁))
sl(x) = 1 - x/tan(α)
yy = sl(xx)
plot!(sl, xx-1/8, xx+1/9; line=(:red,3))
plot!(x -> yy + tan(α)*(x-xx), xx-1/4, xx+1/4; line=(:red, 3))
plot!([(-1,-sin(θ₀)), (0,0)]; line=(:black, 2))
plot!([(0,0), (xx, xx*tan(θ₁))]; line=(:black, 2))
plot!([(xx,yy), (xx + 5/8, yy - 5/8*tan(θ₂))]; line=(:black, 2))
annotate!([
(-1/2,1/2*sin(pi + θ₀/2), text(L"\theta_0")),
(1/5, 1/5*sin(θ₁/2), text(L"\theta_1")),
(2tan(α), -0.075, text(L"\theta_2")),
(-1/2, -3/4, text(L"n_0")),
(2tan(α/2), -3/4, text(L"n_1")),
(1 - (1-2tan(α))/2, -3/4, text(L"n_2"))
])
current()
end
p2 = let
plot(; empty_style..., aspect_ratio=:equal)
n₀,n₁,n₂ = 1,3, 1
θ₀ = pi/7
α = pi/8
θ₁ = asin(n₀/n₁ * sin(θ₀))
θ₁′ = α - θ₁
θ₂′ = asin(n₁/n₂ * sin(θ₁′))
θ₂ = θ₂′ - α
xx = tan(α)/ (1 + tan(α)*tan(θ₁))
sl(x) = 1 - x/tan(α)
yy = sl(xx)
plot!(sl, xx-1/8, xx+1/9; line=(:red,3))
plot!(x -> yy + tan(α)*(x-xx), xx-1/4, xx+1/4; line=(:red, 3))
S = Shape([(0, sl(xx-1/8)), (xx-1/8,sl(xx-1/8)),
(xx+1/9, sl(xx+1/9)), (0, sl(xx+1/9))])
plot!(S, fill=(:gray80, 0.25), line=nothing)
#plot!([(-1,-sin(θ₀)), (0,0)]; line=(:black, 2))
plot!([(0,0), (xx, xx*tan(θ₁))]; line=(:black, 2))
plot!([(xx,yy), (xx + 2/8, yy - 2/8*tan(θ₂))]; line=(:black, 2))
annotate!([
(1/5, .1*sin(θ₁/2), text(L"\theta_1'")),
(xx + .1, 0.06, text(L"\theta_2'")),
(2tan(α/2), -1/8, text(L"n_1")),
(17/32, -1/8, text(L"n_2"))
])
current()
end
plot(p1, p2; layout=(1,2))
```
Light bending through a prism. The right graphic shows the second bending.
:::
```{julia}
#| echo: false
plotly()
nothing
```
Following Wikipedia, we have
$$
\theta_1 = \sin^{-1}\left( \frac{n_0}{n_1} \sin(\theta_0) \right)
$$
Both $\theta_0$ and $\theta_1$ are measured with respect to the coordinate system that looks like the $x$-$y$ plane. The red coordinate system is used to identify the angle of incidence for the second bending. Some right-triangle geometry relates the new angle $\theta'_1$ to $\theta_1$ through $\theta'_1 = \alpha - \theta_1$. With this new angle of incidence, the angle of refraction, $\theta'_2$, satisfies:
$$
n_1 \sin(\theta'_1) = n_2 \sin(\theta'_2)
$$
Or
$$
\theta'_2 = \sin^{-1}\left(\frac{n_1}{n_2}\sin(\theta'_1) \right)
$$
Finally, using right-triangle geometry, the angle $\theta_2 = \theta'_2 - \alpha$ can be identified.
For a prism, in air, we would have $n_0 = n_2 = 1$. Letting $n_1 = n$ and combining, we get
$$
\begin{align*}
\delta &= \theta_0 + \theta_2\\
&=\theta_0 + \sin^{-1}\left(\frac{n_1}{n_2}\sin(\theta'_1) \right)- \alpha\\
&= \theta_0 - \alpha + \sin^{-1}\left(\frac{n}{1}\sin(\alpha -\theta_1) \right)\\
&= \theta_0 - \alpha + \sin^{-1}\left(n\sin\left(\alpha - \sin^{-1}\left( \frac{n_0}{n_1} \sin(\theta_0) \right)\right)\right)\\
&= \theta_0 - \alpha + \sin^{-1}\left(n\sin\left(\alpha -\sin^{-1}\left( \frac{1}{n} \sin(\theta_0) \right)\right) \right)
\end{align*}
$$
If the prism has index of refraction $n$ then the ray will deviate by this amount $\delta$, which depends on the initial angle of incidence $\theta_0$, the apex angle $\alpha$ of the prism, and $n$.
When $n=1.5$ (glass), $\alpha = \pi/3$ and $\theta_0=\pi/6$, find the deflection (in radians).
We have:
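A direct computation following the derived formula (a sketch):

```{julia}
n, α, θ₀ = 1.5, pi/3, pi/6
δ = θ₀ - α + asin(n * sin(α - asin(sin(θ₀)/n)))   # ≈ 0.822 radians
```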
@@ -795,6 +925,8 @@ plot(abs ∘ T4, -1,1, label="|T₄|")
plot!(abs ∘ q, -1,1, label="|q|")
```
We will return to this family of polynomials in the section on Orthogonal Polynomials.
## Hyperbolic trigonometric functions